Copyright © 2021-2022 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and permissive document license rules apply.
This document defines how a browser viewport can be used as the source of a media stream using getViewportMedia, an extension to the Screen Capture API [screen-capture].
This section describes the status of this document at the time of its publication. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
This document is not complete.
This document was published by the Web Real-Time Communications Working Group as an Editor's Draft.
Publication as an Editor's Draft does not imply endorsement by W3C and its Members.
This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 2 November 2021 W3C Process Document.
This section is non-normative.
This document describes an extension to the Screen Capture API [screen-capture] that enables the acquisition of the browser viewport (the current tab) in the form of a video track. In some cases, tab audio is also captured in the form of an audio track. This enables use cases such as recording an ongoing WebRTC [WEBRTC] video meeting, or letting a user in a video meeting share their presentation by clicking a button in the presentation application instead of locating it in a picker.
This feature is only available to "cross-origin isolated" documents. This prevents applications from using this API to access potentially confidential information from other origins, content that should remain inaccessible due to the protections offered by the user agent sandbox.
This feature has security implications, and requires a permission prompt. Sharing the rendered viewport may expose user information such as browsing history (through link purpling), personal details like address or payment info (through user agent or web extension features like form autofill), or personal preferences (like font size).
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MAY, MUST, and MUST NOT in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
This specification defines conformance criteria that apply to a single product: the user agent that implements the interfaces that it contains.
Implementations that use ECMAScript [ECMA-262] to implement the APIs defined in this specification must implement them in a manner consistent with the ECMAScript Bindings defined in the Web IDL specification [WEBIDL], as this specification uses that specification and terminology.
The following example demonstrates a request for viewport capture using the navigator.mediaDevices.getViewportMedia method defined in this document.
try {
  const stream = await navigator.mediaDevices.getViewportMedia();
  videoElement.srcObject = stream;
} catch (e) {
  console.log('Unable to acquire viewport capture: ' + e);
}
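The catch branch above can be made more informative, because the algorithm in this document rejects with several distinct error names. The following is a minimal sketch, not part of the API: the helper name explainCaptureError and the message wording are assumptions made for illustration, while the error names are the ones used by the getViewportMedia algorithm.

```javascript
// Hypothetical helper: maps a getViewportMedia() rejection to a short message.
// Only the error names come from this document; the messages are illustrative.
function explainCaptureError(error) {
  const name = error && error.name;
  switch (name) {
    case 'SecurityError':
      return 'This page is not cross-origin isolated or has not opted in to viewport capture.';
    case 'InvalidStateError':
      return 'Capture must start from a user gesture in a focused, fully active page.';
    case 'NotAllowedError':
      return 'Permission to capture this tab was denied.';
    case 'NotFoundError':
      return 'No capture source of the requested type is available.';
    case 'NotReadableError':
      return 'The tab could not be captured because of a system-level error.';
    case 'AbortError':
      return 'Capture failed for an unspecified reason.';
    case 'OverconstrainedError':
      return 'The requested constraints could not be satisfied.';
    case 'TypeError':
      return 'The constraints passed to getViewportMedia() are invalid.';
    default:
      return 'Unable to acquire viewport capture: ' + String(error);
  }
}
```

A caller would pass the caught value straight through, for example `console.log(explainCaptureError(e))`.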
This document uses the definitions of MediaStream, MediaStreamTrack, MediaStreamConstraints and ConstrainablePattern from [GETUSERMEDIA], and the definitions of display surface and browser display surface from [screen-capture].
Capture of the viewport is enabled through the addition of a new getViewportMedia method on the MediaDevices interface. It is similar to getDisplayMedia(), except that it only captures the top-level document's viewport (current tab), using a permission prompt instead of presenting the user with a picker. For security reasons, it also only works from "cross-origin isolated" documents that opt in with a document-policy.
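As a rough illustration of the opt-in described above, a server could send response headers along these lines. This is a sketch under assumptions: the function name viewportCaptureHeaders is hypothetical, the COOP/COEP pair shown is one common way to obtain cross-origin isolation, and the document-policy header values are copied from the algorithm in this draft, which itself marks the exact check as a TODO.

```javascript
// Hypothetical helper returning response headers for a page that wants to
// call getViewportMedia(). Header values are assumptions, see the lead-in.
function viewportCaptureHeaders() {
  return {
    // One common header pair that makes a document cross-origin isolated.
    'Cross-Origin-Opener-Policy': 'same-origin',
    'Cross-Origin-Embedder-Policy': 'require-corp',
    // Opt-in named by this draft; the precise syntax may still change.
    'Document-Policy': 'viewport-capture',
    // Asks nested content to adopt the same policy, as named by this draft.
    'Require-Document-Policy': 'viewport-capture',
  };
}
```

A server framework would copy each entry onto the response for the top-level document.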
WebIDL
partial interface MediaDevices {
  Promise<MediaStream> getViewportMedia(optional ViewportMediaStreamConstraints constraints = {});
};
getViewportMedia
Prompts the user for permission to live-capture the viewport (current tab).
The user agent MUST apply any provided constraints to the produced media after permission has been granted.
In the case of audio, the user agent MAY present the end-user with an option to include audio from the current viewport in the capture, if available. Like getDisplayMedia() with regards to audio+video, the user agent is allowed to not return audio even if the audio constraint is present. If the user agent knows no audio will be shared for the lifetime of the stream it MUST NOT include an audio track in the resulting stream. The user agent MAY accept a request for audio and video by only returning a video track in the resulting stream, or it MAY accept the request by returning both an audio track and a video track in the resulting stream. The user agent MUST reject audio-only requests.
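Because a request for audio and video may legitimately resolve with a video-only stream, application code should inspect the returned tracks rather than assume audio is present. The sketch below shows one way to do that; the helper name summarizeCapture is hypothetical, while getVideoTracks() and getAudioTracks() are standard MediaStream methods.

```javascript
// Hypothetical helper: reports which kinds of track a captured stream holds.
// Works with any object exposing the MediaStream track getters.
function summarizeCapture(stream) {
  const videoTracks = stream.getVideoTracks();
  const audioTracks = stream.getAudioTracks();
  return {
    hasVideo: videoTracks.length > 0,
    // May be false even when { audio: true } was requested.
    hasAudio: audioTracks.length > 0,
  };
}
```

An application could use the hasAudio flag to hide audio-related controls when the user agent returned no audio track.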
Like getDisplayMedia(), the "granted" permission cannot be persisted.
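One way to picture the non-persistence rule is as a small state model in which a grant is consumed by the current request and never stored. This is only a sketch: the function name storedStateAfterPrompt and the decision values 'allow' and 'deny' are assumptions, and whether a denial is remembered is a user agent choice that this model simply picks for illustration.

```javascript
// Hypothetical model of the stored permission state after a prompt.
// Assumption: denials are remembered; grants never are.
function storedStateAfterPrompt(currentState, userDecision) {
  if (currentState === 'denied') {
    return 'denied'; // no prompt is shown, the state stays as it was
  }
  if (userDecision === 'deny') {
    return 'denied';
  }
  // A grant applies to this request only, so the stored state stays "prompt".
  return 'prompt';
}
```

Under this model the stored state can never become "granted", so every new capture request prompts again unless it has been denied.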
When the getViewportMedia() method is called, the user agent MUST run the following steps:
If the current settings object's cross-origin isolated capability is false, return a promise rejected with a DOMException object whose name attribute has the value SecurityError.
If the Document's top-level browsing context's required document policy does not contain Require-Document-Policy: viewport-capture and Document-Policy: viewport-capture (TODO: use correct algorithm), return a promise rejected with a DOMException object whose name attribute has the value SecurityError.
If the relevant global object of this does not have transient activation, return a promise rejected with a DOMException object whose name attribute has the value InvalidStateError.
Let constraints be the method's first argument.
If constraints.video is false, return a promise rejected with a newly created TypeError.
For each existing member in constraints whose value, CS, is a dictionary, run the following steps:
If CS contains a member named advanced, return a promise rejected with a newly created TypeError.
If CS contains a member whose name specifies a constrainable property applicable to display surfaces, and whose value in turn is a dictionary containing a member named either min or exact, return a promise rejected with a newly created TypeError.
If CS contains a member whose name specifies a constrainable property applicable to display surfaces, and whose value in turn is a dictionary containing a member named max, and that member's value in turn is less than the constrainable property's floor value, then let failedConstraint be the name of the member, let message be either undefined or an informative human-readable message, and return a promise rejected with a new OverconstrainedError created by calling OverconstrainedError(failedConstraint, message).
Let requestedMediaTypes be the set of media types in constraints with either a dictionary value or a value of true.
If the relevant global object's associated Document is NOT fully active or does NOT have focus, return a promise rejected with a DOMException object whose name attribute has the value InvalidStateError.
Let p be a new promise.
Run the following steps in parallel:
For each media type T in requestedMediaTypes,
If no sources of type T are available, reject p with a new DOMException object whose name attribute has the value NotFoundError.
Read the current permission state for obtaining sources of type T in the current browsing context. If the permission state is "denied", jump to the step labeled Permission Failure below.
Optionally, e.g., based on a previously-established user preference, for security reasons, or due to platform limitations, jump to the step labeled Permission Failure below.
Request permission to use viewport capture, for a PermissionDescriptor with its name set to "viewport-capture", resulting in a set of provided media.
The provided media MUST include precisely one video track, which MUST be a live-capture of the browser display surface of the relevant global object's associated Document's top-level browsing context's viewport.
The provided media MUST include at most one audio track, which, if provided, MUST be the combined audio produced by the sum of documents that consist of the relevant global object's associated Document's top-level browsing context's active document, and all active documents in nested browsing contexts of the relevant global object's associated Document's top-level browsing context. This audio track MUST NOT be included if audio was not specified in requestedMediaTypes, or if it was specified as false.
The source of a MediaStreamTrack MUST NOT change.
If the result of the request is "granted", then for each device that is sourcing the provided media, using a stable and private id for the device, deviceId, set [[devicesLiveMap]][deviceId] to true, if it isn’t already true, and set the [[devicesAccessibleMap]][deviceId] to true, if it isn’t already true.
The user agent MUST NOT store a "granted" permission entry.
If the result is "denied", jump to the step labeled Permission Failure below. If the user never responds, this algorithm stalls on this step.
If the user grants permission but a hardware error such as an OS/program/webpage lock prevents access, reject p with a new DOMException object whose name attribute has the value NotReadableError and abort these steps.
If the result is "granted" but device access fails for any reason other than those listed above, reject p with a new DOMException object whose name attribute has the value AbortError and abort these steps.
Let stream be the MediaStream object for which the user granted permission.
Run the ApplyConstraints algorithm on all tracks in stream with the appropriate constraints. Should this fail, let failedConstraint be the result of the algorithm that failed, and let message be either undefined or an informative human-readable message, and then reject p with a new OverconstrainedError created by calling OverconstrainedError(failedConstraint, message).
Resolve p with stream and abort these steps.
Permission Failure: Reject p with a new DOMException object whose name attribute has the value NotAllowedError.
Return p.
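The synchronous checks at the start of the steps above can be sketched as a pure function. This is an illustrative model, not the normative algorithm: the function name precheck, the shape of the env object, and the FLOORS table (both the property names and the floor values) are assumptions made so the example is self-contained. The normative list of constrainable properties and their floor values comes from the Screen Capture specification.

```javascript
// Assumed, illustrative floor values for a few display-surface properties.
const FLOORS = { width: 0, height: 0, frameRate: 0 };

// Hypothetical model of the checks that run before the prompt is shown.
// `env` is an assumed object describing the calling document's state.
function precheck(env, constraints = {}) {
  if (!env.crossOriginIsolated) return { reject: 'SecurityError' };
  if (!env.documentPolicyOptIn) return { reject: 'SecurityError' };
  if (!env.hasTransientActivation) return { reject: 'InvalidStateError' };

  // Apply the dictionary defaults: video = true, audio = false.
  const merged = { video: true, audio: false, ...constraints };
  if (merged.video === false) return { reject: 'TypeError' };

  const isDict = (v) => v !== null && typeof v === 'object';
  for (const kind of ['video', 'audio']) {
    const cs = merged[kind];
    if (!isDict(cs)) continue;
    if ('advanced' in cs) return { reject: 'TypeError' };
    for (const [prop, floor] of Object.entries(FLOORS)) {
      const value = cs[prop];
      if (!isDict(value)) continue;
      if ('min' in value || 'exact' in value) return { reject: 'TypeError' };
      if ('max' in value && value.max < floor) {
        return { reject: 'OverconstrainedError', failedConstraint: prop };
      }
    }
  }

  // Media types requested with `true` or with a dictionary value.
  const requestedMediaTypes = ['video', 'audio'].filter(
    (kind) => merged[kind] === true || isDict(merged[kind])
  );
  if (!env.fullyActive || !env.hasFocus) return { reject: 'InvalidStateError' };
  return { requestedMediaTypes };
}
```

The function returns either the name of the error the promise would be rejected with, or the set of requested media types that the parallel steps would go on to process.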
The user agent MUST NOT capture content that's behind a partially transparent captured display surface.
The user agent MUST NOT share any audio other than the audio emitted from the captured tab, and MUST NOT share audio of the entire system.
The constraints relevant to getViewportMedia are only those relevant to getDisplayMedia(), as defined in 5.4 Constrainable Properties for Captured Display Surfaces.
The ViewportMediaStreamConstraints dictionary is used to instruct the user agent what sort of MediaStreamTracks may be included in the MediaStream returned by getViewportMedia.
WebIDL
dictionary ViewportMediaStreamConstraints {
  (boolean or MediaTrackConstraints) video = true;
  (boolean or MediaTrackConstraints) audio = false;
};
Dictionary ViewportMediaStreamConstraints Members
video of type (boolean or MediaTrackConstraints), defaulting to true
If true, it requests that the returned MediaStream contain a video track. If a MediaTrackConstraints structure is provided, it further specifies desired processing options to be applied to the video track rendition of the display surface chosen by the user. If false, the request will be rejected with a TypeError, as per the getViewportMedia algorithm.
audio of type (boolean or MediaTrackConstraints), defaulting to false
If true, it signals an interest that the returned MediaStream contain an audio track, if supported and audio is available. If a MediaTrackConstraints structure is provided, it further specifies desired processing options to be applied to the audio track. If false, the MediaStream will not contain an audio track.
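To make the dictionary members concrete, the sketch below builds a constraints object that stays within what the getViewportMedia algorithm accepts: it uses only max and ideal, since min, exact and advanced are rejected with a TypeError. The function name buildViewportConstraints and its option names are hypothetical; frameRate and width are assumed here to be among the constrainable properties for captured display surfaces.

```javascript
// Hypothetical helper that assembles a ViewportMediaStreamConstraints value.
// Uses only `max` and `ideal`, which the algorithm does not reject outright.
function buildViewportConstraints({ maxFrameRate, idealWidth, wantAudio = false } = {}) {
  const video = {};
  if (maxFrameRate !== undefined) video.frameRate = { max: maxFrameRate };
  if (idealWidth !== undefined) video.width = { ideal: idealWidth };
  return {
    // Fall back to the dictionary default (true) when nothing was specified.
    video: Object.keys(video).length > 0 ? video : true,
    // Only an expression of interest: the user agent may still omit audio.
    audio: wantAudio,
  };
}
```

The result can be passed directly as the argument to getViewportMedia, for example `buildViewportConstraints({ maxFrameRate: 15, wantAudio: true })`.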
Viewport Capture is a powerful feature which is identified by the name "viewport-capture", requiring express permission to be used.
As required for integration with the Permissions specification, this specification defines the following:
Valid values for this descriptor's permission state are "prompt" and "denied". The user agent MUST NOT ever set this descriptor's permission state to "granted".
This specification defines a policy-controlled feature identified by the string "viewport-capture". Its default allowlist is "self".
A document's permissions policy determines whether any content in that document is allowed to use getViewportMedia. If disabled in any document, no content in the document will be allowed to use getViewportMedia. This is enforced by the request permission to use algorithm.
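For illustration, a Permissions-Policy header value for this feature could be assembled as follows. This is a sketch under assumptions: the function name viewportCapturePolicy is hypothetical, the output follows the general Permissions-Policy header syntax, and whether a given user agent recognizes the "viewport-capture" feature name depends on its support for this draft.

```javascript
// Hypothetical helper that formats a Permissions-Policy directive for the
// "viewport-capture" feature. `allowlist` holds 'self' and/or origin strings.
function viewportCapturePolicy(allowlist) {
  const items = allowlist.map((entry) =>
    // 'self' is a bare token; origins are written as quoted strings.
    entry === 'self' ? 'self' : JSON.stringify(entry)
  );
  return `viewport-capture=(${items.join(' ')})`;
}
```

Passing `['self']` reproduces the default allowlist, and passing an empty array yields an empty allowlist, which disables the feature for the document.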
This specification extends the Privacy Indicator Requirements of getDisplayMedia() to include getViewportMedia.
This section is informative; however, it notes some serious risks to platform security if the advice it contains is not adhered to.
TBD.