1. Introduction
This section is non-normative.
This specification defines an API to query the user agent with regard to its audio and video decoding and encoding capabilities, based on information such as the codecs, profile, resolution, and bitrates of the media. The API indicates if the configuration is supported and whether the playback is expected to be smooth and/or power efficient.
This specification focuses on encoding and decoding capabilities. It is expected to be used with other web APIs that provide information about the display properties, such as supported color gamut or dynamic range capabilities, which enable web applications to pick the right content for the display and to, for example, avoid providing HDR content to an SDR display.
2. Decoding and Encoding Capabilities
2.1. Media Configurations
2.1.1. MediaConfiguration
dictionary MediaConfiguration {
  VideoConfiguration video;
  AudioConfiguration audio;
};

dictionary MediaDecodingConfiguration : MediaConfiguration {
  required MediaDecodingType type;
  MediaCapabilitiesKeySystemConfiguration keySystemConfiguration;
};

dictionary MediaEncodingConfiguration : MediaConfiguration {
  required MediaEncodingType type;
};
The input to the decoding capabilities is represented by a MediaDecodingConfiguration dictionary and the input to the encoding capabilities by a MediaEncodingConfiguration dictionary.
For a MediaConfiguration to be a valid MediaConfiguration, all of the following conditions MUST be true:
- audio and/or video MUST exist.
- audio MUST be a valid audio configuration if it exists.
- video MUST be a valid video configuration if it exists.
For a MediaDecodingConfiguration to be a valid MediaDecodingConfiguration, all of the following conditions MUST be true:
- It MUST be a valid MediaConfiguration.
- If keySystemConfiguration exists:
For a MediaDecodingConfiguration to describe [ENCRYPTED-MEDIA], a keySystemConfiguration MUST exist.
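This example is non-normative. The presence conditions above can be sketched as a plain function. The helper name describeConfigurationProblems and its encrypted option are hypothetical and not part of this API; the sketch checks only the presence conditions stated above and does not run the per-track validity checks.

```javascript
// Non-normative sketch: hypothetical helper, not part of the API.
// Returns a list of problems; an empty list means the presence conditions hold.
function describeConfigurationProblems(configuration, { encrypted = false } = {}) {
  const problems = [];
  const hasAudio = configuration.audio !== undefined;
  const hasVideo = configuration.video !== undefined;

  // audio and/or video MUST exist.
  if (!hasAudio && !hasVideo) {
    problems.push('audio and/or video MUST exist');
  }

  // To describe encrypted media, keySystemConfiguration MUST exist.
  if (encrypted && configuration.keySystemConfiguration === undefined) {
    problems.push('keySystemConfiguration MUST exist for encrypted media');
  }
  return problems;
}
```

A caller could run such a check before calling decodingInfo() to report a readable message instead of handling a rejected promise.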
2.1.2. MediaDecodingType
enum MediaDecodingType {
  "file",
  "media-source",
  "webrtc"
};

A MediaDecodingConfiguration has three types:
- file is used to represent a configuration that is meant to be used for playback of media sources other than MediaSource as defined in [media-source] and RTCPeerConnection as defined in [webrtc].
- media-source is used to represent a configuration that is meant to be used for playback of a MediaSource.
- webrtc is used to represent a configuration that is meant to be received using RTCPeerConnection.
2.1.3. MediaEncodingType
enum MediaEncodingType {
  "record",
  "webrtc"
};

A MediaEncodingConfiguration can have one of two types:
- record is used to represent a configuration for recording of media, e.g., using MediaRecorder as defined in [mediastream-recording].
- webrtc is used to represent a configuration that is meant to be transmitted using RTCPeerConnection as defined in [webrtc].
2.1.4. VideoConfiguration
dictionary VideoConfiguration {
  required DOMString contentType;
  required unsigned long width;
  required unsigned long height;
  required unsigned long long bitrate;
  required double framerate;
  boolean hasAlphaChannel;
  HdrMetadataType hdrMetadataType;
  ColorGamut colorGamut;
  TransferFunction transferFunction;
  DOMString scalabilityMode;
  boolean spatialScalability;
};

The contentType member represents the MIME type of the video track.
To check if a VideoConfiguration configuration is a valid video configuration, the following steps MUST be run:
1. If framerate is not finite or is not greater than 0, return false and abort these steps.
2. If an optional member is specified for a MediaDecodingType or MediaEncodingType to which it’s not applicable, return false and abort these steps. See applicability rules in the member definitions below.
3. Let mimeType be the result of running parse a MIME type with configuration’s contentType.
4. If mimeType is failure, return false.
5. Return the result of running check MIME type validity with mimeType and video.
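This example is non-normative. The framerate check and the MIME parsing step of the algorithm above can be sketched as follows. Both function names are hypothetical, and parseMimeType is a simplified stand-in for the "parse a MIME type" algorithm (it does not handle a ";" inside a quoted parameter value, for instance). The applicability check and the final MIME type validity check are not covered here.

```javascript
// Non-normative sketch. parseMimeType is a simplified stand-in for
// "parse a MIME type"; it is an assumption, not the full algorithm.
function parseMimeType(contentType) {
  const [essence, ...rest] = String(contentType).split(';');
  const match = /^\s*([^\s\/]+)\/([^\s\/]+)\s*$/.exec(essence);
  if (!match) return null; // "failure"

  const parameters = new Map();
  for (const part of rest) {
    const eq = part.indexOf('=');
    if (eq === -1) continue;
    const name = part.slice(0, eq).trim().toLowerCase();
    // Strip one pair of surrounding double quotes, if present.
    const value = part.slice(eq + 1).trim().replace(/^"(.*)"$/, '$1');
    if (name && !parameters.has(name)) parameters.set(name, value);
  }
  return {
    type: match[1].toLowerCase(),
    subtype: match[2].toLowerCase(),
    parameters,
  };
}

// Hypothetical helper covering only the framerate and parsing steps.
function passesInitialVideoChecks(video) {
  // framerate must be finite and greater than 0.
  if (!Number.isFinite(video.framerate) || !(video.framerate > 0)) return false;
  // contentType must parse as a MIME type.
  return parseMimeType(video.contentType) !== null;
}
```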
The width and height members represent respectively the visible horizontal and vertical encoded pixels in the encoded video frames.
The bitrate member represents the bitrate of the video track given in units of bits per second. In the case of a video stream encoded at a constant bit rate (CBR) this shall represent the average bitrate of the video track. For the case of variable bit rate (VBR) encoding, this value shall represent the maximum bitrate of the video track.
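This example is non-normative. It illustrates the difference between the CBR and VBR interpretations of bitrate. The function name, the mode strings, and the per-second input array are assumptions made for illustration only.

```javascript
// Non-normative sketch. bitsPerSecond is a hypothetical array holding the
// number of bits used to encode each second of the video track.
function bitrateForConfiguration(bitsPerSecond, mode) {
  if (bitsPerSecond.length === 0) {
    throw new RangeError('at least one measurement is required');
  }
  if (mode === 'vbr') {
    // VBR: report the maximum bitrate of the track.
    return Math.max(...bitsPerSecond);
  }
  // CBR: report the average bitrate of the track.
  const total = bitsPerSecond.reduce((sum, bits) => sum + bits, 0);
  return Math.round(total / bitsPerSecond.length);
}
```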
The framerate member represents the framerate of the video track. The framerate is the number of frames used in one second (frames per second). It is represented as a double. In the case of a video stream with a variable framerate, this represents the maximum framerate of the stream.
The hasAlphaChannel member represents whether the video track contains alpha channel information. If true, the encoded video stream can produce per-pixel alpha channel information when decoded. If false, the video stream cannot produce per-pixel alpha channel information when decoded. If undefined, the UA should determine whether the video stream encodes alpha channel information based on the indicated contentType, if possible. Otherwise, the UA should presume that the video stream cannot produce alpha channel information.
If present, the hdrMetadataType member represents that the video track includes the specified HDR metadata type, which the UA needs to be capable of interpreting for tone mapping the HDR content to a color volume and luminance of the output device. Valid inputs are defined by HdrMetadataType. hdrMetadataType is only applicable to MediaDecodingConfiguration for types media-source and file.
If present, the colorGamut member represents that the video track is delivered in the specified color gamut, which describes a set of colors in which the content is intended to be displayed. If the attached output device also supports the specified color, the UA needs to be able to cause the output device to render the appropriate color, or something close enough. If the attached output device does not support the specified color, the UA needs to be capable of mapping the specified color to a color supported by the output device. Valid inputs are defined by ColorGamut. colorGamut is only applicable to MediaDecodingConfiguration for types media-source and file.
If present, the transferFunction member represents that the video track requires the specified transfer function to be understood by the UA. Transfer function describes the electro-optical algorithm supported by the rendering capabilities of a user agent, independent of the display, to map the source colors in the decoded media into the colors to be displayed. Valid inputs are defined by TransferFunction. transferFunction is only applicable to MediaDecodingConfiguration for types media-source and file.
If present, the scalabilityMode member represents the scalability mode as defined in [webrtc-svc]. If absent, the implementer-defined default mode for this contentType is assumed (i.e., the mode you get if you don’t specify one via setParameters()). scalabilityMode is only applicable to MediaEncodingConfiguration for type webrtc. If the scalabilityMode indicates that there are multiple spatial layers, the width and height values in VideoConfiguration correspond to the largest spatial layer that is encoded.
If present, the spatialScalability member represents the ability to do spatial prediction, that is, using frames of a resolution different than the current resolution as dependencies. If absent, spatialScalability defaults to false. If spatialScalability is set to true, the decoder can decode any scalabilityMode that the encoder can encode for the configured codec. If spatialScalability is set to false, the decoder cannot decode spatial scalability modes, but can decode all other scalabilityMode values that the encoder can encode for the configured codec. spatialScalability is only applicable to MediaDecodingConfiguration for types media-source, file, and webrtc.
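This example is non-normative. The applicability rules stated in the member definitions above can be collected into a lookup table. The table and function names, and the direction parameter, are hypothetical. hasAlphaChannel is omitted because no applicability restriction is stated for it.

```javascript
// Non-normative sketch: applicability rules transcribed from the
// member definitions above. Names are hypothetical.
const VIDEO_MEMBER_APPLICABILITY = {
  hdrMetadataType: { decoding: ['media-source', 'file'], encoding: [] },
  colorGamut: { decoding: ['media-source', 'file'], encoding: [] },
  transferFunction: { decoding: ['media-source', 'file'], encoding: [] },
  scalabilityMode: { decoding: [], encoding: ['webrtc'] },
  spatialScalability: { decoding: ['media-source', 'file', 'webrtc'], encoding: [] },
};

// direction is 'decoding' or 'encoding'; type is a MediaDecodingType or
// MediaEncodingType value. Returns the names of the optional members that
// are specified on video but not applicable to that direction and type.
function inapplicableVideoMembers(direction, type, video) {
  return Object.keys(VIDEO_MEMBER_APPLICABILITY).filter(
    (member) =>
      video[member] !== undefined &&
      !VIDEO_MEMBER_APPLICABILITY[member][direction].includes(type)
  );
}
```

A non-empty result corresponds to the case in which the valid video configuration check returns false.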
2.1.5. HdrMetadataType
enum HdrMetadataType {
  "smpteSt2086",
  "smpteSt2094-10",
  "smpteSt2094-40"
};

If present, HdrMetadataType describes the capability to interpret HDR metadata of the specified type.
The VideoConfiguration may contain one of the following types:
- smpteSt2086, representing the static metadata type defined by [SMPTE-ST-2086].
- smpteSt2094-10, representing the dynamic metadata type defined by [SMPTE-ST-2094].
- smpteSt2094-40, representing the dynamic metadata type defined by [SMPTE-ST-2094].
2.1.6. ColorGamut
enum ColorGamut {
  "srgb",
  "p3",
  "rec2020"
};

The VideoConfiguration may contain one of the following types:
2.1.7. TransferFunction
enum TransferFunction {
  "srgb",
  "pq",
  "hlg"
};

The VideoConfiguration may contain one of the following types:
- srgb, representing the transfer function defined by [sRGB].
- pq, representing the "Perceptual Quantizer" transfer function defined by [SMPTE-ST-2084].
- hlg, representing the "Hybrid Log Gamma" transfer function defined by BT.2100.
2.1.8. AudioConfiguration
dictionary AudioConfiguration {
  required DOMString contentType;
  DOMString channels;
  unsigned long long bitrate;
  unsigned long samplerate;
  boolean spatialRendering;
};

The contentType member represents the MIME type of the audio track.
To check if an AudioConfiguration configuration is a valid audio configuration, the following steps MUST be run:
1. Let mimeType be the result of running parse a MIME type with configuration’s contentType.
2. If mimeType is failure, return false.
3. Return the result of running check MIME type validity with mimeType and audio.
The channels member represents the audio channels used by the audio track. channels is only applicable to the decoding types media-source, file, and webrtc and the encoding type webrtc.
The channels member needs to be defined as a double (2.1, 4.1, 5.1, ...), an unsigned short (number of channels), or an enum value. The current definition is a placeholder.
The bitrate member represents the average bitrate of the audio track. The bitrate is the number of bits used to encode a second of the audio track.
The samplerate member represents the sample rate of the audio track. The sample rate is the number of samples of audio carried per second. samplerate is only applicable to the decoding types media-source, file, and webrtc and the encoding type webrtc.
The samplerate is expressed in Hz (i.e., number of samples of audio per second). Sometimes the samplerate value is expressed in kHz, which represents the number of thousands of samples of audio per second. 44100 Hz is equivalent to 44.1 kHz.
The spatialRendering member indicates that the audio SHOULD be rendered spatially. The details of spatial rendering SHOULD be inferred from the contentType. If it does not exist, the UA MUST presume spatial rendering is not required. When true, the user agent SHOULD only report this configuration as supported if it can support spatial rendering for the current audio output device without falling back to a non-spatial mix of the stream. spatialRendering is only applicable to MediaDecodingConfiguration for types media-source and file.
2.1.9. MediaCapabilitiesKeySystemConfiguration
dictionary MediaCapabilitiesKeySystemConfiguration {
  required DOMString keySystem;
  DOMString initDataType = "";
  MediaKeysRequirement distinctiveIdentifier = "optional";
  MediaKeysRequirement persistentState = "optional";
  sequence<DOMString> sessionTypes;
  KeySystemTrackConfiguration audio;
  KeySystemTrackConfiguration video;
};

This dictionary refers to a number of types defined by [ENCRYPTED-MEDIA] (EME). Sequences of EME types are flattened to a single value whenever the intent of the sequence was to have requestMediaKeySystemAccess() choose a subset it supports. With MediaCapabilities, callers provide the sequence across multiple calls, ultimately letting the caller choose which configuration to use.
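This example is non-normative. Because sequences are flattened to single values, a caller that would have passed several candidates in one requestMediaKeySystemAccess() call can instead build one configuration per combination and query each separately. The function name and its parameters are hypothetical, and the sketch varies only initDataType and the video robustness.

```javascript
// Non-normative sketch: hypothetical helper that expands candidate values
// into one MediaDecodingConfiguration-shaped object per combination.
function expandKeySystemCandidates(base, keySystem, initDataTypes, videoRobustnessLevels) {
  const configurations = [];
  for (const initDataType of initDataTypes) {
    for (const robustness of videoRobustnessLevels) {
      configurations.push({
        ...base,
        keySystemConfiguration: {
          keySystem,
          initDataType,
          video: { robustness },
        },
      });
    }
  }
  return configurations;
}
```

Each element of the returned array would be passed to decodingInfo() in its own call, and the caller then chooses among the configurations reported as supported.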
The keySystem member represents a keySystem name as described in [ENCRYPTED-MEDIA].
The initDataType member represents a single value from the initDataTypes sequence described in [ENCRYPTED-MEDIA].
The distinctiveIdentifier member represents a distinctiveIdentifier requirement as described in [ENCRYPTED-MEDIA].
The persistentState member represents a persistentState requirement as described in [ENCRYPTED-MEDIA].
The sessionTypes member represents a sequence of required sessionTypes as described in [ENCRYPTED-MEDIA].
The audio member represents a KeySystemTrackConfiguration associated with the AudioConfiguration.
The video member represents a KeySystemTrackConfiguration associated with the VideoConfiguration.
2.1.10. KeySystemTrackConfiguration
dictionary KeySystemTrackConfiguration {
  DOMString robustness = "";
  DOMString? encryptionScheme = null;
};

The robustness member represents a robustness level as described in [ENCRYPTED-MEDIA].
The encryptionScheme member represents an encryptionScheme as described in [ENCRYPTED-MEDIA-DRAFT].
2.2. Media Capabilities Information
dictionary MediaCapabilitiesInfo {
  required boolean supported;
  required boolean smooth;
  required boolean powerEfficient;
};

dictionary MediaCapabilitiesDecodingInfo : MediaCapabilitiesInfo {
  required MediaKeySystemAccess? keySystemAccess;
  required MediaDecodingConfiguration configuration;
};

dictionary MediaCapabilitiesEncodingInfo : MediaCapabilitiesInfo {
  required MediaEncodingConfiguration configuration;
};

A MediaCapabilitiesInfo has associated supported, smooth, and powerEfficient fields, which are booleans.
Encoding or decoding is considered power efficient when the power draw is optimal. The definition of optimal power draw for encoding or decoding is left to the user agent. However, a common implementation strategy is to consider hardware usage as indicative of optimal power draw. User agents SHOULD NOT mark hardware encoding or decoding as power efficient by default, as non-hardware-accelerated codecs can be just as efficient, particularly with low-resolution video. User agents SHOULD NOT take the device’s power source into consideration when determining encoding power efficiency unless the device’s power source has side effects such as enabling different encoding or decoding modules.
A MediaCapabilitiesDecodingInfo has an associated keySystemAccess, which is a MediaKeySystemAccess or null as appropriate.
If the encrypted decoding configuration is supported, the resulting MediaCapabilitiesInfo will include a MediaKeySystemAccess. Authors may use this to create MediaKeys and set up encrypted playback.
A MediaCapabilitiesDecodingInfo has an associated configuration, which is the decoding configuration properties used to generate the MediaCapabilitiesDecodingInfo.
A MediaCapabilitiesEncodingInfo has an associated configuration, which is the encoding configuration properties used to generate the MediaCapabilitiesEncodingInfo.
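This example is non-normative. An application that has queried several configurations might rank the resulting MediaCapabilitiesInfo objects. The scoring policy below (unsupported results are discarded, smooth outranks powerEfficient) is an assumption chosen for illustration, and both function names are hypothetical; this specification does not define a preference order.

```javascript
// Non-normative sketch. The scoring policy is an assumption, not a rule
// defined by this specification.
function scoreInfo(info) {
  if (!info.supported) return -1;
  return (info.smooth ? 2 : 0) + (info.powerEfficient ? 1 : 0);
}

// Returns the highest-scoring supported info, or null if none is supported.
// Earlier entries win ties, so callers can list preferred formats first.
function pickPreferred(infos) {
  let best = null;
  for (const info of infos) {
    if (scoreInfo(info) < 0) continue;
    if (best === null || scoreInfo(info) > scoreInfo(best)) best = info;
  }
  return best;
}
```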
2.3. Algorithms
2.3.1. Create a MediaCapabilitiesEncodingInfo
To create a MediaCapabilitiesEncodingInfo, given a MediaEncodingConfiguration configuration, run the following steps. They return a MediaCapabilitiesEncodingInfo:
1. Let info be a new MediaCapabilitiesEncodingInfo instance. Unless stated otherwise, reading and writing apply to info for the next steps.
2. Set configuration to be a new MediaEncodingConfiguration. For every property in configuration create a new property with the same name and value in configuration.
3. Let videoSupported be unknown.
4. If video is present in configuration, run the following steps:
   1. Let videoMimeType be the result of running parse a MIME type with the contentType of configuration’s video.
   2. Set videoSupported to the result of running check MIME type support with videoMimeType and configuration’s type.
5. Let audioSupported be unknown.
6. If audio is present in configuration, run the following steps:
   1. Let audioMimeType be the result of running parse a MIME type with the contentType of configuration’s audio.
   2. Set audioSupported to the result of running check MIME type support with audioMimeType and configuration’s type.
7. If either videoSupported or audioSupported is unsupported, set supported to false, smooth to false, powerEfficient to false, and return info.
8. Otherwise, set supported to true.
9. If the user agent is able to encode the media represented by configuration at the indicated framerate, set smooth to true. Otherwise set it to false.
10. If the user agent is able to encode the media represented by configuration in a power efficient manner, set powerEfficient to true. Otherwise set it to false.
11. Return info.
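This example is non-normative. The steps above can be sketched with the user-agent-specific decisions supplied by the caller. The ua parameter and its four methods are hypothetical stand-ins for the parse a MIME type and check MIME type support algorithms and for the user agent's own smoothness and power efficiency judgments; a real implementation lives inside the user agent.

```javascript
// Non-normative sketch. ua is a hypothetical object providing:
//   ua.parseMimeType(contentType)
//   ua.checkMimeTypeSupport(mimeType, type) -> 'supported' | 'unsupported'
//   ua.canEncodeAtFramerate(configuration) -> boolean
//   ua.canEncodePowerEfficiently(configuration) -> boolean
function createEncodingInfo(configuration, ua) {
  // Copy every property of the input configuration into info.configuration.
  const info = { configuration: { ...configuration } };

  let videoSupported = 'unknown';
  if (configuration.video !== undefined) {
    const videoMimeType = ua.parseMimeType(configuration.video.contentType);
    videoSupported = ua.checkMimeTypeSupport(videoMimeType, configuration.type);
  }

  let audioSupported = 'unknown';
  if (configuration.audio !== undefined) {
    const audioMimeType = ua.parseMimeType(configuration.audio.contentType);
    audioSupported = ua.checkMimeTypeSupport(audioMimeType, configuration.type);
  }

  if (videoSupported === 'unsupported' || audioSupported === 'unsupported') {
    return Object.assign(info, { supported: false, smooth: false, powerEfficient: false });
  }

  info.supported = true;
  info.smooth = Boolean(ua.canEncodeAtFramerate(configuration));
  info.powerEfficient = Boolean(ua.canEncodePowerEfficiently(configuration));
  return info;
}
```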
2.3.2. Create a MediaCapabilitiesDecodingInfo
To create a MediaCapabilitiesDecodingInfo, given a MediaDecodingConfiguration configuration, perform the following steps. They return a MediaCapabilitiesDecodingInfo:
1. Let info be a new MediaCapabilitiesDecodingInfo instance. Unless stated otherwise, reading and writing apply to info for the next steps.
2. Set configuration to be a new MediaDecodingConfiguration. For every property in configuration create a new property with the same name and value in configuration.
3. If configuration.keySystemConfiguration exists:
   1. Set keySystemAccess to the result of running the Check Encrypted Decoding Support algorithm with configuration.
   2. If keySystemAccess is null, set supported to false, smooth to false, powerEfficient to false, and return info.
   3. Otherwise, set supported to true and continue with step 6.
4. Otherwise, run the following steps:
   1. Set keySystemAccess to null.
   2. Let videoSupported be unknown.
   3. If video is present in configuration, run the following steps:
      1. Let videoMimeType be the result of running parse a MIME type with the contentType of configuration’s video.
      2. Set videoSupported to the result of running check MIME type support with videoMimeType, configuration’s type, and the colorGamut and transferFunction of configuration’s video.
   4. Let audioSupported be unknown.
   5. If audio is present in configuration, run the following steps:
      1. Let audioMimeType be the result of running parse a MIME type with the contentType of configuration’s audio.
      2. Set audioSupported to the result of running check MIME type support with audioMimeType and configuration’s type.
   6. If either videoSupported or audioSupported is unsupported, set supported to false, smooth to false, powerEfficient to false, and return info.
5. Set supported to true.
6. If the user agent is able to decode the media represented by configuration at the indicated framerate without dropping frames, set smooth to true. Otherwise set it to false.
7. If the user agent is able to decode the media represented by configuration in a power efficient manner, set powerEfficient to true. Otherwise set it to false.
8. Return info.
2.3.3. Check MIME Type Validity
To check MIME type validity given a MIME type record mimeType and a string media, run the following steps:
1. If the type of mimeType per [RFC9110] is neither media nor application, return false.
2. If the combined type and subtype members of mimeType allow a single media codec and the parameters member of mimeType is not empty, return false.
3. If the combined type and subtype members of mimeType allow multiple media codecs, run the following steps:
   1. If the parameters member of mimeType does not contain a single key named "codecs", return false.
      Why does it matter if a single media codec is listed? [Issue #235]
   2. If the value of mimeType.parameters["codecs"] does not describe a single media codec, return false.
4. Return true.
Does this logic apply to webrtc? [Issue #238]
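This example is non-normative. One possible reading of the steps above is sketched here. Which type and subtype combinations allow only a single codec is user-agent knowledge, so the sketch takes it as the hypothetical singleCodecEssences parameter. The sketch also assumes that "a single key named codecs" means codecs is the only parameter, and that a codecs value describes a single codec when it is non-empty and contains no comma; both are assumptions.

```javascript
// Non-normative sketch. mimeType is { type, subtype, parameters: Map }.
// singleCodecEssences is a hypothetical Set of "type/subtype" strings that
// allow only a single media codec.
function checkMimeTypeValidity(mimeType, media, singleCodecEssences) {
  // The type must be either media ('audio' or 'video') or 'application'.
  if (mimeType.type !== media && mimeType.type !== 'application') return false;

  const essence = `${mimeType.type}/${mimeType.subtype}`;
  if (singleCodecEssences.has(essence)) {
    // Single-codec types must not carry parameters.
    return mimeType.parameters.size === 0;
  }

  // Multi-codec types need exactly one parameter, named "codecs"
  // (an assumed reading of the step above).
  if (mimeType.parameters.size !== 1 || !mimeType.parameters.has('codecs')) {
    return false;
  }
  // Assumption: a single codec is a non-empty value without a comma.
  const codecs = mimeType.parameters.get('codecs');
  return codecs.length > 0 && !codecs.includes(',');
}
```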
2.3.4. Check MIME Type Support
To check MIME type support, given a MIME type record mimeType, a MediaEncodingType or MediaDecodingType encodingOrDecodingType, an optional colorGamut from colorGamut, and an optional transferFunction from transferFunction, perform the following steps. They return supported if the MIME type is supported by the user agent, unsupported if the MIME type is not supported:
1. If encodingOrDecodingType is webrtc (MediaEncodingType) or webrtc (MediaDecodingType) and mimeType is not one that is used with RTP (as defined in the specifications of the corresponding RTP payload formats [IANA-MEDIA-TYPES] [RFC6838]), return unsupported.
   The codec name is typically specified as subtype and zero or more parameters may be present depending on the codec.
2. If colorGamut is present and is not valid for mimeType, return unsupported.
3. If transferFunction is present and is not valid for mimeType, return unsupported.
   User agents should refer to the video codec specification for the codec named by mimeType to determine valid values for colorGamut and transferFunction.
   How do we ensure interop in validation steps here? [Issue #245]
4. If mimeType is not supported by the user agent, return unsupported.
5. Return supported.
2.3.5. Check Encrypted Decoding Support
To check encrypted decoding support, given a MediaDecodingConfiguration config where keySystemConfiguration exists, perform the following steps. They return a MediaKeySystemAccess or null as appropriate:
1. If the keySystem member of config.keySystemConfiguration is not one of the Key Systems supported by the user agent, return null. String comparison is case-sensitive.
2. Let origin be the origin of the calling context’s Document.
3. Let implementation be the implementation of config.keySystemConfiguration.keySystem.
4. Let emeConfiguration be a new MediaKeySystemConfiguration, and initialize it as follows:
   1. Set the initDataTypes attribute to a sequence containing config.keySystemConfiguration.initDataType.
   2. Set the distinctiveIdentifier attribute to config.keySystemConfiguration.distinctiveIdentifier.
   3. Set the persistentState attribute to config.keySystemConfiguration.persistentState.
   4. Set the sessionTypes attribute to config.keySystemConfiguration.sessionTypes.
   5. If audio exists in config, set the audioCapabilities attribute to a sequence containing a single MediaKeySystemMediaCapability, initialized as follows:
      1. Set the contentType attribute to config.audio.contentType.
      2. If config.keySystemConfiguration.audio exists:
         1. If config.keySystemConfiguration.audio.robustness exists and is not null, set the robustness attribute to config.keySystemConfiguration.audio.robustness.
         2. Set the encryptionScheme attribute to config.keySystemConfiguration.audio.encryptionScheme.
   6. If video exists in config, set the videoCapabilities attribute to a sequence containing a single MediaKeySystemMediaCapability, initialized as follows:
      1. Set the contentType attribute to config.video.contentType.
      2. If config.keySystemConfiguration.video exists:
         1. If config.keySystemConfiguration.video.robustness exists and is not null, set the robustness attribute to config.keySystemConfiguration.video.robustness.
         2. Set the encryptionScheme attribute to config.keySystemConfiguration.video.encryptionScheme.
5. Let supported configuration be the result of executing the Get Supported Configuration algorithm [ENCRYPTED-MEDIA] on implementation, emeConfiguration, and origin.
6. If supported configuration is NotSupported, return null.
7. Let access be a new MediaKeySystemAccess object, and initialize it as follows:
   1. Set the keySystem attribute to config.keySystemConfiguration.keySystem.
   2. Let the configuration value be supported configuration.
   3. Let the cdm implementation value be implementation.
8. Return access.
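This example is non-normative. The initialization of emeConfiguration in the steps above can be sketched as a pure function that maps a decoding configuration to an object shaped like a MediaKeySystemConfiguration. The function name is hypothetical. In a user agent the dictionary defaults (for example an empty initDataType) are applied by the bindings; this sketch copies whatever the caller supplied.

```javascript
// Non-normative sketch: hypothetical helper mirroring the initialization
// of emeConfiguration. Dictionary defaults are not applied here.
function buildEmeConfiguration(config) {
  const ks = config.keySystemConfiguration;
  const emeConfiguration = {
    initDataTypes: [ks.initDataType],
    distinctiveIdentifier: ks.distinctiveIdentifier,
    persistentState: ks.persistentState,
    sessionTypes: ks.sessionTypes,
  };

  // Builds a single MediaKeySystemMediaCapability-shaped object.
  const capability = (track, trackKeySystem) => {
    const result = { contentType: track.contentType };
    if (trackKeySystem !== undefined) {
      if (trackKeySystem.robustness !== undefined && trackKeySystem.robustness !== null) {
        result.robustness = trackKeySystem.robustness;
      }
      result.encryptionScheme = trackKeySystem.encryptionScheme;
    }
    return result;
  };

  if (config.audio !== undefined) {
    emeConfiguration.audioCapabilities = [capability(config.audio, ks.audio)];
  }
  if (config.video !== undefined) {
    emeConfiguration.videoCapabilities = [capability(config.video, ks.video)];
  }
  return emeConfiguration;
}
```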
2.4. Navigator and WorkerNavigator extension
[Exposed=Window]
partial interface Navigator {
  [SameObject] readonly attribute MediaCapabilities mediaCapabilities;
};

[Exposed=Worker]
partial interface WorkerNavigator {
  [SameObject] readonly attribute MediaCapabilities mediaCapabilities;
};
2.5. Media Capabilities Interface
[Exposed=(Window, Worker)]
interface MediaCapabilities {
  [NewObject] Promise<MediaCapabilitiesDecodingInfo> decodingInfo(MediaDecodingConfiguration configuration);
  [NewObject] Promise<MediaCapabilitiesEncodingInfo> encodingInfo(MediaEncodingConfiguration configuration);
};
2.5.1. Media Capabilities Task Source
The task source for the tasks mentioned in this specification is the media capabilities task source .
When an algorithm queues a Media Capabilities task T, the user agent MUST queue a global task T on the media capabilities task source using the global object of the current realm record.
2.5.2. decodingInfo() Method
The decodingInfo() method MUST run the following steps:
1. If configuration is not a valid MediaDecodingConfiguration, return a Promise rejected with a newly created TypeError.
2. If configuration.keySystemConfiguration exists, run the following substeps:
   1. If the global object is of type WorkerGlobalScope, return a Promise rejected with a newly created DOMException whose name is InvalidStateError.
   2. If the global object’s relevant settings object is a non-secure context, return a Promise rejected with a newly created DOMException whose name is SecurityError.
3. Let p be a new Promise.
4. Run the following steps in parallel:
   1. Run the Create a MediaCapabilitiesDecodingInfo algorithm with configuration.
   2. Queue a Media Capabilities task to resolve p with its result.
5. Return p.
Note: calling decodingInfo() with a keySystemConfiguration present may have user-visible effects, including requests for user consent. Such calls should only be made when the author intends to create and use a MediaKeys object with the provided configuration.
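This example is non-normative. A page might wrap decodingInfo() so that a missing API or a rejected promise is reported as an unsupported result. The wrapper name, the injectable mediaCapabilities parameter, and the reason property are hypothetical; reason is not a member of MediaCapabilitiesInfo.

```javascript
// Non-normative sketch. mediaCapabilities is injectable so that the
// function can be exercised outside a browser; in a page it falls back to
// navigator.mediaCapabilities when that is available.
async function queryDecodingInfo(configuration, mediaCapabilities) {
  const mc = mediaCapabilities ??
    (typeof navigator !== 'undefined' ? navigator.mediaCapabilities : undefined);
  const unsupported = (reason) =>
    ({ supported: false, smooth: false, powerEfficient: false, reason });

  if (!mc || typeof mc.decodingInfo !== 'function') {
    return unsupported('unavailable');
  }
  try {
    return await mc.decodingInfo(configuration);
  } catch (err) {
    // TypeError: the configuration is not valid.
    // InvalidStateError or SecurityError: keySystemConfiguration was used
    // in a worker or in a non-secure context.
    return unsupported(err.name);
  }
}
```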
2.5.3. encodingInfo() Method
The encodingInfo() method MUST run the following steps:
1. If configuration is not a valid MediaConfiguration, return a Promise rejected with a newly created TypeError.
2. Let p be a new Promise.
3. Run the following steps in parallel:
   1. Run the Create a MediaCapabilitiesEncodingInfo algorithm with configuration.
   2. Queue a Media Capabilities task to resolve p with its result.
4. Return p.
3. Security and Privacy Considerations
This specification does not introduce any security-sensitive information or APIs but it provides easier access to some information that can be used to fingerprint users.
3.1. Capabilities Model
This specification supports MediaDecodingType values of file, media-source, or webrtc as well as MediaEncodingType values of record and webrtc.
In realtime communications supported by [webrtc] , media is transported between peers. Although sites are responsible for exchanging the information necessary to negotiate media parameters common to both user agents, they are typically not involved in media transport, encoding, or decoding. For 1-1 calls, user agents negotiate the media to be sent and received.
In a conferencing scenario, a user agent can send media for reception by dozens or even hundreds of receivers. To improve scalability, applications make use of external servers, such as selective forwarding units or conferencing bridges. These servers negotiate media parameters with participants, ensuring consistency across senders and receivers. This is more scalable than negotiation between user agents, which would require N * (N - 1) negotiations. Typically senders encode with a single codec, and conferencing servers do not support transcoding, so a user agent cannot simply "pick the one they like best".
3.2. Decoding/Encoding and Fingerprinting
The information exposed by the decoding/encoding capabilities can already be discovered via experimentation with the exception that the API will likely provide more accurate and consistent information. This information is expected to have a high correlation with other information already available to web pages as a given class of device is expected to have very similar decoding/encoding capabilities. In other words, high end devices from a certain year are expected to decode some type of videos while older devices may not. Therefore, it is expected that the entropy added with this API isn’t going to be significant.
HDR detection is more nuanced. Adding colorGamut, transferFunction, and hdrMetadataType has the potential to add significant entropy. However, for UAs whose decoders are implemented in software and therefore whose capabilities are fixed across devices, this feature adds no effective entropy. Additionally, for many cases, devices tend to fall into large categories, within which capabilities are similar, thus minimizing effective entropy.
An alternative design approach, in which sites expose the available media formats and browsers evaluate these against capabilities, returning only the chosen format, was considered. However, this would not in fact offer a privacy benefit since sites could use the API repeatedly to obtain the complete capability set. Stringent rate limiting of the API could interfere with normal site behaviors such as speculative preparation across multiple playback items.
If an implementation wishes to implement a fingerprint-proof version of this specification, it would be recommended to fake a given set of capabilities (i.e., decode up to 1080p VP9, etc.) instead of returning always yes or always no as the latter approach could considerably degrade the user’s experience. Another mitigation could be to limit these Web APIs to top-level browsing contexts. Yet another is to use a privacy budget that throttles and/or blocks calls to the API above a threshold. Additionally, browsers may consider whether a site goes on to make use of the capabilities it detects and apply more stringent controls to sites that are observed not to do so.
4. Examples
4.1. Query playback capabilities with decodingInfo()
The following example shows how to use decodingInfo() to query media playback capabilities when using Media Source Extensions [media-source].
<script>
  const contentType = 'video/mp4;codecs=avc1.640028';

  const configuration = {
    type: 'media-source',
    video: {
      contentType: contentType,
      width: 640,
      height: 360,
      bitrate: 2000,
      framerate: 29.97
    }
  };

  navigator.mediaCapabilities.decodingInfo(configuration)
    .then((result) => {
      console.log('Decoding of ' + contentType + ' is'
        + (result.supported ? '' : ' NOT') + ' supported,'
        + (result.smooth ? '' : ' NOT') + ' smooth and'
        + (result.powerEfficient ? '' : ' NOT') + ' power efficient');
    })
    .catch((err) => {
      console.error(err, ' caused decodingInfo to reject');
    });
</script>
The following examples show how to use decodingInfo() to query WebRTC receive capabilities [webrtc].
<script>
  const contentType = 'video/VP8';

  const configuration = {
    type: 'webrtc',
    video: {
      contentType: contentType,
      width: 640,
      height: 360,
      bitrate: 2000,
      framerate: 25
    }
  };

  navigator.mediaCapabilities.decodingInfo(configuration)
    .then((result) => {
      console.log('Decoding of ' + contentType + ' is'
        + (result.supported ? '' : ' NOT') + ' supported,'
        + (result.smooth ? '' : ' NOT') + ' smooth and'
        + (result.powerEfficient ? '' : ' NOT') + ' power efficient');
    })
    .catch((err) => {
      console.error(err, ' caused decodingInfo to reject');
    });
</script>

<script>
  const contentType = 'video/H264;level-asymmetry-allowed=1;'
    + 'packetization-mode=1;profile-level-id=42e01f';

  const configuration = {
    type: 'webrtc',
    video: {
      contentType: contentType,
      width: 640,
      height: 360,
      bitrate: 2000,
      framerate: 25
    }
  };

  navigator.mediaCapabilities.decodingInfo(configuration)
    .then((result) => {
      console.log('Decoding of ' + contentType + ' is'
        + (result.supported ? '' : ' NOT') + ' supported,'
        + (result.smooth ? '' : ' NOT') + ' smooth and'
        + (result.powerEfficient ? '' : ' NOT') + ' power efficient');
    })
    .catch((err) => {
      console.error(err, ' caused decodingInfo to reject');
    });
</script>
The following example shows how to use decodingInfo() to query media playback capabilities when using Encrypted Media Extensions [ENCRYPTED-MEDIA].
<script>
  const encryptedMediaConfig = {
    type: 'media-source', // or 'file'
    audio: {
      contentType: 'audio/webm; codecs=opus',
      channels: '2', // audio channels used by the track
      bitrate: 132266, // number of bits used to encode a second of audio
      samplerate: 48000 // number of samples of audio carried per second
    },
    video: {
      contentType: 'video/webm; codecs="vp09.00.10.08"',
      width: 1920,
      height: 1080,
      bitrate: 2646242, // number of bits used to encode a second of video
      framerate: 25 // number of frames used in one second
    },
    keySystemConfiguration: {
      keySystem: 'com.widevine.alpha',
      // robustness is a member of the per-track KeySystemTrackConfiguration
      video: { robustness: 'SW_SECURE_DECODE' } // Widevine L3
    }
  };

  navigator.mediaCapabilities.decodingInfo(encryptedMediaConfig).then(result => {
    if (!result.supported) {
      console.log('Argh! This encrypted media configuration is not supported.');
      return;
    }

    if (!result.keySystemAccess) {
      console.log('Argh! Encrypted media support is not available.');
      return;
    }

    console.log('This encrypted media configuration is supported.\n'
      + 'Playback should be' + (result.smooth ? '' : ' NOT') + ' smooth and'
      + (result.powerEfficient ? '' : ' NOT') + ' power efficient.');
  });
</script>
4.2. Query recording capabilities with encodingInfo()
The following example shows how to use encodingInfo() to query WebRTC send capabilities [webrtc], including the optional field scalabilityMode.
<script>
  const contentType = 'video/VP9';

  const configuration = {
    type: 'webrtc',
    video: {
      contentType: contentType,
      width: 640,
      height: 480,
      bitrate: 10000,
      framerate: 29.97,
      scalabilityMode: "L3T3_KEY"
    }
  };

  navigator.mediaCapabilities.encodingInfo(configuration)
    .then((result) => {
      console.log(contentType + ' is:'
        + (result.supported ? '' : ' NOT') + ' supported,'
        + (result.smooth ? '' : ' NOT') + ' smooth and'
        + (result.powerEfficient ? '' : ' NOT') + ' power efficient');
    })
    .catch((err) => {
      console.error(err, ' caused encodingInfo to reject');
    });
</script>

<script>
  const contentType = 'video/webm;codecs=vp8';

  const configuration = {
    type: 'record',
    video: {
      contentType: contentType,
      width: 640,
      height: 480,
      bitrate: 10000,
      framerate: 29.97
    }
  };

  navigator.mediaCapabilities.encodingInfo(configuration)
    .then((result) => {
      console.log(contentType + ' is:'
        + (result.supported ? '' : ' NOT') + ' supported,'
        + (result.smooth ? '' : ' NOT') + ' smooth and'
        + (result.powerEfficient ? '' : ' NOT') + ' power efficient');
    })
    .catch((err) => {
      console.error(err, ' caused encodingInfo to reject');
    });
</script>