1. Introduction
This section is non-normative.
This specification relies on exposing the following sets of properties:
- An API to query the user agent with regard to the decoding and encoding abilities of the device based on information such as the codecs, profile, resolution, bitrates, etc. The API exposes information such as whether the playback should be smooth and power efficient.
  The intent of the decoding capabilities API is to provide a powerful replacement for APIs such as isTypeSupported() or canPlayType(), which are vague and mostly help callers know whether something cannot be decoded, but not how well it should perform.
- Better information about the display properties, such as supported color gamut or dynamic range abilities, in order to pick the right content for the display and avoid providing HDR content to an SDR display.
- Real time feedback about the playback so that adaptive streaming can alter the quality of the content based on actual user perceived quality. Such information will allow websites to react to a peak of CPU/GPU usage in real time. It is expected that this will be tackled as part of the [media-playback-quality] specification.
2. Decoding and Encoding Capabilities
2.1. Media Configurations
2.1.1. MediaConfiguration
dictionary MediaConfiguration {
  VideoConfiguration video;
  AudioConfiguration audio;
};

dictionary MediaDecodingConfiguration : MediaConfiguration {
  required MediaDecodingType type;
  MediaCapabilitiesKeySystemConfiguration keySystemConfiguration;
};

dictionary MediaEncodingConfiguration : MediaConfiguration {
  required MediaEncodingType type;
};

The input to the decoding capabilities is represented by a MediaDecodingConfiguration dictionary and the input of the encoding capabilities by a MediaEncodingConfiguration dictionary.
For a MediaConfiguration to be a valid MediaConfiguration, all of the following conditions MUST be true:
- audio and/or video MUST be present.
- audio MUST be a valid audio configuration if present.
- video MUST be a valid video configuration if present.
For a MediaDecodingConfiguration to be a valid MediaDecodingConfiguration, all of the following conditions MUST be true:
- It MUST be a valid MediaConfiguration.
- If keySystemConfiguration is present:
For a MediaDecodingConfiguration to describe [ENCRYPTED-MEDIA], a keySystemConfiguration MUST be present.
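The following non-normative sketch shows the shape of a MediaDecodingConfiguration and a check of the first validity condition only. The helper name hasAudioOrVideo and every value in the sample dictionary are illustrative assumptions, not part of this specification.

```javascript
// Hypothetical helper: mirrors only the structural condition
// "audio and/or video MUST be present". It does not validate the nested
// audio or video configurations.
function hasAudioOrVideo(configuration) {
  return configuration.audio !== undefined || configuration.video !== undefined;
}

// Sample MediaDecodingConfiguration; all values are illustrative.
const sampleDecodingConfiguration = {
  type: 'media-source',
  video: {
    contentType: 'video/webm; codecs="vp09.00.10.08"',
    width: 1920,
    height: 1080,
    bitrate: 2646242,
    framerate: '25',
  },
};

console.log(hasAudioOrVideo(sampleDecodingConfiguration));
```

A dictionary that carries only a type member fails this check and would therefore not be a valid MediaConfiguration.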
2.1.2. MediaDecodingType
enum MediaDecodingType {
  "file",
  "media-source",
};

A MediaDecodingConfiguration has two types:
- file is used to represent a configuration that is meant to be used for a plain file playback.
- media-source is used to represent a configuration that is meant to be used for playback of a MediaSource as defined in the [media-source] specification.
2.1.3. MediaEncodingType
enum MediaEncodingType {
  "record",
  "transmission"
};

A MediaEncodingConfiguration can have one of two types:
- record is used to represent a configuration for recording of media, e.g. using MediaRecorder as defined in [mediastream-recording].
- transmission is used to represent a configuration meant to be transmitted over electronic means (e.g. using RTCPeerConnection).
2.1.4. MIME types
In the context of this specification, a MIME type is also called content type. A valid media MIME type is a string that is a valid MIME type per [mimesniff]. If the MIME type does not imply a codec, the string MUST also have one and only one parameter that is named codecs with a value describing a single media codec. Otherwise, it MUST contain no parameters.
A valid audio MIME type is a string that is a valid media MIME type and for which the type per [RFC7231] is either audio or application.
A valid video MIME type is a string that is a valid media MIME type and for which the type per [RFC7231] is either video or application.
2.1.5. VideoConfiguration
dictionary VideoConfiguration {
  required DOMString contentType;
  required unsigned long width;
  required unsigned long height;
  required unsigned long long bitrate;
  required DOMString framerate;
};

The contentType member represents the MIME type of the video track.
To check if a VideoConfiguration configuration is a valid video configuration, the following steps MUST be run:
- If configuration’s contentType is not a valid video MIME type, return false and abort these steps.
- If none of the following is true, return false and abort these steps:
  - Applying the rules for parsing floating-point number values to configuration’s framerate results in a number that is finite and greater than 0.
  - configuration’s framerate contains one occurrence of U+002F SLASH character (/) and the substrings before and after this character, when applying the rules for parsing floating-point number values, each result in a number that is finite and greater than 0.
- Return true.
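The framerate part of the steps above can be sketched as follows. This is a non-normative approximation: Number.parseFloat stands in for the HTML rules for parsing floating-point number values, and the two can differ on edge cases. The function names are hypothetical.

```javascript
// Approximation: parseFloat is used in place of the HTML "rules for parsing
// floating-point number values". Like those rules, it parses a leading
// numeric prefix and ignores trailing characters.
function parsesToPositiveFinite(text) {
  const value = Number.parseFloat(text);
  return Number.isFinite(value) && value > 0;
}

// Hypothetical helper: true if either framerate condition holds.
function isValidFramerate(framerate) {
  // Condition 1: the whole string parses to a finite number greater than 0.
  if (parsesToPositiveFinite(framerate)) {
    return true;
  }
  // Condition 2: exactly one "/" and both sides parse to such a number.
  const parts = framerate.split('/');
  return parts.length === 2 && parts.every((part) => parsesToPositiveFinite(part));
}

console.log(isValidFramerate('30'), isValidFramerate('30000/1001'), isValidFramerate('0'));
```

Because the check only needs one of the two conditions to hold, a fraction such as 30000/1001 is accepted either by the lenient prefix parse or by the explicit split.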
The width and height members represent respectively the visible horizontal and vertical encoded pixels in the encoded video frames.
The bitrate member represents the average bitrate of the video track given in units of bits per second. In the case of a video stream encoded at a constant bit rate (CBR) this value should be accurate over a short term window. For the case of variable bit rate (VBR) encoding, this value should be usable to allocate any necessary buffering and throughput capability to provide for the uninterrupted decoding of the video stream over the long-term based on the indicated contentType.
The framerate member represents the framerate of the video track. The framerate is the number of frames used in one second (frames per second). It is represented either as a double or as a fraction.
2.1.6. AudioConfiguration
dictionary AudioConfiguration {
  required DOMString contentType;
  DOMString channels;
  unsigned long long bitrate;
  unsigned long samplerate;
};

The contentType member represents the MIME type of the audio track.
To check if an AudioConfiguration configuration is a valid audio configuration, the following steps MUST be run:
- If configuration’s contentType is not a valid audio MIME type, return false and abort these steps.
- Return true.
The channels member represents the audio channels used by the audio track.
The channels member needs to be defined as a double (2.1, 4.1, 5.1, ...), an unsigned short (number of channels) or as an enum value. The current definition is a placeholder.
The bitrate member represents the average bitrate of the audio track. The bitrate is the number of bits used to encode a second of the audio track.
The samplerate member represents the samplerate of the audio track. The samplerate is the number of samples of audio carried per second.
The samplerate is expressed in Hz (i.e. number of samples of audio per second). Samplerate values are sometimes expressed in kHz, which represents the number of thousands of samples of audio per second. 44100 Hz is equivalent to 44.1 kHz.
2.1.7. MediaCapabilitiesKeySystemConfiguration
dictionary MediaCapabilitiesKeySystemConfiguration {
  required DOMString keySystem;
  DOMString initDataType = "";
  DOMString audioRobustness = "";
  DOMString videoRobustness = "";
  MediaKeysRequirement distinctiveIdentifier = "optional";
  MediaKeysRequirement persistentState = "optional";
  sequence<DOMString> sessionTypes;
};

This dictionary refers to a number of types defined by [ENCRYPTED-MEDIA] (EME). Sequences of EME types are flattened to a single value whenever the intent of the sequence was to have requestMediaKeySystemAccess() choose a subset it supports. With MediaCapabilities, callers provide the sequence across multiple calls, ultimately letting the caller choose which configuration to use.
The keySystem member represents a keySystem name as described in [ENCRYPTED-MEDIA].
The initDataType member represents a single value from the initDataTypes sequence described in [ENCRYPTED-MEDIA].
The audioRobustness member represents an audio robustness level as described in [ENCRYPTED-MEDIA].
The videoRobustness member represents a video robustness level as described in [ENCRYPTED-MEDIA].
The distinctiveIdentifier member represents a distinctiveIdentifier requirement as described in [ENCRYPTED-MEDIA].
The persistentState member represents a persistentState requirement as described in [ENCRYPTED-MEDIA].
The sessionTypes member represents a sequence of required sessionTypes as described in [ENCRYPTED-MEDIA].
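As a non-normative illustration of providing a sequence across multiple calls, the sketch below expands a list of candidate video robustness levels into one configuration per level. The helper name expandVideoRobustness and the robustness strings LEVEL_A and LEVEL_B are placeholders; real robustness strings are specific to each key system.

```javascript
// Hypothetical helper: returns one MediaDecodingConfiguration per candidate
// robustness level, leaving the base configuration untouched.
function expandVideoRobustness(baseConfiguration, robustnessLevels) {
  return robustnessLevels.map((videoRobustness) => ({
    ...baseConfiguration,
    keySystemConfiguration: {
      ...baseConfiguration.keySystemConfiguration,
      videoRobustness,
    },
  }));
}

// Illustrative base configuration; the values are assumptions.
const baseEncryptedConfiguration = {
  type: 'media-source',
  video: {
    contentType: 'video/webm; codecs="vp09.00.10.08"',
    width: 1280,
    height: 720,
    bitrate: 1500000,
    framerate: '30',
  },
  keySystemConfiguration: {
    keySystem: 'org.w3.clearkey',
    initDataType: 'webm',
  },
};

// 'LEVEL_A' and 'LEVEL_B' are placeholders, not real robustness names.
const candidateConfigurations = expandVideoRobustness(
  baseEncryptedConfiguration,
  ['LEVEL_A', 'LEVEL_B'],
);
console.log(candidateConfigurations.length);
```

Each resulting configuration would then be passed to its own decodingInfo() call, and the caller picks among the configurations reported as supported.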
2.2. Media Capabilities Information
dictionary MediaCapabilitiesInfo {
  required boolean supported;
  required boolean smooth;
  required boolean powerEfficient;
};

dictionary MediaCapabilitiesDecodingInfo : MediaCapabilitiesInfo {
  required MediaKeySystemAccess keySystemAccess;
};

A MediaCapabilitiesInfo has an associated configuration which is a MediaDecodingConfiguration or MediaEncodingConfiguration.
A MediaCapabilitiesInfo has associated supported, smooth and powerEfficient fields which are booleans.
Authors can use powerEfficient in conjunction with the Battery Status API [battery-status] in order to determine whether the media they would like to play is appropriate for the user configuration. It is worth noting that even when a device is not power constrained, high power usage has side effects such as increasing the temperature or the fan noise.
A MediaCapabilitiesDecodingInfo has an associated keySystemAccess which is a MediaKeySystemAccess or null as appropriate.
If the encrypted decoding configuration is supported, the resulting MediaCapabilitiesInfo will include a MediaKeySystemAccess. Authors may use this to create MediaKeys and set up encrypted playback.
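The following non-normative sketch shows one way an author might rank several query results using the three booleans. The scoring, which weighs smooth above powerEfficient, is an assumption for illustration and not a recommendation of this specification. The names pickPreferred and sampleResults are hypothetical.

```javascript
// Hypothetical helper: among supported candidates, prefer smooth playback
// first and power efficiency second. Returns null if nothing is supported.
function pickPreferred(candidates) {
  const score = (info) => (info.smooth ? 2 : 0) + (info.powerEfficient ? 1 : 0);
  const supported = candidates.filter(({ info }) => info.supported);
  if (supported.length === 0) {
    return null;
  }
  // Ties keep the earlier candidate, so order the input by preference.
  return supported.reduce((best, current) =>
    (score(current.info) > score(best.info) ? current : best));
}

// Illustrative results, shaped like MediaCapabilitiesInfo dictionaries.
const sampleResults = [
  { name: '2160p', info: { supported: true, smooth: false, powerEfficient: false } },
  { name: '1080p', info: { supported: true, smooth: true, powerEfficient: true } },
  { name: 'other', info: { supported: false, smooth: false, powerEfficient: false } },
];

console.log(pickPreferred(sampleResults).name);
```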
2.3. Algorithms
2.3.1. Create a MediaCapabilitiesInfo
Given a MediaEncodingConfiguration configuration, this algorithm returns a MediaCapabilitiesInfo. The following steps are run:
- Let info be a new MediaCapabilitiesInfo instance. Unless stated otherwise, reading and writing apply to info for the next steps.
- Set configuration to configuration.
- If the user agent is able to encode the media represented by configuration, set supported to true. Otherwise set it to false.
- If the user agent is able to encode the media represented by configuration at a pace that allows encoding frames at the same pace as they are sent to the encoder, set smooth to true. Otherwise set it to false.
- If the user agent is able to encode the media represented by configuration in a power efficient manner, set powerEfficient to true. Otherwise set it to false. The user agent SHOULD NOT take into consideration the current power source in order to determine the encoding power efficiency unless the device’s power source has side effects such as enabling different encoding modules.
- Return info.
2.3.2. Create a MediaCapabilitiesDecodingInfo
Given a MediaDecodingConfiguration configuration, this algorithm returns a MediaCapabilitiesDecodingInfo. The following steps are run:
- If configuration.keySystemConfiguration is present:
  - Set keySystemAccess to the result of running the Check Encrypted Decoding Support algorithm with configuration.
  - If keySystemAccess is not null, set supported to true. Otherwise set it to false.
- Otherwise, run the following steps:
  - Set keySystemAccess to null.
  - If the user agent is able to decode the media represented by configuration, set supported to true.
  - Otherwise, set it to false.
- If the user agent is able to decode the media represented by configuration at a pace that allows a smooth playback, set smooth to true. Otherwise set it to false.
- If the user agent is able to decode the media represented by configuration in a power efficient manner, set powerEfficient to true. Otherwise set it to false. The user agent SHOULD NOT take into consideration the current power source in order to determine the decoding power efficiency unless the device’s power source has side effects such as enabling different decoding modules.
- Return info.
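The control flow of these steps can be sketched as follows. This is non-normative: the platform object and its four methods are hypothetical stand-ins for user agent internals that this specification does not define as script-visible APIs.

```javascript
// Hypothetical sketch of the steps above. `platform` is an assumed object
// supplying the user agent's internal answers; it is not a real API.
function createDecodingInfo(configuration, platform) {
  const info = {};
  if (configuration.keySystemConfiguration !== undefined) {
    // Encrypted case: support follows from the key system access check.
    info.keySystemAccess = platform.checkEncryptedDecodingSupport(configuration);
    info.supported = info.keySystemAccess !== null;
  } else {
    // Clear case: no key system access, support is plain decode ability.
    info.keySystemAccess = null;
    info.supported = platform.canDecode(configuration);
  }
  info.smooth = platform.canDecodeSmoothly(configuration);
  info.powerEfficient = platform.canDecodePowerEfficiently(configuration);
  return info;
}

// Stub platform used only to exercise the sketch.
const stubPlatform = {
  checkEncryptedDecodingSupport: () => null,
  canDecode: () => true,
  canDecodeSmoothly: () => true,
  canDecodePowerEfficiently: () => false,
};

console.log(createDecodingInfo({ type: 'file', video: {} }, stubPlatform));
```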
2.3.3. Check Encrypted Decoding Support
Given a MediaDecodingConfiguration config with a keySystemConfiguration present, this algorithm returns a MediaKeySystemAccess or null as appropriate. The following steps are run:
- If the keySystem member of config.keySystemConfiguration is not one of the Key Systems supported by the user agent, return null. String comparison is case-sensitive.
- Let origin be the origin of the calling context’s Document.
- Let implementation be the implementation of config.keySystemConfiguration.keySystem.
- Let emeConfiguration be a new MediaKeySystemConfiguration, and initialize it as follows:
  - Set the initDataTypes attribute to a sequence containing config.keySystemConfiguration.initDataType.
  - Set the distinctiveIdentifier attribute to config.keySystemConfiguration.distinctiveIdentifier.
  - Set the persistentState attribute to config.keySystemConfiguration.persistentState.
  - Set the sessionTypes attribute to config.keySystemConfiguration.sessionTypes.
  - If an audio is present in config, set the audioCapabilities attribute to a sequence containing a single MediaKeySystemMediaCapability, initialized as follows:
    - Set the contentType attribute to config.audio.contentType.
    - Set the robustness attribute to config.keySystemConfiguration.audioRobustness.
  - If a video is present in config, set the videoCapabilities attribute to a sequence containing a single MediaKeySystemMediaCapability, initialized as follows:
    - Set the contentType attribute to config.video.contentType.
    - Set the robustness attribute to config.keySystemConfiguration.videoRobustness.
- Let supported configuration be the result of executing the Get Supported Configuration algorithm on implementation, emeConfiguration, and origin.
- If supported configuration is NotSupported, return null and abort these steps.
- Let access be a new MediaKeySystemAccess object, and initialize it as follows:
  - Set the keySystem attribute to emeConfiguration.keySystem.
  - Let the configuration value be supported configuration.
  - Let the cdm implementation value be implementation.
- Return access.
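The mapping from a MediaDecodingConfiguration to the EME dictionary in the emeConfiguration step can be sketched as below. This is non-normative and covers only that one step. The function name toEmeConfiguration and the sample values are assumptions.

```javascript
// Hypothetical helper covering only the "initialize emeConfiguration" step.
function toEmeConfiguration(config) {
  const keySystemConfiguration = config.keySystemConfiguration;
  const emeConfiguration = {
    // The single initDataType is wrapped into a one-element sequence.
    initDataTypes: [keySystemConfiguration.initDataType],
    distinctiveIdentifier: keySystemConfiguration.distinctiveIdentifier,
    persistentState: keySystemConfiguration.persistentState,
    sessionTypes: keySystemConfiguration.sessionTypes,
  };
  if (config.audio !== undefined) {
    emeConfiguration.audioCapabilities = [{
      contentType: config.audio.contentType,
      robustness: keySystemConfiguration.audioRobustness,
    }];
  }
  if (config.video !== undefined) {
    emeConfiguration.videoCapabilities = [{
      contentType: config.video.contentType,
      robustness: keySystemConfiguration.videoRobustness,
    }];
  }
  return emeConfiguration;
}

// Illustrative audio-only input; the values are assumptions.
const sampleEmeConfiguration = toEmeConfiguration({
  type: 'media-source',
  audio: { contentType: 'audio/webm; codecs="opus"' },
  keySystemConfiguration: {
    keySystem: 'org.w3.clearkey',
    initDataType: 'webm',
    audioRobustness: '',
    videoRobustness: '',
    distinctiveIdentifier: 'optional',
    persistentState: 'optional',
    sessionTypes: ['temporary'],
  },
});
console.log(sampleEmeConfiguration);
```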
2.4. Navigator and WorkerNavigator extension
[Exposed=Window]
partial interface Navigator {
  [SameObject] readonly attribute MediaCapabilities mediaCapabilities;
};

[Exposed=Worker]
partial interface WorkerNavigator {
  [SameObject] readonly attribute MediaCapabilities mediaCapabilities;
};

2.5. Media Capabilities Interface
[Exposed=(Window, Worker)]
interface MediaCapabilities {
  [NewObject] Promise<MediaCapabilitiesDecodingInfo> decodingInfo(MediaDecodingConfiguration configuration);
  [NewObject] Promise<MediaCapabilitiesInfo> encodingInfo(MediaEncodingConfiguration configuration);
};
The decodingInfo() method MUST run the following steps:
- If configuration is not a valid MediaDecodingConfiguration, return a Promise rejected with a newly created TypeError.
- If configuration.keySystemConfiguration is present:
  - If the global object is of type WorkerGlobalScope, return a Promise rejected with a newly created TypeError.
  - If the result of running Is the environment settings object settings a secure context? [secure-contexts] with the global object’s relevant settings object is not "Secure", return a Promise rejected with a newly created DOMException whose name is SecurityError.
- Let p be a new promise.
- In parallel, run the Create a MediaCapabilitiesDecodingInfo algorithm with configuration and resolve p with its result.
- Return p.
Note, calling decodingInfo() with a keySystemConfiguration present may have user-visible effects, including requests for user consent. Such calls should only be made when the author intends to create and use a MediaKeys object with the provided configuration.
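A non-normative sketch of calling decodingInfo() defensively follows. The wrapper name queryDecodingInfo is hypothetical, and the MediaCapabilities object is passed in as a parameter so the sketch does not assume a browser environment. In a page, navigator.mediaCapabilities would be passed.

```javascript
// Hypothetical wrapper: resolves with the MediaCapabilitiesDecodingInfo, or
// with null when the API is unavailable or the returned promise rejects.
async function queryDecodingInfo(configuration, mediaCapabilities) {
  if (!mediaCapabilities || typeof mediaCapabilities.decodingInfo !== 'function') {
    return null;
  }
  try {
    return await mediaCapabilities.decodingInfo(configuration);
  } catch (error) {
    // For example a TypeError for an invalid configuration, or a
    // SecurityError when a keySystemConfiguration is used in a context
    // that is not secure.
    console.error('decodingInfo() rejected:', error);
    return null;
  }
}

// In a page: queryDecodingInfo(configuration, navigator.mediaCapabilities)
```

Swallowing the rejection and returning null is a design choice of this sketch. Callers that need to distinguish an invalid configuration from a missing API would rethrow instead.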
The encodingInfo() method MUST run the following steps:
- If configuration is not a valid MediaConfiguration, return a Promise rejected with a newly created TypeError.
- Let p be a new promise.
- In parallel, run the Create a MediaCapabilitiesInfo algorithm with configuration and resolve p with its result.
- Return p.
3. Display Capabilities
This section is still a work in progress and has no shipping implementation. Please review it in detail before implementing it.
3.1. Screen Luminance
interface ScreenLuminance {
  readonly attribute double min;
  readonly attribute double max;
  readonly attribute double maxAverage;
};

The ScreenLuminance object represents the known luminance characteristics of the screen.
The min attribute MUST return the minimal screen luminance that a pixel of the screen can emit in candela per square metre. The minimal screen luminance is the luminance used when showing the darkest color a pixel on the screen can display.
The max attribute MUST return the maximal screen luminance that a pixel of the screen can emit in candela per square metre. The maximal screen luminance is the luminance used when showing the whitest color a pixel on the screen can display.
The maxAverage attribute MUST return the maximal average screen luminance that the screen can emit in candela per square metre. The maximal average screen luminance is the maximal luminance value such that all the pixels of the screen emit the same luminance. The value returned by maxAverage is expected to be different from max as screens usually can’t apply the maximal screen luminance to the entire panel.
3.2. Screen Color Gamut
enum ScreenColorGamut {
  "srgb",
  "p3",
  "rec2020",
};

The ScreenColorGamut represents the color gamut supported by a Screen, that is, the range of colors that the screen can display.
The ScreenColorGamut values are:
3.3. Screen extension
Part of this section is 🐵 patching of the CSSOM View Module. Issue #4 is tracking merging the changes. This partial interface requires the Screen interface to become an EventTarget.

partial interface Screen {
  readonly attribute ScreenColorGamut colorGamut;
  readonly attribute ScreenLuminance? luminance;
  attribute EventHandler onchange;
};

The colorGamut attribute SHOULD return the ScreenColorGamut approximately supported by the screen. In other words, the screen does not need to fully support the given color gamut but needs to be close enough. If the user agent does not know the color gamut supported by the screen, if the supported color gamut is lower than srgb, or if the user agent does not want to expose this information for privacy reasons, it SHOULD return srgb as a default value. The value returned by colorGamut MUST match the value returned by the color-gamut CSS media query.
The luminance attribute SHOULD return a ScreenLuminance object that will expose the luminance characteristics of the screen. If the user agent has no access to the luminance characteristics of the screen, it MUST return null. The user agent MAY also return null if it does not want to expose the luminance information for privacy reasons.
The onchange attribute is an event handler whose corresponding event handler event type is change.
Whenever the user agent is aware that the state of the Screen object has changed, that is, if one of the values exposed on the Screen object or in an object exposed on the Screen object has changed, it MUST queue a task to fire an event named change on Screen.
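The following non-normative sketch shows how a page might choose between HDR and SDR content from these attributes. Since this section has no shipping implementation, the function takes a plain object shaped like Screen rather than reading the global screen. The function name pickVariant and the 1000 cd/m² threshold are assumptions made for illustration only.

```javascript
// Assumed threshold in candela per square metre; not defined by this
// specification.
const ASSUMED_HDR_MIN_PEAK_LUMINANCE = 1000;

// Hypothetical helper: `screenLike` mimics the colorGamut and luminance
// attributes described above. Missing values fall back to the defaults the
// text gives (srgb and null).
function pickVariant(screenLike) {
  const colorGamut = screenLike.colorGamut || 'srgb';
  const luminance = screenLike.luminance || null;
  const wideGamut = colorGamut === 'p3' || colorGamut === 'rec2020';
  const brightEnough =
    luminance !== null && luminance.max >= ASSUMED_HDR_MIN_PEAK_LUMINANCE;
  return wideGamut && brightEnough ? 'hdr' : 'sdr';
}

console.log(pickVariant({ colorGamut: 'srgb', luminance: null }));
```

A page using such a helper would re-run it from a change event handler, since the attributes can change while the page is open.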
4. Security and Privacy Considerations
This specification does not introduce any security-sensitive information or APIs, but it provides easier access to some information that can be used to fingerprint users.
4.1. Decoding/Encoding and Fingerprinting
The information exposed by the decoding/encoding capabilities can already be discovered via experimentation, with the exception that the API will likely provide more accurate and consistent information. This information is expected to have a high correlation with other information already available to web pages, as a given class of device is expected to have very similar decoding/encoding capabilities. In other words, high end devices from a certain year are expected to decode some types of video while older devices may not. Therefore, it is expected that the entropy added with this API isn’t going to be significant.
If an implementation wishes to implement a fingerprint-proof version of this specification, it is recommended to fake a given set of capabilities (e.g. decode up to 1080p VP9) instead of always returning yes or always returning no, as the latter approach could considerably degrade the user’s experience.
4.2. Display and Fingerprinting
The information exposed by the display capabilities can already be accessed via CSS for the most part. The specification also provides default values when the user agent does not wish to expose the feature for privacy reasons.
5. Examples
5.1. Query recording capabilities with encodingInfo()
<script>
  const configuration = {
    type: 'record',
    video: {
      contentType: 'video/webm;codecs=vp8',
      width: 640,
      height: 480,
      bitrate: 10000,
      framerate: '30'
    }
  };
  navigator.mediaCapabilities.encodingInfo(configuration)
    .then((result) => {
      // MediaCapabilitiesInfo has no contentType member, so the content
      // type is read from the queried configuration instead.
      console.log(configuration.video.contentType + ' is:'
          + (result.supported ? '' : ' NOT') + ' supported,'
          + (result.smooth ? '' : ' NOT') + ' smooth and'
          + (result.powerEfficient ? '' : ' NOT') + ' power efficient');
    })
    .catch((err) => {
      console.error(err, ' caused encodingInfo to reject');
    });
</script>