1. Definitions
- Codec
-
Refers generically to an instance of AudioDecoder, AudioEncoder, VideoDecoder, or VideoEncoder.
- Key Chunk
-
An encoded chunk that does not depend on any other frames for decoding. Also commonly referred to as a "key frame".
- Internal Pending Output
-
Codec outputs such as VideoFrames that currently reside in the internal pipeline of the underlying codec implementation. The underlying codec implementation MAY emit new outputs only when new inputs are provided. The underlying codec implementation MUST emit all outputs in response to a flush.
- Codec System Resources
-
Resources including CPU memory, GPU memory, and exclusive handles to specific decoding/encoding hardware that MAY be allocated by the User Agent as part of codec configuration or generation of AudioData and VideoFrame objects. Such resources MAY be quickly exhausted and SHOULD be released immediately when no longer in use.
- Temporal Layer
-
A grouping of EncodedVideoChunks whose timestamp cadence produces a particular framerate. See scalabilityMode.
- Progressive Image
-
An image that supports decoding to multiple levels of detail, with lower levels becoming available while the encoded data is not yet fully buffered.
- Progressive Image Frame Generation
-
A generational identifier for a given Progressive Image decoded output. Each successive generation adds additional detail to the decoded output. The mechanism for computing a frame’s generation is implementer defined.
- Primary Image Track
-
An image track that is marked by the given image file as being the default track. The mechanism for indicating a primary track is format defined.
- RGB Format
-
A VideoPixelFormat containing red, green, and blue color channels in any order or layout (interleaved or planar), and irrespective of whether an alpha channel is present.
- sRGB Color Space
-
A VideoColorSpace object, initialized as follows:
  -
  [[primaries]] is set to bt709,
  -
  [[transfer]] is set to iec61966-2-1,
  -
  [[matrix]] is set to rgb,
  -
  [[full range]] is set to true
-
- Display P3 Color Space
-
A VideoColorSpace object, initialized as follows:
  -
  [[primaries]] is set to smpte432,
  -
  [[transfer]] is set to iec61966-2-1,
  -
  [[matrix]] is set to rgb,
  -
  [[full range]] is set to true
-
- REC709 Color Space
-
A VideoColorSpace object, initialized as follows:
  -
  [[primaries]] is set to bt709,
  -
  [[transfer]] is set to bt709,
  -
  [[matrix]] is set to bt709,
  -
  [[full range]] is set to false
-
- Codec Saturation
-
The state of an underlying codec implementation where the number of active decoding or encoding requests has reached an implementation-specific maximum such that it is temporarily unable to accept more work. The maximum may be any value greater than 1, including infinity (no maximum). While saturated, additional calls to decode() or encode() will be buffered in the control message queue, and will increment the respective decodeQueueSize and encodeQueueSize attributes. The codec implementation will become unsaturated after making sufficient progress on the current workload.
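NOTE: The following non-normative sketch restates the sRGB, Display P3, and REC709 color space definitions above as plain objects. The member names (primaries, transfer, matrix, fullRange) are assumed to mirror the VideoColorSpaceInit dictionary; the ColorSpaceInitSketch interface and the sameColorSpace helper are hypothetical names introduced only for illustration.

```typescript
// Local stand-in for the VideoColorSpaceInit dictionary shape, declared here
// so the sketch does not depend on DOM typings (assumption).
interface ColorSpaceInitSketch {
  primaries: string;
  transfer: string;
  matrix: string;
  fullRange: boolean;
}

// Values copied from the three definitions in this section.
const SRGB_COLOR_SPACE: ColorSpaceInitSketch = {
  primaries: "bt709",
  transfer: "iec61966-2-1",
  matrix: "rgb",
  fullRange: true,
};

const DISPLAY_P3_COLOR_SPACE: ColorSpaceInitSketch = {
  primaries: "smpte432",
  transfer: "iec61966-2-1",
  matrix: "rgb",
  fullRange: true,
};

const REC709_COLOR_SPACE: ColorSpaceInitSketch = {
  primaries: "bt709",
  transfer: "bt709",
  matrix: "bt709",
  fullRange: false,
};

// Hypothetical helper: true when two descriptions match member by member.
function sameColorSpace(a: ColorSpaceInitSketch, b: ColorSpaceInitSketch): boolean {
  return (
    a.primaries === b.primaries &&
    a.transfer === b.transfer &&
    a.matrix === b.matrix &&
    a.fullRange === b.fullRange
  );
}
```

sRGB and Display P3 differ only in their primaries, which is why a member-by-member comparison is enough to tell them apart.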
2. Codec Processing Model
2.1. Background
The codec interfaces defined by the specification are designed such that new codec tasks can be scheduled while previous tasks are still pending. For example, web authors can call decode() without waiting for a previous decode() to complete. This is achieved by offloading underlying codec tasks to a separate parallel queue for parallel execution.
This section describes threading behaviors as they are visible from the perspective of web authors. Implementers can choose to use more threads, as long as the externally visible behaviors of blocking and sequencing are maintained as follows.
2.2. Control Messages
A control message defines a sequence of steps corresponding to a method invocation on a codec instance (e.g. encode()).
A control message queue is a queue of control messages. Each codec instance has a control message queue stored in an internal slot named [[control message queue]].
Queuing a control message means enqueuing the message to a codec’s [[control message queue]]. Invoking codec methods will generally queue a control message to schedule work.
Running a control message means performing a sequence of steps specified by the method that enqueued the message.
The steps of a given control message can block processing later messages in the control message queue. Each codec instance has a boolean internal slot named [[message queue blocked]] that is set to true when this occurs. A blocking message will conclude by setting [[message queue blocked]] to false and rerunning the Process the control message queue steps.
All control messages will return either "processed" or "not processed". Returning "processed" indicates the message steps are being (or have been) executed and the message may be removed from the control message queue. "not processed" indicates the message must not be processed at this time and should remain in the control message queue to be retried later.
To Process the control message queue, run these steps:
-
While [[message queue blocked]] is false and [[control message queue]] is not empty:
  -
  Let front message be the first message in [[control message queue]].
  -
  Let outcome be the result of running the control message steps described by front message.
  -
  If outcome equals "not processed", break.
  -
  Otherwise, dequeue front message from the [[control message queue]].
-
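NOTE: The following non-normative sketch models the Process the control message queue steps above with a plain array and a boolean flag. The class name ControlMessageQueueModel and its members are hypothetical; in particular, the sketch leaves it to the caller to invoke process() after queuing or unblocking, which is an assumption rather than something this section states.

```typescript
type Outcome = "processed" | "not processed";
type ControlMessage = () => Outcome;

class ControlMessageQueueModel {
  // Models [[message queue blocked]].
  messageQueueBlocked = false;
  // Models [[control message queue]].
  private messages: ControlMessage[] = [];

  // Queuing a control message only appends it; it does not run it.
  enqueue(message: ControlMessage): void {
    this.messages.push(message);
  }

  // Number of messages still waiting in the queue.
  get pending(): number {
    return this.messages.length;
  }

  // Process the control message queue.
  process(): void {
    while (!this.messageQueueBlocked && this.messages.length > 0) {
      const frontMessage = this.messages[0];
      const outcome = frontMessage();
      if (outcome === "not processed") {
        // Leave the message at the front so it is retried later.
        break;
      }
      this.messages.shift();
    }
  }
}
```

A message that blocks the queue sets messageQueueBlocked to true and still returns "processed"; the loop then dequeues it and stops, and whoever clears the flag is expected to call process() again.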
2.3. Codec Work Parallel Queue
Each codec instance has an internal slot named [[codec work queue]] that is a parallel queue.
Each codec instance has an internal slot named [[codec implementation]] that refers to the underlying platform encoder or decoder. Except for the initial assignment, any steps that reference [[codec implementation]] will be enqueued to the [[codec work queue]].
Each codec instance has a unique codec task source. Tasks queued from the [[codec work queue]] to the event loop will use the codec task source.
3. AudioDecoder Interface
[Exposed=(Window,DedicatedWorker), SecureContext]
interface AudioDecoder : EventTarget {
  constructor(AudioDecoderInit init);

  readonly attribute CodecState state;
  readonly attribute unsigned long decodeQueueSize;
  attribute EventHandler ondequeue;

  undefined configure(AudioDecoderConfig config);
  undefined decode(EncodedAudioChunk chunk);
  Promise<undefined> flush();
  undefined reset();
  undefined close();

  static Promise<AudioDecoderSupport> isConfigSupported(AudioDecoderConfig config);
};

dictionary AudioDecoderInit {
  required AudioDataOutputCallback output;
  required WebCodecsErrorCallback error;
};

callback AudioDataOutputCallback = undefined(AudioData output);
3.1. Internal Slots
-
[[control message queue]]
-
A queue of control messages to be performed upon this codec instance. See [[control message queue]].
-
[[message queue blocked]]
-
A boolean indicating when processing the [[control message queue]] is blocked by a pending control message. See [[message queue blocked]].
-
[[codec implementation]]
-
Underlying decoder implementation provided by the User Agent. See [[codec implementation]].
-
[[codec work queue]]
-
A parallel queue used for running parallel steps that reference the [[codec implementation]]. See [[codec work queue]].
-
[[codec saturated]]
-
A boolean indicating when the [[codec implementation]] is unable to accept additional decoding work.
-
[[output callback]]
-
Callback given at construction for decoded outputs.
-
[[error callback]]
-
Callback given at construction for decode errors.
-
[[key chunk required]]
-
A boolean indicating that the next chunk passed to decode() MUST describe a key chunk as indicated by [[type]].
-
[[state]]
-
The current CodecState of this AudioDecoder.
-
[[decodeQueueSize]]
-
The number of pending decode requests. This number will decrease as the underlying codec is ready to accept new input.
-
[[pending flush promises]]
-
A list of unresolved promises returned by calls to flush().
-
[[dequeue event scheduled]]
-
A boolean indicating whether a dequeue event is already scheduled to fire. Used to avoid event spam.
3.2. Constructors
AudioDecoder(init)
-
Let d be a new AudioDecoder object.
-
Assign a new queue to [[control message queue]].
-
Assign false to [[message queue blocked]].
-
Assign null to [[codec implementation]].
-
Assign the result of starting a new parallel queue to [[codec work queue]].
-
Assign false to [[codec saturated]].
-
Assign init.output to [[output callback]].
-
Assign init.error to [[error callback]].
-
Assign true to [[key chunk required]].
-
Assign "unconfigured" to [[state]].
-
Assign 0 to [[decodeQueueSize]].
-
Assign a new list to [[pending flush promises]].
-
Assign false to [[dequeue event scheduled]].
-
Return d.
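NOTE: The following non-normative sketch collects the initial slot values assigned by the constructor steps above into one object. The type and function names (SketchDecoderSlots, createDecoderSlots, and so on) are hypothetical, and [[codec work queue]] is left out because a parallel queue has no direct equivalent in this simplified model.

```typescript
type SketchCodecState = "unconfigured" | "configured" | "closed";
type SketchControlMessage = () => "processed" | "not processed";

// Stand-in for AudioDecoderInit: two required callbacks.
interface SketchDecoderInit<T> {
  output: (data: T) => void;
  error: (e: Error) => void;
}

// One field per internal slot listed in the Internal Slots section.
interface SketchDecoderSlots<T> {
  controlMessageQueue: SketchControlMessage[];
  messageQueueBlocked: boolean;
  codecImplementation: object | null;
  codecSaturated: boolean;
  outputCallback: (data: T) => void;
  errorCallback: (e: Error) => void;
  keyChunkRequired: boolean;
  state: SketchCodecState;
  decodeQueueSize: number;
  pendingFlushPromises: Array<Promise<void>>;
  dequeueEventScheduled: boolean;
}

function createDecoderSlots<T>(init: SketchDecoderInit<T>): SketchDecoderSlots<T> {
  return {
    controlMessageQueue: [],
    messageQueueBlocked: false,
    codecImplementation: null,
    // [[codec work queue]] (a parallel queue) is omitted from this sketch.
    codecSaturated: false,
    outputCallback: init.output,
    errorCallback: init.error,
    keyChunkRequired: true, // the first chunk after construction must be a key chunk
    state: "unconfigured",
    decodeQueueSize: 0,
    pendingFlushPromises: [],
    dequeueEventScheduled: false,
  };
}
```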
3.3. Attributes
-
state, of type CodecState, readonly
-
Returns the value of [[state]].
-
decodeQueueSize, of type unsigned long, readonly
-
Returns the value of [[decodeQueueSize]].
-
ondequeue, of type EventHandler
-
An event handler IDL attribute whose event handler event type is dequeue.
3.4. Event Summary
-
dequeue
-
Fired at the AudioDecoder when the decodeQueueSize has decreased.
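NOTE: The following non-normative sketch shows one way an author might use decodeQueueSize together with the dequeue event to apply backpressure. The QueueingDecoder interface is a simplified structural stand-in for the decoder (its ondequeue handler takes no arguments here), and both feedWithBackpressure and the maxQueued threshold are hypothetical application-level choices.

```typescript
// Minimal structural view of the decoder members this sketch relies on.
interface QueueingDecoder<C> {
  readonly decodeQueueSize: number;
  ondequeue: (() => void) | null;
  decode(chunk: C): void;
}

// Hypothetical helper: submits chunks while fewer than maxQueued decode
// requests are pending, and resumes whenever a dequeue event arrives.
function feedWithBackpressure<C>(
  decoder: QueueingDecoder<C>,
  chunks: readonly C[],
  maxQueued: number,
): void {
  let next = 0;
  const pump = (): void => {
    while (next < chunks.length && decoder.decodeQueueSize < maxQueued) {
      decoder.decode(chunks[next]);
      next += 1;
    }
    if (next >= chunks.length) {
      // Everything has been submitted; stop listening.
      decoder.ondequeue = null;
    }
  };
  decoder.ondequeue = pump;
  pump();
}
```

Because dequeue events are coalesced, the handler re-reads decodeQueueSize on every call instead of assuming that exactly one request was dequeued.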
3.5. Methods
-
configure(config)
-
Enqueues a control message to configure the audio decoder for decoding chunks as described by config.
NOTE: This method will trigger a NotSupportedError if the User Agent does not support config. Authors are encouraged to first check support by calling isConfigSupported() with config. User Agents don’t have to support any particular codec type or configuration.
When invoked, run these steps:
-
If config is not a valid AudioDecoderConfig, throw a TypeError.
-
If [[state]] is "closed", throw an InvalidStateError.
-
Set [[state]] to "configured".
-
Set [[key chunk required]] to true.
-
Queue a control message to configure the decoder with config.
Running a control message to configure the decoder means running these steps:
-
Assign true to [[message queue blocked]].
-
Enqueue the following steps to [[codec work queue]]:
  -
  Let supported be the result of running the Check Configuration Support algorithm with config.
  -
  If supported is false, queue a task to run the Close AudioDecoder algorithm with NotSupportedError and abort these steps.
  -
  If needed, assign [[codec implementation]] with an implementation supporting config.
  -
  Configure [[codec implementation]] with config.
  -
  Queue a task to run the following steps:
    -
    Assign false to [[message queue blocked]].
-
Return "processed".
-
decode(chunk)
-
Enqueues a control message to decode the given chunk.
When invoked, run these steps:
-
If [[state]] is not "configured", throw an InvalidStateError.
-
If [[key chunk required]] is true:
  -
  If chunk’s [[type]] is not key, throw a DataError.
  -
  Implementers SHOULD inspect the chunk’s [[internal data]] to verify that it is truly a key chunk. If a mismatch is detected, throw a DataError.
  -
  Otherwise, assign false to [[key chunk required]].
-
Increment [[decodeQueueSize]].
-
Queue a control message to decode the chunk.
Running a control message to decode the chunk means performing these steps:
-
If [[codec saturated]] equals true, return "not processed".
-
If decoding chunk will cause the [[codec implementation]] to become saturated, assign true to [[codec saturated]].
-
Decrement [[decodeQueueSize]] and run the Schedule Dequeue Event algorithm.
-
Enqueue the following steps to the [[codec work queue]]:
  -
  Attempt to use [[codec implementation]] to decode the chunk.
  -
  If decoding results in an error, queue a task to run the Close AudioDecoder algorithm with EncodingError and return.
  -
  If [[codec saturated]] equals true and [[codec implementation]] is no longer saturated, queue a task to perform the following steps:
    -
    Assign false to [[codec saturated]].
  -
  Let decoded outputs be a list of decoded audio data outputs emitted by [[codec implementation]].
  -
  If decoded outputs is not empty, queue a task to run the Output AudioData algorithm with decoded outputs.
-
Return "processed".
-
flush()
-
Completes all control messages in the control message queue and emits all outputs.
When invoked, run these steps:
-
If [[state]] is not "configured", return a promise rejected with InvalidStateError DOMException.
-
Set [[key chunk required]] to true.
-
Let promise be a new Promise.
-
Append promise to [[pending flush promises]].
-
Queue a control message to flush the codec with promise.
-
Return promise.
Running a control message to flush the codec means performing these steps with promise:
-
Enqueue the following steps to the [[codec work queue]]:
  -
  Signal [[codec implementation]] to emit all internal pending outputs.
  -
  Let decoded outputs be a list of decoded audio data outputs emitted by [[codec implementation]].
  -
  Queue a task to perform these steps:
    -
    If decoded outputs is not empty, run the Output AudioData algorithm with decoded outputs.
    -
    Remove promise from [[pending flush promises]].
    -
    Resolve promise.
-
Return "processed".
-
reset()
-
Immediately resets all state including configuration, control messages in the control message queue, and all pending callbacks.
When invoked, run the Reset AudioDecoder algorithm with an AbortError DOMException.
-
close()
-
Immediately aborts all pending work and releases system resources. Close is final.
When invoked, run the Close AudioDecoder algorithm with an AbortError DOMException.
-
isConfigSupported(config)
-
Returns a promise indicating whether the provided config is supported by the User Agent.
NOTE: The returned AudioDecoderSupport config will contain only the dictionary members that the User Agent recognized. Unrecognized dictionary members will be ignored. Authors can detect unrecognized dictionary members by comparing config to their provided config.
When invoked, run these steps:
-
If config is not a valid AudioDecoderConfig, return a promise rejected with TypeError.
-
Let p be a new Promise.
-
Let checkSupportQueue be the result of starting a new parallel queue.
-
Enqueue the following steps to checkSupportQueue:
  -
  Let supported be the result of running the Check Configuration Support algorithm with config.
  -
  Queue a task to run the following steps:
    -
    Let decoderSupport be a newly constructed AudioDecoderSupport, initialized as follows:
      -
      Set config to the result of running the Clone Configuration algorithm with config.
      -
      Set supported to supported.
    -
    Resolve p with decoderSupport.
-
Return p.
-
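NOTE: The following non-normative sketch illustrates the comparison suggested in the isConfigSupported() note above: finding members of the provided config that are missing from the config echoed back in the support result. The helper name unrecognizedMembers is hypothetical, and the sketch only inspects top-level members.

```typescript
// Hypothetical helper: returns the names of top-level members that appear in
// the provided config but not in the config returned by isConfigSupported().
function unrecognizedMembers(provided: object, returned: object): string[] {
  return Object.keys(provided).filter((key) => !(key in returned));
}

// Possible use in a page (shown as a comment because it needs a User Agent):
//   const support = await AudioDecoder.isConfigSupported(myConfig);
//   const ignored = unrecognizedMembers(myConfig, support.config);
//   if (support.supported && ignored.length > 0) { /* decide how to proceed */ }
```

A non-empty result does not mean the config is unsupported; it only signals that the User Agent ignored those members when answering.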
3.6. Algorithms
- Schedule Dequeue Event
-
-
If [[dequeue event scheduled]] equals true, return.
-
Assign true to [[dequeue event scheduled]].
-
Queue a task to run the following steps:
  -
  Fire a simple event named dequeue at this.
  -
  Assign false to [[dequeue event scheduled]].
- Output AudioData (with outputs)
-
Run these steps:
-
For each output in outputs:
  -
  Let data be an AudioData, initialized as follows:
    -
    Assign false to [[Detached]].
    -
    Let resource be the media resource described by output.
    -
    Let resourceReference be a reference to resource.
    -
    Assign resourceReference to [[resource reference]].
    -
    Let timestamp be the [[timestamp]] of the EncodedAudioChunk associated with output.
    -
    Assign timestamp to [[timestamp]].
    -
    If output uses a recognized AudioSampleFormat, assign that format to [[format]]. Otherwise, assign null to [[format]].
    -
    Assign values to [[sample rate]], [[number of frames]], and [[number of channels]] as determined by output.
  -
  Invoke [[output callback]] with data.
- Reset AudioDecoder (with exception)
-
Run these steps:
-
If [[state]] is "closed", throw an InvalidStateError.
-
Set [[state]] to "unconfigured".
-
Signal [[codec implementation]] to cease producing output for the previous configuration.
-
Remove all control messages from the [[control message queue]].
-
If [[decodeQueueSize]] is greater than zero:
  -
  Set [[decodeQueueSize]] to zero.
  -
  Run the Schedule Dequeue Event algorithm.
-
For each promise in [[pending flush promises]]:
  -
  Reject promise with exception.
  -
  Remove promise from [[pending flush promises]].
- Close AudioDecoder (with exception)
-
Run these steps:
-
Run the Reset AudioDecoder algorithm with exception.
-
Set [[state]] to "closed".
-
Clear [[codec implementation]] and release associated system resources.
-
If exception is not an AbortError DOMException, invoke the [[error callback]] with exception.
-
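NOTE: The following non-normative sketch models the Reset AudioDecoder and Close AudioDecoder algorithms above on a plain object. All names (ResettableSlots, resetDecoder, closeDecoder, namedError) are hypothetical. Plain Error objects with a name property stand in for DOMException, pending flush promises are represented only by their reject functions, and the steps that signal or clear [[codec implementation]] are omitted.

```typescript
type DecoderStateSketch = "unconfigured" | "configured" | "closed";

interface PendingFlushSketch {
  reject: (reason: Error) => void;
}

interface ResettableSlots {
  state: DecoderStateSketch;
  controlMessages: unknown[];
  decodeQueueSize: number;
  pendingFlushes: PendingFlushSketch[];
  errorCallback: (e: Error) => void;
  scheduleDequeueEvent: () => void;
}

// Stand-in for a DOMException with the given name (assumption).
function namedError(name: string, message: string): Error {
  const error = new Error(message);
  error.name = name;
  return error;
}

function resetDecoder(slots: ResettableSlots, exception: Error): void {
  if (slots.state === "closed") {
    throw namedError("InvalidStateError", "codec is closed");
  }
  slots.state = "unconfigured";
  // Remove all control messages from the queue.
  slots.controlMessages.length = 0;
  if (slots.decodeQueueSize > 0) {
    slots.decodeQueueSize = 0;
    slots.scheduleDequeueEvent();
  }
  // Reject and forget every pending flush.
  for (const pending of slots.pendingFlushes) {
    pending.reject(exception);
  }
  slots.pendingFlushes.length = 0;
}

function closeDecoder(slots: ResettableSlots, exception: Error): void {
  resetDecoder(slots, exception);
  slots.state = "closed";
  // An AbortError marks an author-initiated close, so it is not reported.
  if (exception.name !== "AbortError") {
    slots.errorCallback(exception);
  }
}
```

Since Close runs Reset first, closing an already closed codec surfaces the InvalidStateError thrown by Reset.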
4. VideoDecoder Interface
[Exposed=(Window,DedicatedWorker), SecureContext]
interface VideoDecoder : EventTarget {
  constructor(VideoDecoderInit init);

  readonly attribute CodecState state;
  readonly attribute unsigned long decodeQueueSize;
  attribute EventHandler ondequeue;

  undefined configure(VideoDecoderConfig config);
  undefined decode(EncodedVideoChunk chunk);
  Promise<undefined> flush();
  undefined reset();
  undefined close();

  static Promise<VideoDecoderSupport> isConfigSupported(VideoDecoderConfig config);
};

dictionary VideoDecoderInit {
  required VideoFrameOutputCallback output;
  required WebCodecsErrorCallback error;
};

callback VideoFrameOutputCallback = undefined(VideoFrame output);
4.1. Internal Slots
-
[[control message queue]]
-
A queue of control messages to be performed upon this codec instance. See [[control message queue]].
-
[[message queue blocked]]
-
A boolean indicating when processing the [[control message queue]] is blocked by a pending control message. See [[message queue blocked]].
-
[[codec implementation]]
-
Underlying decoder implementation provided by the User Agent. See [[codec implementation]].
-
[[codec work queue]]
-
A parallel queue used for running parallel steps that reference the [[codec implementation]]. See [[codec work queue]].
-
[[codec saturated]]
-
A boolean indicating when the [[codec implementation]] is unable to accept additional decoding work.
-
[[output callback]]
-
Callback given at construction for decoded outputs.
-
[[error callback]]
-
Callback given at construction for decode errors.
-
[[active decoder config]]
-
The VideoDecoderConfig that is actively applied.
-
[[key chunk required]]
-
A boolean indicating that the next chunk passed to decode() MUST describe a key chunk as indicated by type.
-
[[state]]
-
The current CodecState of this VideoDecoder.
-
[[decodeQueueSize]]
-
The number of pending decode requests. This number will decrease as the underlying codec is ready to accept new input.
-
[[pending flush promises]]
-
A list of unresolved promises returned by calls to flush().
-
[[dequeue event scheduled]]
-
A boolean indicating whether a dequeue event is already scheduled to fire. Used to avoid event spam.
4.2. Constructors
VideoDecoder(init)
-
Let d be a new VideoDecoder object.
-
Assign a new queue to [[control message queue]].
-
Assign false to [[message queue blocked]].
-
Assign null to [[codec implementation]].
-
Assign the result of starting a new parallel queue to [[codec work queue]].
-
Assign false to [[codec saturated]].
-
Assign init.output to [[output callback]].
-
Assign init.error to [[error callback]].
-
Assign null to [[active decoder config]].
-
Assign true to [[key chunk required]].
-
Assign "unconfigured" to [[state]].
-
Assign 0 to [[decodeQueueSize]].
-
Assign a new list to [[pending flush promises]].
-
Assign false to [[dequeue event scheduled]].
-
Return d.
4.3. Attributes
-
state, of type CodecState, readonly
-
Returns the value of [[state]].
-
decodeQueueSize, of type unsigned long, readonly
-
Returns the value of [[decodeQueueSize]].
-
ondequeue, of type EventHandler
-
An event handler IDL attribute whose event handler event type is dequeue.
4.4. Event Summary
-
dequeue
-
Fired at the VideoDecoder when the decodeQueueSize has decreased.
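NOTE: The following non-normative sketch models how the Schedule Dequeue Event algorithm coalesces several queue-size decreases into a single dequeue event. The DequeueScheduler class is hypothetical: an array of callbacks stands in for the event loop's task queue, a counter stands in for event delivery, and the queued task is assumed to fire the event before clearing the flag.

```typescript
class DequeueScheduler {
  // Models [[dequeue event scheduled]].
  private dequeueEventScheduled = false;
  // Stand-in for tasks queued on the event loop.
  readonly tasks: Array<() => void> = [];
  // Number of dequeue events delivered so far.
  fired = 0;

  // Schedule Dequeue Event.
  schedule(): void {
    if (this.dequeueEventScheduled) {
      return; // an event is already pending, so do nothing
    }
    this.dequeueEventScheduled = true;
    this.tasks.push(() => {
      this.fired += 1; // stands in for firing the dequeue event
      this.dequeueEventScheduled = false;
    });
  }

  // Runs every queued task in order, as the event loop eventually would.
  runTasks(): void {
    while (this.tasks.length > 0) {
      const task = this.tasks.shift();
      if (task) {
        task();
      }
    }
  }
}
```

Three calls to schedule() before the event loop runs therefore produce one event, which is why handlers are better off reading decodeQueueSize than counting events.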
4.5. Methods
-
configure(config)
-
Enqueues a control message to configure the video decoder for decoding chunks as described by config.
NOTE: This method will trigger a NotSupportedError if the User Agent does not support config. Authors are encouraged to first check support by calling isConfigSupported() with config. User Agents don’t have to support any particular codec type or configuration.
When invoked, run these steps:
-
If config is not a valid VideoDecoderConfig, throw a TypeError.
-
If [[state]] is "closed", throw an InvalidStateError.
-
Set [[state]] to "configured".
-
Set [[key chunk required]] to true.
-
Queue a control message to configure the decoder with config.
Running a control message to configure the decoder means running these steps:
-
Assign true to [[message queue blocked]].
-
Enqueue the following steps to [[codec work queue]]:
  -
  Let supported be the result of running the Check Configuration Support algorithm with config.
  -
  If supported is false, queue a task to run the Close VideoDecoder algorithm with NotSupportedError and abort these steps.
  -
  If needed, assign [[codec implementation]] with an implementation supporting config.
  -
  Configure [[codec implementation]] with config.
  -
  Queue a task to run the following steps:
    -
    Assign false to [[message queue blocked]].
-
Return "processed".
-
decode(chunk)
-
Enqueues a control message to decode the given chunk.
NOTE: Authors are encouraged to call close() on output VideoFrames immediately when frames are no longer needed. The underlying media resources are owned by the VideoDecoder and failing to release them (or waiting for garbage collection) can cause decoding to stall.
NOTE: VideoDecoder requires that frames are output in the order in which they are expected to be presented, commonly known as presentation order. When using some [[codec implementation]]s the User Agent will have to reorder outputs into presentation order.
When invoked, run these steps:
-
If [[state]] is not "configured", throw an InvalidStateError.
-
If [[key chunk required]] is true:
  -
  If chunk’s type is not key, throw a DataError.
  -
  Implementers SHOULD inspect the chunk’s [[internal data]] to verify that it is truly a key chunk. If a mismatch is detected, throw a DataError.
  -
  Otherwise, assign false to [[key chunk required]].
-
Increment [[decodeQueueSize]].
-
Queue a control message to decode the chunk.
Running a control message to decode the chunk means performing these steps:
-
If [[codec saturated]] equals true, return "not processed".
-
If decoding chunk will cause the [[codec implementation]] to become saturated, assign true to [[codec saturated]].
-
Decrement [[decodeQueueSize]] and run the Schedule Dequeue Event algorithm.
-
Enqueue the following steps to the [[codec work queue]]:
  -
  Attempt to use [[codec implementation]] to decode the chunk.
  -
  If decoding results in an error, queue a task to run the Close VideoDecoder algorithm with EncodingError and return.
  -
  If [[codec saturated]] equals true and [[codec implementation]] is no longer saturated, queue a task to perform the following steps:
    -
    Assign false to [[codec saturated]].
  -
  Let decoded outputs be a list of decoded video data outputs emitted by [[codec implementation]] in presentation order.
  -
  If decoded outputs is not empty, queue a task to run the Output VideoFrames algorithm with decoded outputs.
-
Return "processed".
-
flush()
-
Completes all control messages in the control message queue and emits all outputs.
When invoked, run these steps:
-
If [[state]] is not "configured", return a promise rejected with InvalidStateError DOMException.
-
Set [[key chunk required]] to true.
-
Let promise be a new Promise.
-
Append promise to [[pending flush promises]].
-
Queue a control message to flush the codec with promise.
-
Return promise.
Running a control message to flush the codec means performing these steps with promise:
-
Enqueue the following steps to the [[codec work queue]]:
  -
  Signal [[codec implementation]] to emit all internal pending outputs.
  -
  Let decoded outputs be a list of decoded video data outputs emitted by [[codec implementation]].
  -
  Queue a task to perform these steps:
    -
    If decoded outputs is not empty, run the Output VideoFrames algorithm with decoded outputs.
    -
    Remove promise from [[pending flush promises]].
    -
    Resolve promise.
-
Return "processed".
-
reset()
-
Immediately resets all state including configuration, control messages in the control message queue, and all pending callbacks.
When invoked, run the Reset VideoDecoder algorithm with an AbortError DOMException.
-
close()
-
Immediately aborts all pending work and releases system resources. Close is final.
When invoked, run the Close VideoDecoder algorithm with an AbortError DOMException.
-
isConfigSupported(config)
-
Returns a promise indicating whether the provided config is supported by the User Agent.
NOTE: The returned VideoDecoderSupport config will contain only the dictionary members that the User Agent recognized. Unrecognized dictionary members will be ignored. Authors can detect unrecognized dictionary members by comparing config to their provided config.
When invoked, run these steps:
-
If config is not a valid VideoDecoderConfig, return a promise rejected with TypeError.
-
Let p be a new Promise.
-
Let checkSupportQueue be the result of starting a new parallel queue.
-
Enqueue the following steps to checkSupportQueue:
  -
  Let supported be the result of running the Check Configuration Support algorithm with config.
  -
  Queue a task to run the following steps:
    -
    Let decoderSupport be a newly constructed VideoDecoderSupport, initialized as follows:
      -
      Set config to the result of running the Clone Configuration algorithm with config.
      -
      Set supported to supported.
    -
    Resolve p with decoderSupport.
-
Return p.
-
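NOTE: The following non-normative sketch illustrates the advice in the decode() note about closing output frames promptly. The closingOutput wrapper and the ClosableFrame interface are hypothetical; the wrapper builds an output callback that closes each frame as soon as the supplied function has finished with it, even if that function throws.

```typescript
// Minimal structural view of a frame: only close() is needed here.
interface ClosableFrame {
  close(): void;
}

// Hypothetical wrapper: returns an output callback that hands the frame to
// `use` and then always closes it.
function closingOutput<F extends ClosableFrame>(use: (frame: F) => void): (frame: F) => void {
  return (frame: F): void => {
    try {
      use(frame);
    } finally {
      frame.close();
    }
  };
}

// Possible use in a page (shown as a comment because it needs a User Agent
// and a 2D canvas context named ctx, both assumptions):
//   const decoder = new VideoDecoder({
//     output: closingOutput((frame) => ctx.drawImage(frame, 0, 0)),
//     error: (e) => console.error(e),
//   });
```

This pattern only fits callbacks that finish with the frame synchronously; code that keeps a frame for later has to call close() itself once it is done.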
4.6. Algorithms
- Schedule Dequeue Event
-
-
If [[dequeue event scheduled]] equals true, return.
-
Assign true to [[dequeue event scheduled]].
-
Queue a task to run the following steps:
  -
  Fire a simple event named dequeue at this.
  -
  Assign false to [[dequeue event scheduled]].
- Output VideoFrames (with outputs)
-
Run these steps:
-
For each output in outputs:
  -
  Let timestamp and duration be the timestamp and duration from the EncodedVideoChunk associated with output.
  -
  Let displayAspectWidth and displayAspectHeight be undefined.
  -
  If displayAspectWidth and displayAspectHeight exist in the [[active decoder config]], assign their values to displayAspectWidth and displayAspectHeight respectively.
  -
  Let colorSpace be the VideoColorSpace for output as detected by the codec implementation. If no VideoColorSpace is detected, let colorSpace be undefined.
  NOTE: The codec implementation can detect a VideoColorSpace by analyzing the bitstream. Detection is made on a best-effort basis. The exact method of detection is implementer defined and codec-specific. Authors can override the detected VideoColorSpace by providing a colorSpace in the VideoDecoderConfig.
  -
  If colorSpace exists in the [[active decoder config]], assign its value to colorSpace.
  -
  Assign the values of rotation and flip in the [[active decoder config]] to rotation and flip respectively.
  -
  Let frame be the result of running the Create a VideoFrame algorithm with output, timestamp, duration, displayAspectWidth, displayAspectHeight, colorSpace, rotation, and flip.
  -
  Invoke [[output callback]] with frame.
- Reset VideoDecoder (with exception)
-
Run these steps:
-
If [[state]] is "closed", throw an InvalidStateError.
-
Set [[state]] to "unconfigured".
-
Signal [[codec implementation]] to cease producing output for the previous configuration.
-
Remove all control messages from the [[control message queue]].
-
If [[decodeQueueSize]] is greater than zero:
  -
  Set [[decodeQueueSize]] to zero.
  -
  Run the Schedule Dequeue Event algorithm.
-
For each promise in [[pending flush promises]]:
  -
  Reject promise with exception.
  -
  Remove promise from [[pending flush promises]].
- Close VideoDecoder (with exception)
-
Run these steps:
-
Run the Reset VideoDecoder algorithm with exception.
-
Set [[state]] to "closed".
-
Clear [[codec implementation]] and release associated system resources.
-
If exception is not an AbortError DOMException, invoke the [[error callback]] with exception.
-
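NOTE: The following non-normative sketch illustrates the precedence applied by the Output VideoFrames algorithm above: values present in the active decoder config win over values detected by the codec implementation. The names resolveFrameMetadata, ActiveConfigSketch, and FrameMetadataSketch are hypothetical, the color space type is left generic, and timestamp, duration, rotation, and flip handling are omitted.

```typescript
// The subset of the active decoder config this sketch looks at.
interface ActiveConfigSketch<CS> {
  displayAspectWidth?: number;
  displayAspectHeight?: number;
  colorSpace?: CS;
}

// The values that would be passed on when creating the frame.
interface FrameMetadataSketch<CS> {
  displayAspectWidth: number | undefined;
  displayAspectHeight: number | undefined;
  colorSpace: CS | undefined;
}

function resolveFrameMetadata<CS>(
  detectedColorSpace: CS | undefined,
  config: ActiveConfigSketch<CS>,
): FrameMetadataSketch<CS> {
  // Display aspect values start undefined and are only taken from the config
  // when both are present.
  let displayAspectWidth: number | undefined = undefined;
  let displayAspectHeight: number | undefined = undefined;
  if (config.displayAspectWidth !== undefined && config.displayAspectHeight !== undefined) {
    displayAspectWidth = config.displayAspectWidth;
    displayAspectHeight = config.displayAspectHeight;
  }
  // Start from the detected color space, then let the config override it.
  let colorSpace = detectedColorSpace;
  if (config.colorSpace !== undefined) {
    colorSpace = config.colorSpace;
  }
  return { displayAspectWidth, displayAspectHeight, colorSpace };
}
```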
5. AudioEncoder Interface
[Exposed=(Window,DedicatedWorker), SecureContext]
interface AudioEncoder : EventTarget {
  constructor(AudioEncoderInit init);

  readonly attribute CodecState state;
  readonly attribute unsigned long encodeQueueSize;
  attribute EventHandler ondequeue;

  undefined configure(AudioEncoderConfig config);
  undefined encode(AudioData data);
  Promise<undefined> flush();
  undefined reset();
  undefined close();

  static Promise<AudioEncoderSupport> isConfigSupported(AudioEncoderConfig config);
};

dictionary AudioEncoderInit {
  required EncodedAudioChunkOutputCallback output;
  required WebCodecsErrorCallback error;
};

callback EncodedAudioChunkOutputCallback =
    undefined(EncodedAudioChunk output,
              optional EncodedAudioChunkMetadata metadata = {});
5.1. Internal Slots
-
[[control message queue]]
-
A queue of control messages to be performed upon this codec instance. See [[control message queue]].
-
[[message queue blocked]]
-
A boolean indicating when processing the [[control message queue]] is blocked by a pending control message. See [[message queue blocked]].
-
[[codec implementation]]
-
Underlying encoder implementation provided by the User Agent. See [[codec implementation]].
-
[[codec work queue]]
-
A parallel queue used for running parallel steps that reference the [[codec implementation]]. See [[codec work queue]].
-
[[codec saturated]]
-
A boolean indicating when the [[codec implementation]] is unable to accept additional encoding work.
-
[[output callback]]
-
Callback given at construction for encoded outputs.
-
[[error callback]]
-
Callback given at construction for encode errors.
-
[[active encoder config]]
-
The AudioEncoderConfig that is actively applied.
-
[[active output config]]
-
The AudioDecoderConfig that describes how to decode the most recently emitted EncodedAudioChunk.
-
[[state]]
-
The current CodecState of this AudioEncoder.
-
[[encodeQueueSize]]
-
The number of pending encode requests. This number will decrease as the underlying codec is ready to accept new input.
-
[[pending flush promises]]
-
A list of unresolved promises returned by calls to flush().
-
[[dequeue event scheduled]]
-
A boolean indicating whether a dequeue event is already scheduled to fire. Used to avoid event spam.
5.2. Constructors
AudioEncoder(init)
-
Let e be a new AudioEncoder object.
-
Assign a new queue to [[control message queue]].
-
Assign false to [[message queue blocked]].
-
Assign null to [[codec implementation]].
-
Assign the result of starting a new parallel queue to [[codec work queue]].
-
Assign false to [[codec saturated]].
-
Assign init.output to [[output callback]].
-
Assign init.error to [[error callback]].
-
Assign null to [[active encoder config]].
-
Assign null to [[active output config]].
-
Assign "unconfigured" to [[state]].
-
Assign 0 to [[encodeQueueSize]].
-
Assign a new list to [[pending flush promises]].
-
Assign false to [[dequeue event scheduled]].
-
Return e.
5.3. Attributes
-
state, of type CodecState, readonly
-
Returns the value of [[state]].
-
encodeQueueSize, of type unsigned long, readonly
-
Returns the value of [[encodeQueueSize]].
-
ondequeue, of type EventHandler
-
An event handler IDL attribute whose event handler event type is dequeue.
5.4. Event Summary
-
dequeue
-
Fired at the AudioEncoder when the encodeQueueSize has decreased.
5.5. Methods
-
configure(config)
-
Enqueues
a
control
message
to
configure
the
audio
encoder
for
encoding
audio
data
as
described
by
config
.
NOTE: This method will trigger a
NotSupportedError
if the User Agent does not support config . Authors are encouraged to first check support by callingisConfigSupported()
with config . User Agents don’t have to support any particular codec type or configuration.When invoked, run these steps:
-
If config is not a valid AudioEncoderConfig , throw a
TypeError
. -
If
[[state]]
is"closed"
, throw anInvalidStateError
. -
Set
[[state]]
to"configured"
. -
Queue a control message to configure the encoder using config .
Running a control message to configure the encoder means performing these steps:
-
Assign
true
to[[message queue blocked]]
. -
Enqueue the following steps to
[[codec work queue]]
:-
Let supported be the result of running the Check Configuration Support algorithm with config .
-
If supported is
false
, queue a task to run the Close AudioEncoder algorithm withNotSupportedError
and abort these steps. -
If needed, assign
[[codec implementation]]
with an implementation supporting config . -
Configure
[[codec implementation]]
with config . -
Queue a task to run the following steps:
-
Assign
false
to[[message queue blocked]]
.
-
-
-
Return
"processed"
.
-
-
encode(data)
-
Enqueues a control message to encode the given data.
When invoked, run these steps:
-
If the value of data ’s
[[Detached]]
internal slot istrue
, throw aTypeError
. -
If
[[state]]
is not"configured"
, throw anInvalidStateError
. -
Let dataClone hold the result of running the Clone AudioData algorithm with data .
-
Increment
[[encodeQueueSize]]
. -
Queue a control message to encode dataClone .
Running a control message to encode the data means performing these steps:
-
If
[[codec saturated]]
equalstrue
, return"not processed"
. -
If encoding data will cause the
[[codec implementation]]
to become saturated , assigntrue
to[[codec saturated]]
. -
Decrement
[[encodeQueueSize]]
and run the Schedule Dequeue Event algorithm. -
Enqueue the following steps to the
[[codec work queue]]
:-
Attempt to use
[[codec implementation]]
to encode the media resource described by dataClone . -
If encoding results in an error, queue a task to run the Close AudioEncoder algorithm with
EncodingError
and return. -
If
[[codec saturated]]
equalstrue
and[[codec implementation]]
is no longer saturated , queue a task to perform the following steps:-
Assign
false
to[[codec saturated]]
.
-
-
Let encoded outputs be a list of encoded audio data outputs emitted by
[[codec implementation]]
. -
If encoded outputs is not empty, queue a task to run the Output EncodedAudioChunks algorithm with encoded outputs .
-
-
Return
"processed"
.
-
-
flush()
-
Completes all control messages in the control message queue and emits all outputs.
When invoked, run these steps:
-
If
[[state]]
is not"configured"
, return a promise rejected withInvalidStateError
DOMException
. -
Let promise be a new Promise.
-
Append promise to
[[pending flush promises]]
. -
Queue a control message to flush the codec with promise .
-
Return promise .
Running a control message to flush the codec means performing these steps with promise .
-
Enqueue the following steps to the
[[codec work queue]]
:-
Signal
[[codec implementation]]
to emit all internal pending outputs . -
Let encoded outputs be a list of encoded audio data outputs emitted by
[[codec implementation]]
. -
Queue a task to perform these steps:
-
If encoded outputs is not empty, run the Output EncodedAudioChunks algorithm with encoded outputs .
-
Remove promise from
[[pending flush promises]]
. -
Resolve promise .
-
-
-
Return
"processed"
.
-
-
reset()
-
Immediately resets all state including configuration, control messages in the control message queue, and all pending callbacks.
When invoked, run the Reset AudioEncoder algorithm with an
AbortError
DOMException
. -
close()
-
Immediately aborts all pending work and releases system resources. Close is final.
When invoked, run the Close AudioEncoder algorithm with an
AbortError
DOMException
. -
isConfigSupported(config)
-
Returns a promise indicating whether the provided config is supported by the User Agent.
NOTE: The returned
AudioEncoderSupport
config
will contain only the dictionary members that the User Agent recognized. Unrecognized dictionary members will be ignored. Authors can detect unrecognized dictionary members by comparing config
to their provided config .
When invoked, run these steps:
-
If config is not a valid AudioEncoderConfig , return a promise rejected with
TypeError
. -
Let p be a new Promise.
-
Let checkSupportQueue be the result of starting a new parallel queue .
-
Enqueue the following steps to checkSupportQueue :
-
Let supported be the result of running the Check Configuration Support algorithm with config .
-
Queue a task to run the following steps:
-
Let encoderSupport be a newly constructed
AudioEncoderSupport
, initialized as follows:-
Set
config
to the result of running the Clone Configuration algorithm with config . -
Set
supported
to supported .
-
-
Resolve p with encoderSupport .
-
-
-
Return p .
-
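The way encode() requests wait behind a saturated codec can be modelled outside a browser. The following TypeScript sketch is illustrative only: `ModelEncoder`, its `capacity` threshold, and `completeOne()` are hypothetical names that are not part of this specification, and the loop in `processQueue()` assumes that control messages run in order and that processing stops at the first message reporting "not processed".

```typescript
type ModelOutcome = "processed" | "not processed";
type ModelControlMessage = () => ModelOutcome;

class ModelEncoder {
  queue: ModelControlMessage[] = []; // models [[control message queue]]
  saturated = false;                 // models [[codec saturated]]
  encodeQueueSize = 0;               // models [[encodeQueueSize]]
  inFlight = 0;                      // inputs currently held by the model codec
  capacity: number;                  // hypothetical saturation threshold
  accepted: string[] = [];           // labels handed to the model codec, in order

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Models encode(): count the request, then queue a control message.
  encode(label: string): void {
    this.encodeQueueSize++;
    this.queue.push((): ModelOutcome => {
      if (this.saturated) return "not processed";
      this.inFlight++;
      if (this.inFlight >= this.capacity) this.saturated = true;
      this.encodeQueueSize--;
      this.accepted.push(label);
      return "processed";
    });
  }

  // Assumed loop: run messages in order and stop at the first "not processed".
  processQueue(): void {
    for (;;) {
      const front = this.queue[0];
      if (front === undefined || front() === "not processed") return;
      this.queue.shift();
    }
  }

  // Models the codec finishing one input, which can clear saturation.
  completeOne(): void {
    if (this.inFlight > 0) this.inFlight--;
    if (this.inFlight < this.capacity) this.saturated = false;
    this.processQueue();
  }
}
```

In this model a request that arrives while the codec is saturated stays at the front of the queue and keeps contributing to `encodeQueueSize` until the codec makes progress.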
5.6. Algorithms
- Schedule Dequeue Event
-
-
If
[[dequeue event scheduled]]
equalstrue
, return. -
Assign
true
to[[dequeue event scheduled]]
. -
Queue a task to run the following steps:
-
Fire a simple event named
dequeue
at this
AudioEncoder
. -
Assign
false
to[[dequeue event scheduled]]
.
-
- Output EncodedAudioChunks (with outputs )
-
Run these steps:
-
For each output in outputs :
-
Let chunkInit be an
EncodedAudioChunkInit
with the following keys: -
Let chunk be a new
EncodedAudioChunk
constructed with chunkInit . -
Let chunkMetadata be a new
EncodedAudioChunkMetadata
. -
Let encoderConfig be the
[[active encoder config]]
. -
Let outputConfig be a new
AudioDecoderConfig
that describes output . Initialize outputConfig as follows:-
Assign encoderConfig .
sampleRate
to outputConfig .sampleRate
. -
Assign encoderConfig .
numberOfChannels
to outputConfig .numberOfChannels
. -
Assign outputConfig .
description
with a sequence of codec specific bytes as determined by the [[codec implementation]]
. The User Agent MUST ensure that the provided description could be used to correctly decode output .
NOTE: The codec specific requirements for populating the
description
are described in the [WEBCODECS-CODEC-REGISTRY] .
-
If outputConfig and
[[active output config]]
are not equal dictionaries :-
Assign outputConfig to chunkMetadata .
decoderConfig
. -
Assign outputConfig to
[[active output config]]
.
-
-
Invoke
[[output callback]]
with chunk and chunkMetadata .
-
-
- Reset AudioEncoder (with exception )
-
Run these steps:
-
If
[[state]]
is"closed"
, throw anInvalidStateError
. -
Set
[[state]]
to"unconfigured"
. -
Set
[[active encoder config]]
tonull
. -
Set
[[active output config]]
tonull
. -
Signal
[[codec implementation]]
to cease producing output for the previous configuration. -
Remove all control messages from the
[[control message queue]]
. -
If
[[encodeQueueSize]]
is greater than zero:-
Set
[[encodeQueueSize]]
to zero. -
Run the Schedule Dequeue Event algorithm.
-
-
For each promise in
[[pending flush promises]]
:-
Reject promise with exception .
-
Remove promise from
[[pending flush promises]]
.
-
-
- Close AudioEncoder (with exception )
-
Run these steps:
-
Run the Reset AudioEncoder algorithm with exception .
-
Set
[[state]]
to"closed"
. -
Clear
[[codec implementation]]
and release associated system resources . -
If exception is not an
AbortError
DOMException
, invoke the[[error callback]]
with exception .
-
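The Schedule Dequeue Event steps coalesce several queue-size decreases into a single `dequeue` notification. The sketch below models that behaviour in TypeScript under stated assumptions: `DequeueScheduler` is a hypothetical name, the `tasks` array stands in for the event loop's task queue, and event delivery is represented by a counter rather than a real event dispatch.

```typescript
class DequeueScheduler {
  scheduled = false;             // models [[dequeue event scheduled]]
  tasks: Array<() => void> = []; // stand-in for the event loop's task queue
  fired = 0;                     // number of dequeue notifications delivered

  // Models Schedule Dequeue Event: at most one notification is pending.
  schedule(): void {
    if (this.scheduled) return;
    this.scheduled = true;
    this.tasks.push(() => {
      this.fired++;              // stands in for delivering "dequeue"
      this.scheduled = false;
    });
  }

  // Runs every queued task, as the event loop eventually would.
  runTasks(): void {
    let task = this.tasks.shift();
    while (task !== undefined) {
      task();
      task = this.tasks.shift();
    }
  }
}
```

Calling `schedule()` three times before the task runs yields one notification, which is why a `dequeue` handler is expected to read `encodeQueueSize` rather than count events.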
5.7. EncodedAudioChunkMetadata
The following metadata dictionary is emitted by the
EncodedAudioChunkOutputCallback
alongside an associated
EncodedAudioChunk
.
dictionary EncodedAudioChunkMetadata {
  AudioDecoderConfig decoderConfig;
};
-
decoderConfig
, of type AudioDecoderConfig -
An
AudioDecoderConfig
that authors MAY use to decode the associatedEncodedAudioChunk
.
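Because the Output EncodedAudioChunks steps attach decoderConfig only when the output configuration differs from [[active output config]], a consumer sees it on the first chunk and again after any change. The TypeScript sketch below models that rule under stated assumptions: `ModelAudioDecoderConfig`, `ModelAudioMetadata`, and `MetadataEmitter` are hypothetical names, and the comparison ignores `description` for brevity.

```typescript
interface ModelAudioDecoderConfig {
  codec: string;
  sampleRate: number;
  numberOfChannels: number;
}

interface ModelAudioMetadata {
  decoderConfig?: ModelAudioDecoderConfig;
}

// Simplified "equal dictionaries" check over the three modelled members.
function sameAudioConfig(
  a: ModelAudioDecoderConfig | null,
  b: ModelAudioDecoderConfig,
): boolean {
  return (
    a !== null &&
    a.codec === b.codec &&
    a.sampleRate === b.sampleRate &&
    a.numberOfChannels === b.numberOfChannels
  );
}

class MetadataEmitter {
  active: ModelAudioDecoderConfig | null = null; // models [[active output config]]

  // Returns the metadata that would accompany the next chunk.
  next(outputConfig: ModelAudioDecoderConfig): ModelAudioMetadata {
    if (sameAudioConfig(this.active, outputConfig)) return {};
    this.active = outputConfig;
    return { decoderConfig: outputConfig };
  }
}
```

A consumer following this pattern would reconfigure its decoder whenever `decoderConfig` is present and otherwise keep using the previous configuration.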
6. VideoEncoder Interface
[Exposed=(Window,DedicatedWorker), SecureContext]
interface VideoEncoder : EventTarget {
  constructor(VideoEncoderInit init);

  readonly attribute CodecState state;
  readonly attribute unsigned long encodeQueueSize;
  attribute EventHandler ondequeue;

  undefined configure(VideoEncoderConfig config);
  undefined encode(VideoFrame frame, optional VideoEncoderEncodeOptions options = {});
  Promise<undefined> flush();
  undefined reset();
  undefined close();

  static Promise<VideoEncoderSupport> isConfigSupported(VideoEncoderConfig config);
};

dictionary VideoEncoderInit {
  required EncodedVideoChunkOutputCallback output;
  required WebCodecsErrorCallback error;
};

callback EncodedVideoChunkOutputCallback =
    undefined (EncodedVideoChunk chunk,
               optional EncodedVideoChunkMetadata metadata = {});
6.1. Internal Slots
-
[[control message queue]]
-
A queue of control messages to be performed upon this codec instance. See [[control message queue]] .
-
[[message queue blocked]]
-
A boolean indicating when processing the
[[control message queue]]
is blocked by a pending control message . See [[message queue blocked]] . -
[[codec implementation]]
-
Underlying encoder implementation provided by the User Agent. See [[codec implementation]] .
-
[[codec work queue]]
-
A parallel queue used for running parallel steps that reference the
[[codec implementation]]
. See [[codec work queue]] . -
[[codec saturated]]
-
A boolean indicating when the
[[codec implementation]]
is unable to accept additional encoding work. -
[[output callback]]
-
Callback given at construction for encoded outputs.
-
[[error callback]]
-
Callback given at construction for encode errors.
-
[[active encoder config]]
-
The
VideoEncoderConfig
that is actively applied. -
[[active output config]]
-
The
VideoDecoderConfig
that describes how to decode the most recently emittedEncodedVideoChunk
. -
[[state]]
-
The current
CodecState
of thisVideoEncoder
. -
[[encodeQueueSize]]
-
The number of pending encode requests. This number will decrease as the underlying codec is ready to accept new input.
-
[[pending flush promises]]
-
A list of unresolved promises returned by calls to
flush()
. -
[[dequeue event scheduled]]
-
A boolean indicating whether a
dequeue
event is already scheduled to fire. Used to avoid event spam. -
[[active orientation]]
An integer and boolean pair indicating the
[[rotation]]
and [[flip]]
of the firstVideoFrame
given toencode()
afterconfigure()
.
6.2. Constructors
VideoEncoder(init)
-
Let e be a new
VideoEncoder
object. -
Assign a new queue to
[[control message queue]]
. -
Assign
false
to[[message queue blocked]]
. -
Assign
null
to[[codec implementation]]
. -
Assign the result of starting a new parallel queue to
[[codec work queue]]
. -
Assign
false
to[[codec saturated]]
. -
Assign init.output to
[[output callback]]
. -
Assign init.error to
[[error callback]]
. -
Assign
null
to[[active encoder config]]
. -
Assign
null
to[[active output config]]
. -
Assign
"unconfigured"
to[[state]]
. -
Assign
0
to[[encodeQueueSize]]
. -
Assign a new list to
[[pending flush promises]]
. -
Assign
false
to[[dequeue event scheduled]]
. -
Return e.
6.3. Attributes
-
state
, of type CodecState , readonly -
Returns the value of
[[state]]
. -
encodeQueueSize
, of type unsigned long , readonly -
Returns the value of
[[encodeQueueSize]]
. -
ondequeue
, of type EventHandler -
An event handler IDL attribute whose event handler event type is
dequeue
.
6.4. Event Summary
-
dequeue
-
Fired at the
VideoEncoder
when theencodeQueueSize
has decreased.
6.5. Methods
-
configure(config)
-
Enqueues a control message to configure the video encoder for encoding video frames as described by config.
NOTE: This method will trigger a
NotSupportedError
if the User Agent does not support config . Authors are encouraged to first check support by calling isConfigSupported()
with config . User Agents don’t have to support any particular codec type or configuration.
When invoked, run these steps:
-
If config is not a valid VideoEncoderConfig , throw a
TypeError
. -
If
[[state]]
is"closed"
, throw anInvalidStateError
. -
Set
[[state]]
to"configured"
. -
Set
[[active orientation]]
to null
. -
Queue a control message to configure the encoder using config .
Running a control message to configure the encoder means performing these steps:
-
Assign
true
to[[message queue blocked]]
. -
Enqueue the following steps to
[[codec work queue]]
:-
Let supported be the result of running the Check Configuration Support algorithm with config .
-
If supported is
false
, queue a task to run the Close VideoEncoder algorithm withNotSupportedError
and abort these steps. -
If needed, assign
[[codec implementation]]
with an implementation supporting config . -
Configure
[[codec implementation]]
with config . -
Queue a task to run the following steps:
-
Assign
false
to[[message queue blocked]]
.
-
-
-
Return
"processed"
.
-
-
encode( frame , options )
-
Enqueues a control message to encode the given frame.
When invoked, run these steps:
-
If the value of frame ’s
[[Detached]]
internal slot istrue
, throw aTypeError
. -
If
[[state]]
is not"configured"
, throw anInvalidStateError
. -
If
[[active orientation]]
is notnull
and does not match frame ’s [[rotation]]
and [[flip]]
, throw a DataError
. -
If
[[active orientation]]
is null
, set it to frame ’s [[rotation]]
and [[flip]]
. -
Let frameClone hold the result of running the Clone VideoFrame algorithm with frame .
-
Increment
[[encodeQueueSize]]
. -
Queue a control message to encode frameClone .
Running a control message to encode the frame means performing these steps:
-
If
[[codec saturated]]
equalstrue
, return"not processed"
. -
If encoding frame will cause the
[[codec implementation]]
to become saturated , assigntrue
to[[codec saturated]]
. -
Decrement
[[encodeQueueSize]]
and run the Schedule Dequeue Event algorithm. -
Enqueue the following steps to the
[[codec work queue]]
:-
Attempt to use
[[codec implementation]]
to encode the frameClone according to options . -
If encoding results in an error, queue a task to run the Close VideoEncoder algorithm with
EncodingError
and return. -
If
[[codec saturated]]
equalstrue
and[[codec implementation]]
is no longer saturated , queue a task to perform the following steps:-
Assign
false
to[[codec saturated]]
.
-
-
Let encoded outputs be a list of encoded video data outputs emitted by
[[codec implementation]]
. -
If encoded outputs is not empty, queue a task to run the Output EncodedVideoChunks algorithm with encoded outputs .
-
-
Return
"processed"
.
-
-
flush()
-
Completes all control messages in the control message queue and emits all outputs.
When invoked, run these steps:
-
If
[[state]]
is not"configured"
, return a promise rejected withInvalidStateError
DOMException
. -
Let promise be a new Promise.
-
Append promise to
[[pending flush promises]]
. -
Queue a control message to flush the codec with promise .
-
Return promise .
Running a control message to flush the codec means performing these steps with promise :
-
Enqueue the following steps to the
[[codec work queue]]
:-
Signal
[[codec implementation]]
to emit all internal pending outputs . -
Let encoded outputs be a list of encoded video data outputs emitted by
[[codec implementation]]
. -
Queue a task to perform these steps:
-
If encoded outputs is not empty, run the Output EncodedVideoChunks algorithm with encoded outputs .
-
Remove promise from
[[pending flush promises]]
. -
Resolve promise .
-
-
-
Return
"processed"
.
-
-
reset()
-
Immediately resets all state including configuration, control messages in the control message queue, and all pending callbacks.
When invoked, run the Reset VideoEncoder algorithm with an
AbortError
DOMException
. -
close()
-
Immediately aborts all pending work and releases system resources. Close is final.
When invoked, run the Close VideoEncoder algorithm with an
AbortError
DOMException
. -
isConfigSupported(config)
-
Returns a promise indicating whether the provided config is supported by the User Agent.
NOTE: The returned
VideoEncoderSupport
config
will contain only the dictionary members that the User Agent recognized. Unrecognized dictionary members will be ignored. Authors can detect unrecognized dictionary members by comparing config
to their provided config .
When invoked, run these steps:
-
If config is not a valid VideoEncoderConfig , return a promise rejected with
TypeError
. -
Let p be a new Promise.
-
Let checkSupportQueue be the result of starting a new parallel queue .
-
Enqueue the following steps to checkSupportQueue :
-
Let supported be the result of running the Check Configuration Support algorithm with config .
-
Queue a task to run the following steps:
-
Let encoderSupport be a newly constructed
VideoEncoderSupport
, initialized as follows:-
Set
config
to the result of running the Clone Configuration algorithm with config . -
Set
supported
to supported .
-
-
-
Resolve p with encoderSupport .
-
-
Return p .
-
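The orientation check in encode() fixes the rotation and flip of a configuration at the first frame and rejects frames that differ afterwards. A minimal TypeScript model of that rule follows; `ModelOrientation` and `OrientationGuard` are hypothetical names, and a plain `Error` whose `name` is set to "DataError" stands in for the exception described in the steps.

```typescript
interface ModelOrientation {
  rotation: number;
  flip: boolean;
}

class OrientationGuard {
  active: ModelOrientation | null = null; // models [[active orientation]]

  // Models the configure() step that clears the remembered orientation.
  configure(): void {
    this.active = null;
  }

  // Models the encode() checks: throw on mismatch, otherwise remember the first frame.
  check(frame: ModelOrientation): void {
    const current = this.active;
    if (current === null) {
      this.active = { rotation: frame.rotation, flip: frame.flip };
      return;
    }
    if (current.rotation !== frame.rotation || current.flip !== frame.flip) {
      const err = new Error("frame orientation differs from the first frame");
      err.name = "DataError";
      throw err;
    }
  }
}
```

In this model an application that needs to change orientation mid-stream calls `configure()` again before submitting the differently oriented frame.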
6.6. Algorithms
- Schedule Dequeue Event
-
-
If
[[dequeue event scheduled]]
equalstrue
, return. -
Assign
true
to[[dequeue event scheduled]]
. -
Queue a task to run the following steps:
-
Fire a simple event named
dequeue
at this
VideoEncoder
. -
Assign
false
to[[dequeue event scheduled]]
.
-
- Output EncodedVideoChunks (with outputs )
-
Run these steps:
-
For each output in outputs :
-
Let chunkInit be an
EncodedVideoChunkInit
with the following keys:-
Let
data
contain the encoded video data from output . -
Let
type
be theEncodedVideoChunkType
of output . -
Let
timestamp
be the[[timestamp]]
from theVideoFrame
associated with output . -
Let
duration
be the[[duration]]
from theVideoFrame
associated with output .
-
-
Let chunk be a new
EncodedVideoChunk
constructed with chunkInit . -
Let chunkMetadata be a new
EncodedVideoChunkMetadata
. -
Let encoderConfig be the
[[active encoder config]]
. -
Let outputConfig be a
VideoDecoderConfig
that describes output . Initialize outputConfig as follows:-
Assign
encoderConfig.codec
tooutputConfig.codec
. -
Assign
encoderConfig.width
tooutputConfig.codedWidth
. -
Assign
encoderConfig.height
tooutputConfig.codedHeight
. -
Assign
encoderConfig.displayWidth
tooutputConfig.displayAspectWidth
. -
Assign
encoderConfig.displayHeight
tooutputConfig.displayAspectHeight
. -
Assign
[[rotation]]
from theVideoFrame
associated with output tooutputConfig.rotation
. -
Assign
[[flip]]
from the VideoFrame
associated with output to outputConfig.flip
. -
Assign the remaining keys of
outputConfig
as determined by [[codec implementation]]
. The User Agent MUST ensure that the configuration is completely described such that outputConfig could be used to correctly decode output .
NOTE: The codec specific requirements for populating the
description
are described in the [WEBCODECS-CODEC-REGISTRY] .
-
-
If outputConfig and
[[active output config]]
are not equal dictionaries :-
Assign outputConfig to chunkMetadata .
decoderConfig
. -
Assign outputConfig to
[[active output config]]
.
-
-
If encoderConfig .
scalabilityMode
describes multiple temporal layers :-
Let svc be a new
SvcOutputMetadata
instance. -
Let temporal_layer_id be the zero-based index describing the temporal layer for output .
-
Assign temporal_layer_id to svc .
temporalLayerId
. -
Assign svc to chunkMetadata .
svc
.
-
-
If encoderConfig .
alpha
is set to"keep"
:-
Let alphaSideData be the encoded alpha data in output .
-
Assign alphaSideData to chunkMetadata .
alphaSideData
.
-
-
Invoke
[[output callback]]
with chunk and chunkMetadata .
-
-
- Reset VideoEncoder (with exception )
-
Run these steps:
-
If
[[state]]
is"closed"
, throw anInvalidStateError
. -
Set
[[state]]
to"unconfigured"
. -
Set
[[active encoder config]]
tonull
. -
Set
[[active output config]]
tonull
. -
Signal
[[codec implementation]]
to cease producing output for the previous configuration. -
Remove all control messages from the
[[control message queue]]
. -
If
[[encodeQueueSize]]
is greater than zero:-
Set
[[encodeQueueSize]]
to zero. -
Run the Schedule Dequeue Event algorithm.
-
-
For each promise in
[[pending flush promises]]
:-
Reject promise with exception .
-
Remove promise from
[[pending flush promises]]
.
-
-
- Close VideoEncoder (with exception )
-
Run these steps:
-
Run the Reset VideoEncoder algorithm with exception .
-
Set
[[state]]
to"closed"
. -
Clear
[[codec implementation]]
and release associated system resources . -
If exception is not an
AbortError
DOMException
, invoke the[[error callback]]
with exception .
-
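The Reset and Close algorithms share most of their work: Close runs Reset, marks the codec closed, and reports the exception to the error callback only when it is not an AbortError. The TypeScript sketch below models that relationship under stated assumptions. `ModelLifecycle` and `namedError` are hypothetical names, plain `Error` objects with a `name` stand in for DOMExceptions, and the dequeue scheduling and codec teardown steps are left out.

```typescript
type ModelState = "unconfigured" | "configured" | "closed";

// Helper for the model: an Error carrying a DOMException-like name.
function namedError(name: string, message: string): Error {
  const err = new Error(message);
  err.name = name;
  return err;
}

class ModelLifecycle {
  state: ModelState = "configured";
  encodeQueueSize = 0;
  pendingFlushRejecters: Array<(reason: Error) => void> = []; // models [[pending flush promises]]
  reportedErrors: Error[] = []; // what the model [[error callback]] received

  // Models the Reset algorithm.
  reset(exception: Error): void {
    if (this.state === "closed") {
      throw namedError("InvalidStateError", "codec is closed");
    }
    this.state = "unconfigured";
    this.encodeQueueSize = 0;
    const rejecters = this.pendingFlushRejecters;
    this.pendingFlushRejecters = [];
    for (const reject of rejecters) reject(exception);
  }

  // Models the Close algorithm.
  close(exception: Error): void {
    this.reset(exception);
    this.state = "closed";
    if (exception.name !== "AbortError") this.reportedErrors.push(exception);
  }
}
```

The model makes one consequence visible: an author-initiated close() rejects pending flushes but never reaches the error callback, whereas a codec failure does both.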
6.7. EncodedVideoChunkMetadata
The following metadata dictionary is emitted by the
EncodedVideoChunkOutputCallback
alongside an associated
EncodedVideoChunk
.
dictionary EncodedVideoChunkMetadata {
  VideoDecoderConfig decoderConfig;
  SvcOutputMetadata svc;
  BufferSource alphaSideData;
};

dictionary SvcOutputMetadata {
  unsigned long temporalLayerId;
};
-
decoderConfig
, of type VideoDecoderConfig -
A
VideoDecoderConfig
that authors MAY use to decode the associatedEncodedVideoChunk
. -
svc
, of type SvcOutputMetadata -
A collection of metadata describing this
EncodedVideoChunk
with respect to the configuredscalabilityMode
. -
alphaSideData
, of type BufferSource -
A
BufferSource
that contains theEncodedVideoChunk
’s extra alpha channel data. -
temporalLayerId
, of type unsigned long -
A number that identifies the temporal layer for the associated
EncodedVideoChunk
.
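One use of temporalLayerId is to forward only the lower temporal layers to a receiver that wants a reduced frame rate. The TypeScript sketch below illustrates the idea on plain objects. `ModelChunk`, `ModelChunkMetadata`, and `selectTemporalLayers` are hypothetical names, and treating a chunk without `svc` metadata as belonging to layer 0 is an assumption of this sketch rather than a rule stated in this section.

```typescript
interface ModelSvcMetadata {
  temporalLayerId?: number;
}

interface ModelChunkMetadata {
  svc?: ModelSvcMetadata;
}

interface ModelChunk {
  timestamp: number;
  metadata: ModelChunkMetadata;
}

// Keeps the chunks whose temporal layer is at or below maxLayer.
function selectTemporalLayers(chunks: ModelChunk[], maxLayer: number): ModelChunk[] {
  return chunks.filter((chunk) => {
    const svc = chunk.metadata.svc;
    // Assumption: missing layer information is treated as the base layer.
    const layer =
      svc !== undefined && svc.temporalLayerId !== undefined ? svc.temporalLayerId : 0;
    return layer <= maxLayer;
  });
}
```

For a stream that alternates between layer 0 and layer 1, selecting `maxLayer = 0` keeps every other chunk and so halves the frame rate.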
7. Configurations
7.1. Check Configuration Support (with config )
Run these steps:
-
If the codec string in config .codec is not a valid codec string or is otherwise unrecognized by the User Agent, return
false
. -
If config is an
AudioDecoderConfig
orVideoDecoderConfig
and the User Agent can’t provide a codec that can decode the exact profile (where present), level (where present), and constraint bits (where present) indicated by the codec string in config .codec, return false
. -
If config is an
AudioEncoderConfig
orVideoEncoderConfig
:-
If the codec string in config .codec contains a profile and the User Agent can’t provide a codec that can encode the exact profile indicated by config .codec, return
false
. -
If the codec string in config .codec contains a level and the User Agent can’t provide a codec that can encode to a level less than or equal to the level indicated by config .codec, return
false
. -
If the codec string in config .codec contains constraint bits and the User Agent can’t provide a codec that can produce an encoded bitstream at least as constrained as indicated by config .codec, return
false
.
-
-
If the User Agent can provide a codec to support all entries of the config , including applicable default values for keys that are not included, return
true
.
NOTE: The types
AudioDecoderConfig
, VideoDecoderConfig
, AudioEncoderConfig
, and VideoEncoderConfig
each define their respective configuration entries and defaults.
NOTE: Support for a given configuration can change dynamically if the hardware is altered (e.g. external GPU unplugged) or if essential hardware resources are exhausted. User Agents describe support on a best-effort basis given the resources that are available at the time of the query.
-
Otherwise, return false.
7.2. Clone Configuration (with config )
NOTE: This algorithm will copy only the dictionary members that the User Agent recognizes as part of the dictionary type.
Run these steps:
-
Let dictType be the type of dictionary config .
-
Let clone be a new empty instance of dictType .
-
For each dictionary member m defined on dictType :
-
If
config[m]
is a nested dictionary, setclone[m]
to the result of recursively running the Clone Configuration algorithm withconfig[m]
. -
Otherwise, assign a copy of
config[m]
to clone[m]
. -
Return clone .
Note: This implements a "deep-copy". These configuration objects are frequently used as the input of asynchronous operations. Copying means that modifying the original object while the operation is in flight won’t change the operation’s outcome.
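The deep-copy behaviour of Clone Configuration can be sketched in TypeScript as follows. This is a model under stated assumptions, not the User Agent's implementation: `ModelSchema` is a hypothetical stand-in for the dictionary type (it lists the recognized members and marks nested dictionaries), members absent from `config` are skipped, and only `ArrayBuffer` values are given an explicit byte copy.

```typescript
// Hypothetical description of a dictionary type: each recognized member is
// either a plain value or a nested dictionary with its own schema.
interface ModelSchema {
  [member: string]: "value" | ModelSchema;
}

function cloneConfiguration(
  config: Record<string, unknown>,
  schema: ModelSchema,
): Record<string, unknown> {
  const clone: Record<string, unknown> = {};
  for (const m of Object.keys(schema)) {
    if (!Object.prototype.hasOwnProperty.call(config, m)) continue; // assumption: skip absent members
    const kind = schema[m];
    const value = config[m];
    if (kind !== undefined && kind !== "value" && typeof value === "object" && value !== null) {
      // Nested dictionary: clone recursively.
      clone[m] = cloneConfiguration(value as Record<string, unknown>, kind);
    } else if (value instanceof ArrayBuffer) {
      clone[m] = value.slice(0); // copy the bytes instead of sharing the buffer
    } else {
      clone[m] = value; // primitives are copied by value
    }
  }
  return clone;
}
```

Members that the schema does not list are never visited, which matches the note that only recognized dictionary members are copied.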
7.3. Signalling Configuration Support
7.3.1. AudioDecoderSupport
dictionary AudioDecoderSupport {
  boolean supported;
  AudioDecoderConfig config;
};
-
supported
, of type boolean -
A boolean indicating whether the corresponding
config
is supported by the User Agent. -
config
, of type AudioDecoderConfig -
An
AudioDecoderConfig
used by the User Agent in determining the value of supported
.
7.3.2. VideoDecoderSupport
dictionary VideoDecoderSupport {
  boolean supported;
  VideoDecoderConfig config;
};
-
supported
, of type boolean -
A boolean indicating whether the corresponding
config
is supported by the User Agent. -
config
, of type VideoDecoderConfig -
A
VideoDecoderConfig
used by the User Agent in determining the value of supported
.
7.3.3. AudioEncoderSupport
dictionary AudioEncoderSupport {
  boolean supported;
  AudioEncoderConfig config;
};
-
supported
, of type boolean -
A boolean indicating whether the corresponding
config
is supported by the User Agent. -
config
, of type AudioEncoderConfig -
An
AudioEncoderConfig
used by the User Agent in determining the value of supported
.
7.3.4. VideoEncoderSupport
dictionary VideoEncoderSupport {
  boolean supported;
  VideoEncoderConfig config;
};
-
supported
, of type boolean -
A boolean indicating whether the corresponding
config
is supported by the User Agent. -
config
, of type VideoEncoderConfig -
A
VideoEncoderConfig
used by the User Agent in determining the value of supported
.
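Since the `config` member of a support dictionary holds only the members the User Agent recognized, comparing it with the provided configuration reveals which members were ignored. The TypeScript sketch below shows one way to do that comparison on plain objects; `unrecognizedMembers` is a hypothetical helper, and it inspects top-level members only.

```typescript
// Lists the members present on the provided config but missing from the
// config echoed back in a support dictionary (top level only).
function unrecognizedMembers(provided: object, returned: object): string[] {
  return Object.keys(provided).filter(
    (key) => !Object.prototype.hasOwnProperty.call(returned, key),
  );
}
```

An application could log the returned list during development to notice misspelled members or options that a given User Agent does not implement.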
7.4. Codec String
A codec string describes a given codec format to be used for encoding or decoding.
A valid codec string MUST meet the following conditions.
-
Is valid per the relevant codec specification (see examples below).
-
It describes a single codec.
-
It is unambiguous about codec profile, level, and constraint bits for codecs that define these concepts.
NOTE: In other media specifications, codec strings historically accompanied a MIME type as the "codecs=" parameter ( isTypeSupported() , canPlayType() ) [RFC6381] . In this specification, encoded media is not containerized; hence, only the value of the codecs parameter is accepted.
NOTE: Encoders for codecs that define level and constraint bits have flexibility around these parameters, but won’t produce bitstreams that have a higher level or are less constrained than requested.
The format and semantics for codec strings are defined by codec registrations listed in the [WEBCODECS-CODEC-REGISTRY] . A compliant implementation MAY support any combination of codec registrations or none at all.
7.5. AudioDecoderConfig
dictionary AudioDecoderConfig {
  required DOMString codec;
  [EnforceRange] required unsigned long sampleRate;
  [EnforceRange] required unsigned long numberOfChannels;
  BufferSource description;
};
To check if an AudioDecoderConfig is a valid AudioDecoderConfig, run these steps:
-
If
codec
is empty after stripping leading and trailing ASCII whitespace , return false
. -
If
description
is [ detached ], return false. -
Return
true
.
-
codec
, of type DOMString - Contains a codec string describing the codec.
-
sampleRate
, of type unsigned long - The number of frame samples per second.
-
numberOfChannels
, of type unsigned long - The number of audio channels.
-
description
, of type BufferSource -
A sequence of codec specific bytes, commonly known as extradata.
NOTE: The registrations in the [WEBCODECS-CODEC-REGISTRY] describe whether/how to populate this sequence, corresponding to the provided
codec
.
7.6. VideoDecoderConfig
dictionary VideoDecoderConfig {
  required DOMString codec;
  AllowSharedBufferSource description;
  [EnforceRange] unsigned long codedWidth;
  [EnforceRange] unsigned long codedHeight;
  [EnforceRange] unsigned long displayAspectWidth;
  [EnforceRange] unsigned long displayAspectHeight;
  VideoColorSpaceInit colorSpace;
  HardwareAcceleration hardwareAcceleration = "no-preference";
  boolean optimizeForLatency;
  double rotation = 0;
  boolean flip = false;
};
To check if a VideoDecoderConfig is a valid VideoDecoderConfig, run these steps:
-
If
codec
is empty after stripping leading and trailing ASCII whitespace , return false
. -
If one of
codedWidth
or codedHeight
is provided but the other isn’t, return false
. -
If
codedWidth
= 0 or codedHeight
= 0, return false
. -
If one of
displayAspectWidth
or displayAspectHeight
is provided but the other isn’t, return false
. -
If
displayAspectWidth
= 0 or displayAspectHeight
= 0, return false
. -
If
description
is [ detached ], return false. -
Return
true
.
-
codec
, of type DOMString - Contains a codec string describing the codec.
-
description
, of type AllowSharedBufferSource -
A sequence of codec specific bytes, commonly known as extradata.
NOTE: The registrations in the [WEBCODECS-CODEC-REGISTRY] describe whether/how to populate this sequence, corresponding to the provided
codec
. -
codedWidth
, of type unsigned long - Width of the VideoFrame in pixels, potentially including non-visible padding, and prior to considering potential ratio adjustments.
-
codedHeight
, of type unsigned long -
Height of the VideoFrame in pixels, potentially including non-visible padding, and prior to considering potential ratio adjustments.
NOTE:
codedWidth
and codedHeight
are used when selecting a [[codec implementation]]
. -
displayAspectWidth
, of type unsigned long - Horizontal dimension of the VideoFrame’s aspect ratio when displayed.
-
displayAspectHeight
, of type unsigned long -
Vertical dimension of the VideoFrame’s aspect ratio when displayed.
NOTE:
displayWidth
and displayHeight
can both be different from displayAspectWidth
and displayAspectHeight
, but have identical ratios, after scaling is applied when creating the video frame . -
colorSpace
, of type VideoColorSpaceInit -
Configures the VideoFrame.colorSpace for VideoFrames associated with this VideoDecoderConfig. If colorSpace exists, the provided values will override any in-band values from the bitstream. -
hardwareAcceleration
, of type HardwareAcceleration , defaulting to"no-preference"
-
Hint that configures hardware acceleration for this codec. See HardwareAcceleration. -
optimizeForLatency
, of type boolean -
Hint that the selected decoder SHOULD be configured to minimize the number of EncodedVideoChunks that have to be decoded before a VideoFrame is output.
NOTE: In addition to User Agent and hardware limitations, some codec bitstreams require a minimum number of inputs before any output can be produced.
-
rotation
, of type double , defaulting to0
-
Sets
the
rotation
attribute on decoded frames. -
flip
, of type boolean , defaulting tofalse
-
Sets
the
flip
attribute on decoded frames.
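The validity steps for a VideoDecoderConfig translate directly into a predicate. The TypeScript sketch below mirrors them on a plain object; `ModelVideoDecoderConfig` and `isValidVideoDecoderConfig` are hypothetical names, only the members the steps inspect are modelled, and the detached `description` check is omitted because it depends on buffer state that a plain object cannot represent.

```typescript
interface ModelVideoDecoderConfig {
  codec: string;
  codedWidth?: number;
  codedHeight?: number;
  displayAspectWidth?: number;
  displayAspectHeight?: number;
}

// Strips leading and trailing ASCII whitespace (tab, LF, FF, CR, space).
function stripAsciiWhitespace(s: string): string {
  return s.replace(/^[\t\n\f\r ]+|[\t\n\f\r ]+$/g, "");
}

function isValidVideoDecoderConfig(c: ModelVideoDecoderConfig): boolean {
  if (stripAsciiWhitespace(c.codec) === "") return false;
  // Coded dimensions must be provided together and be non-zero.
  if ((c.codedWidth === undefined) !== (c.codedHeight === undefined)) return false;
  if (c.codedWidth === 0 || c.codedHeight === 0) return false;
  // The same pairing rule applies to the display aspect dimensions.
  if ((c.displayAspectWidth === undefined) !== (c.displayAspectHeight === undefined)) return false;
  if (c.displayAspectWidth === 0 || c.displayAspectHeight === 0) return false;
  return true;
}
```

Validity in this sense is only a structural check. Whether the User Agent can actually decode with the configuration is decided separately by the Check Configuration Support algorithm.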
7.7. AudioEncoderConfig
dictionary AudioEncoderConfig {
  required DOMString codec;
  [EnforceRange] required unsigned long sampleRate;
  [EnforceRange] required unsigned long numberOfChannels;
  [EnforceRange] unsigned long long bitrate;
  BitrateMode bitrateMode = "variable";
};
NOTE: Codec-specific extensions to AudioEncoderConfig are described in their registrations in the [WEBCODECS-CODEC-REGISTRY].
To check if an AudioEncoderConfig is a valid AudioEncoderConfig, run these steps:
-
If
codec
is empty after stripping leading and trailing ASCII whitespace , return false
. -
If the
AudioEncoderConfig
has a codec-specific extension and the corresponding registration in the [WEBCODECS-CODEC-REGISTRY] defines steps to check whether the extension is a valid extension, return the result of running those steps. -
If
sampleRate
or numberOfChannels
is equal to zero, return false
. -
Return
true
.
-
codec
, of type DOMString - Contains a codec string describing the codec.
-
sampleRate
, of type unsigned long - The number of frame samples per second.
-
numberOfChannels
, of type unsigned long - The number of audio channels.
-
bitrate
, of type unsigned long long - The average bitrate of the encoded audio given in units of bits per second.
-
bitrateMode
, of type BitrateMode , defaulting to"variable"
-
Configures the encoder to use a
constant
or variable
bitrate as defined by [MEDIASTREAM-RECORDING] .
NOTE: Not all audio codecs support specific
BitrateMode
s. Authors are encouraged to check by calling isConfigSupported()
with config .
7.8. VideoEncoderConfig
dictionary VideoEncoderConfig {
  required DOMString codec;
  [EnforceRange] required unsigned long width;
  [EnforceRange] required unsigned long height;
  [EnforceRange] unsigned long displayWidth;
  [EnforceRange] unsigned long displayHeight;
  [EnforceRange] unsigned long long bitrate;
  double framerate;
  HardwareAcceleration hardwareAcceleration = "no-preference";
  AlphaOption alpha = "discard";
  DOMString scalabilityMode;
  VideoEncoderBitrateMode bitrateMode = "variable";
  LatencyMode latencyMode = "quality";
  DOMString contentHint;
};
NOTE: Codec-specific extensions to VideoEncoderConfig are described in their registrations in the [WEBCODECS-CODEC-REGISTRY].
To check if a VideoEncoderConfig is a valid VideoEncoderConfig, run these steps:
-
If
codec
is empty after stripping leading and trailing ASCII whitespace , return false
. -
If
displayWidth
= 0 or displayHeight
= 0, return false
. -
Return
true
.
-
codec
, of type DOMString - Contains a codec string describing the codec.
-
width
, of type unsigned long
-
The encoded width of output EncodedVideoChunks in pixels, prior to any display aspect ratio adjustments.
The encoder MUST scale any VideoFrame whose [[visible width]] differs from this value.
-
height
, of type unsigned long
-
The encoded height of output EncodedVideoChunks in pixels, prior to any display aspect ratio adjustments.
The encoder MUST scale any VideoFrame whose [[visible height]] differs from this value.
-
displayWidth
, of type unsigned long
-
The intended display width of output EncodedVideoChunks in pixels. Defaults to width if not present.
-
displayHeight
, of type unsigned long
-
The intended display height of output EncodedVideoChunks in pixels. Defaults to height if not present.
A displayWidth or displayHeight that differs from width and height signals that chunks are to be scaled after decoding to arrive at the final display aspect ratio.
For many codecs this is merely pass-through information, but some codecs can sometimes include display sizing in the bitstream.
-
bitrate
, of type unsigned long long
-
The average bitrate of the encoded video given in units of bits per second.
NOTE: Authors are encouraged to additionally provide a framerate to inform rate control.
-
framerate
, of type double
-
The expected frame rate in frames per second, if known. This value, along with the frame timestamp, SHOULD be used by the video encoder to calculate the optimal byte length for each encoded frame. Additionally, the value SHOULD be considered a target deadline for outputting encoding chunks when latencyMode is set to realtime.
-
hardwareAcceleration
, of type HardwareAcceleration, defaulting to "no-preference"
-
Hint that configures hardware acceleration for this codec. See HardwareAcceleration.
-
alpha
, of type AlphaOption, defaulting to "discard"
-
Whether the alpha component of the VideoFrame inputs SHOULD be kept or discarded prior to encoding. If alpha is equal to discard, alpha data is always discarded, regardless of a VideoFrame’s [[format]].
-
scalabilityMode
, of type DOMString - An encoding scalability mode identifier as defined by [WebRTC-SVC] .
-
bitrateMode
, of type VideoEncoderBitrateMode, defaulting to "variable"
-
Configures encoding to use one of the rate control modes specified by VideoEncoderBitrateMode.
NOTE: The precise degree of bitrate fluctuation in either mode is implementation defined.
-
latencyMode
, of type LatencyMode, defaulting to "quality"
-
Configures latency related behaviors for this codec. See LatencyMode.
-
contentHint
, of type DOMString
-
An encoding video content hint as defined by [mst-content-hint].
The User Agent MAY use this hint to set expectations about incoming VideoFrames and to improve encoding quality. If using this hint:
-
The User Agent MUST respect other explicitly set encoding options when configuring the encoder, whether they are codec-specific encoding options or not.
-
The User Agent SHOULD make a best-effort attempt to use additional configuration options to improve encoding quality, according to the goals defined by the corresponding video content hint.
NOTE: Some encoder options are implementation specific, and mappings between contentHint and those options cannot be prescribed.
The User Agent MUST NOT refuse the configuration if it doesn’t support this content hint. See isConfigSupported().
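To illustrate how bitrate and framerate interact, the following is a rough sketch of the per-frame byte budget and the per-frame interval that a rate controller could derive from them. The formulas and function names are illustrative assumptions; this specification does not prescribe how an encoder computes either value.

```typescript
// Average number of bytes available per encoded frame (assumed formula).
function averageBytesPerFrameSketch(bitrate: number, framerate: number): number {
  if (!(bitrate > 0) || !(framerate > 0)) {
    throw new RangeError("bitrate and framerate must be positive");
  }
  return bitrate / 8 / framerate; // bits per second -> bytes per frame
}

// Time between frames in microseconds; a plausible output deadline
// when latencyMode is "realtime" (assumption of this sketch).
function frameIntervalMicrosecondsSketch(framerate: number): number {
  if (!(framerate > 0)) {
    throw new RangeError("framerate must be positive");
  }
  return 1_000_000 / framerate;
}
```

For example, a bitrate of 2,000,000 bits per second at 30 frames per second leaves roughly 8,333 bytes per frame on average. Real encoders distribute that budget unevenly between key frames and delta frames.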
7.9. Hardware Acceleration
enum HardwareAcceleration {
  "no-preference",
  "prefer-hardware",
  "prefer-software",
};
When supported, hardware acceleration offloads encoding or decoding to specialized hardware. prefer-hardware and prefer-software are hints. While User Agents SHOULD respect these values when possible, User Agents may ignore these values in some or all circumstances for any reason.
To prevent fingerprinting, if a User Agent implements [media-capabilities], the User Agent MUST ensure rejection or acceptance of a given HardwareAcceleration preference reveals no additional information on top of what is inherent to the User Agent and revealed by [media-capabilities].
If a User Agent does not implement [media-capabilities] for reasons of fingerprinting, it SHOULD ignore the HardwareAcceleration preference.
A User Agent can also ignore prefer-hardware or prefer-software for reasons of user privacy, or in circumstances where the User Agent determines an alternative setting would better serve the end user.
Most authors will be best served by using the default of no-preference. This gives the User Agent flexibility to optimize based on its knowledge of the system and configuration. A common strategy will be to prioritize hardware acceleration at higher resolutions with a fallback to software codecs if hardware acceleration fails.
Authors are encouraged to carefully weigh the tradeoffs when setting a hardware acceleration preference. The precise tradeoffs will be device-specific, but authors can generally expect the following:
-
Setting a value of prefer-hardware or prefer-software can significantly restrict what configurations are supported. It can occur that the user’s device does not offer acceleration for any codec, or only for the most common profiles of older codecs. It can also occur that a given User Agent lacks a software based codec implementation.
-
Hardware acceleration does not simply imply faster encoding / decoding. Hardware acceleration often has higher startup latency but more consistent throughput performance. Acceleration will generally reduce CPU load.
-
For decoding, hardware acceleration is often less robust to inputs that are mislabeled or violate the relevant codec specification.
-
Hardware acceleration will often be more power efficient than purely software based codecs.
-
For lower resolution content, the overhead added by hardware acceleration can yield decreased performance and power efficiency compared to purely software based codecs.
Given these tradeoffs, a good example of using "prefer-hardware" would be if an author intends to provide their own software based fallback via WebAssembly.
Alternatively, a good example of using "prefer-software" would be if an author is especially sensitive to the higher startup latency or decreased robustness generally associated with hardware acceleration.
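As a sketch of the fallback strategy described above, the helper below picks the first supported preference from an author-chosen order. It assumes the author has already probed each preference (for example with isConfigSupported()) and collected the results into a plain object; the type and function names are hypothetical.

```typescript
type AccelerationPreferenceSketch = "no-preference" | "prefer-hardware" | "prefer-software";

// Probe results per preference, gathered by the author beforehand (assumption).
type SupportResultsSketch = { [P in AccelerationPreferenceSketch]: boolean };

function pickAccelerationSketch(
  supported: SupportResultsSketch,
  order: AccelerationPreferenceSketch[],
): AccelerationPreferenceSketch | null {
  // Walk the author's preference order and take the first supported entry.
  for (const preference of order) {
    if (supported[preference]) {
      return preference;
    }
  }
  // Nothing in the list is supported: the author falls back to their own
  // implementation, such as a WebAssembly codec.
  return null;
}
```

An author who ships a WebAssembly fallback might pass only "prefer-hardware" in the order and treat a null result as the signal to use the fallback.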
-
no-preference
- Indicates that the User Agent MAY use hardware acceleration if it is available and compatible with other aspects of the codec configuration.
-
prefer-software
-
Indicates that the User Agent SHOULD prefer a software codec implementation. User Agents may ignore this value for any reason.
NOTE: This can cause the configuration to be unsupported on platforms where an unaccelerated codec is unavailable or is incompatible with other aspects of the codec configuration.
-
prefer-hardware
-
Indicates that the User Agent SHOULD prefer hardware acceleration. User Agents may ignore this value for any reason.
NOTE: This can cause the configuration to be unsupported on platforms where an accelerated codec is unavailable or is incompatible with other aspects of the codec configuration.
7.10. Alpha Option
enum AlphaOption {
  "keep",
  "discard",
};
Describes how the user agent SHOULD behave when dealing with alpha channels, for a variety of different operations.
-
keep
-
Indicates that the user agent SHOULD preserve alpha channel data for VideoFrames, if it is present.
-
discard
-
Indicates that the user agent SHOULD ignore or remove VideoFrame’s alpha channel data.
7.11. Latency Mode
enum LatencyMode {
  "quality",
  "realtime"
};
-
quality
-
Indicates that the User Agent SHOULD optimize for encoding quality. In this mode:
-
realtime
-
Indicates that the User Agent SHOULD optimize for low latency. In this mode:
7.12. Configuration Equivalence
Two dictionaries are equal dictionaries if they contain the same keys and values. For nested dictionaries, apply this definition recursively.
7.13. VideoEncoderEncodeOptions
dictionary VideoEncoderEncodeOptions {
  boolean keyFrame = false;
};
NOTE: Codec-specific extensions to VideoEncoderEncodeOptions are described in their registrations in the [WEBCODECS-CODEC-REGISTRY].
-
keyFrame
, of type boolean, defaulting to false
-
A value of true indicates that the given frame MUST be encoded as a key frame. A value of false indicates that the User Agent has flexibility to decide whether the frame will be encoded as a key frame.
7.14. VideoEncoderBitrateMode
enum VideoEncoderBitrateMode {
  "constant",
  "variable",
  "quantizer"
};
-
constant
-
Encode at a constant bitrate. See bitrate.
-
variable
-
Encode using a variable bitrate, allowing more space to be used for complex signals and less space for less complex signals. See bitrate.
-
quantizer
-
Encode using a quantizer that is specified for each video frame in codec specific extensions of VideoEncoderEncodeOptions.
7.15. CodecState
enum CodecState {
  "unconfigured",
  "configured",
  "closed"
};
-
unconfigured
- The codec is not configured for encoding or decoding.
-
configured
- A valid configuration has been provided. The codec is ready for encoding or decoding.
-
closed
- The codec is no longer usable and underlying system resources have been released.
7.16. WebCodecsErrorCallback
callback WebCodecsErrorCallback = undefined(DOMException error);
8. Encoded Media Interfaces (Chunks)
These interfaces represent chunks of encoded media.
8.1. EncodedAudioChunk Interface
[Exposed=(Window,DedicatedWorker), Serializable]
interface EncodedAudioChunk {
  constructor(EncodedAudioChunkInit init);
  readonly attribute EncodedAudioChunkType type;
  readonly attribute long long timestamp;           // microseconds
  readonly attribute unsigned long long? duration;  // microseconds
  readonly attribute unsigned long byteLength;

  undefined copyTo(AllowSharedBufferSource destination);
};

dictionary EncodedAudioChunkInit {
  required EncodedAudioChunkType type;
  [EnforceRange] required long long timestamp;    // microseconds
  [EnforceRange] unsigned long long duration;     // microseconds
  required AllowSharedBufferSource data;
  sequence<ArrayBuffer> transfer = [];
};

enum EncodedAudioChunkType {
  "key",
  "delta",
};
8.1.1. Internal Slots
-
[[internal data]]
-
An array of bytes representing the encoded chunk data.
-
[[type]]
-
Describes whether the chunk is a key chunk .
-
[[timestamp]]
-
The presentation timestamp, given in microseconds.
-
[[duration]]
-
The presentation duration, given in microseconds.
-
[[byte length]]
-
The byte length of
[[internal data]]
.
8.1.2. Constructors
EncodedAudioChunk(init)
-
If init .
transfer
contains more than one reference to the sameArrayBuffer
, then throw aDataCloneError
DOMException
. -
For each transferable in init .
transfer
:-
If
[[Detached]]
internal slot istrue
, then throw aDataCloneError
DOMException
.
-
-
Let chunk be a new
EncodedAudioChunk
object, initialized as follows-
Assign
init.type
to[[type]]
. -
Assign
init.timestamp
to[[timestamp]]
. -
If
init.duration
exists, assign it to[[duration]]
, or assignnull
otherwise. -
Assign
init.data.byteLength
to[[byte length]]
; -
If init .
transfer
contains anArrayBuffer
referenced by init .data
the User Agent MAY choose to:-
Let resource be a new media resource referencing sample data in init .
data
.
-
-
Otherwise:
-
Assign a copy of init .
data
to[[internal data]]
.
-
-
-
For each transferable in init .
transfer
:-
Perform DetachArrayBuffer on transferable
-
-
Return chunk .
8.1.3. Attributes
-
type
, of type EncodedAudioChunkType , readonly -
Returns the value of
[[type]]
. -
timestamp
, of type long long , readonly -
Returns the value of
[[timestamp]]
. -
duration
, of type unsigned long long , readonly, nullable -
Returns the value of
[[duration]]
. -
byteLength
, of type unsigned long , readonly -
Returns the value of
[[byte length]]
.
8.1.4. Methods
-
copyTo(destination)
-
When invoked, run these steps:
-
If the [[byte length]] of this EncodedAudioChunk is greater than the byte length of destination, throw a TypeError.
-
Copy the
[[internal data]]
into destination .
-
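The two copyTo() steps can be sketched over plain typed arrays. This is a minimal sketch, assuming the chunk's [[internal data]] is available as a `Uint8Array`; the function name is hypothetical and SharedArrayBuffer destinations are left out for brevity.

```typescript
function copyChunkBytesSketch(
  internalData: Uint8Array,
  destination: ArrayBuffer | ArrayBufferView,
): void {
  // Step 1: the destination must be able to hold the whole chunk.
  if (internalData.byteLength > destination.byteLength) {
    throw new TypeError("destination is smaller than the chunk");
  }
  // Step 2: copy the bytes, viewing the destination as raw bytes.
  const bytes = ArrayBuffer.isView(destination)
    ? new Uint8Array(destination.buffer, destination.byteOffset, destination.byteLength)
    : new Uint8Array(destination);
  bytes.set(internalData);
}
```

Authors would typically size the destination from byteLength before calling copyTo(), so that the size check never fails.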
8.1.5. Serialization
-
The
EncodedAudioChunk
serialization steps (with value , serialized , and forStorage ) are: -
-
If forStorage is
true
, throw aDataCloneError
. -
For each
EncodedAudioChunk
internal slot in value , assign the value of each internal slot to a field in serialized with the same name as the internal slot.
-
-
The
EncodedAudioChunk
deserialization steps (with serialized and value ) are: -
-
For all named fields in serialized , assign the value of each named field to the
EncodedAudioChunk
internal slot in value with the same name as the named field.
-
NOTE: Since EncodedAudioChunks are immutable, User Agents can choose to implement serialization using a reference counting model similar to § 9.2.6 Transfer and Serialization.
8.2. EncodedVideoChunk Interface
[Exposed=(Window,DedicatedWorker), Serializable]
interface EncodedVideoChunk {
  constructor(EncodedVideoChunkInit init);
  readonly attribute EncodedVideoChunkType type;
  readonly attribute long long timestamp;           // microseconds
  readonly attribute unsigned long long? duration;  // microseconds
  readonly attribute unsigned long byteLength;

  undefined copyTo(AllowSharedBufferSource destination);
};

dictionary EncodedVideoChunkInit {
  required EncodedVideoChunkType type;
  [EnforceRange] required long long timestamp;    // microseconds
  [EnforceRange] unsigned long long duration;     // microseconds
  required AllowSharedBufferSource data;
  sequence<ArrayBuffer> transfer = [];
};

enum EncodedVideoChunkType {
  "key",
  "delta",
};
8.2.1. Internal Slots
-
[[internal data]]
-
An array of bytes representing the encoded chunk data.
-
[[type]]
-
The EncodedVideoChunkType of this EncodedVideoChunk.
-
[[timestamp]]
-
The presentation timestamp, given in microseconds.
-
[[duration]]
-
The presentation duration, given in microseconds.
-
[[byte length]]
-
The byte length of
[[internal data]]
.
8.2.2. Constructors
EncodedVideoChunk(init)
-
If init .
transfer
contains more than one reference to the sameArrayBuffer
, then throw aDataCloneError
DOMException
. -
For each transferable in init .
transfer
:-
If
[[Detached]]
internal slot istrue
, then throw aDataCloneError
DOMException
.
-
-
Let chunk be a new
EncodedVideoChunk
object, initialized as follows-
Assign
init.type
to[[type]]
. -
Assign
init.timestamp
to[[timestamp]]
. -
If duration is present in init, assign
init.duration
to[[duration]]
. Otherwise, assignnull
to[[duration]]
. -
Assign
init.data.byteLength
to[[byte length]]
; -
If init .
transfer
contains anArrayBuffer
referenced by init .data
the User Agent MAY choose to:-
Let resource be a new media resource referencing sample data in init .
data
.
-
-
Otherwise:
-
Assign a copy of init .
data
to[[internal data]]
.
-
-
-
For each transferable in init .
transfer
:-
Perform DetachArrayBuffer on transferable
-
-
Return chunk .
8.2.3. Attributes
-
type
, of type EncodedVideoChunkType , readonly -
Returns the value of
[[type]]
. -
timestamp
, of type long long , readonly -
Returns the value of
[[timestamp]]
. -
duration
, of type unsigned long long , readonly, nullable -
Returns the value of
[[duration]]
. -
byteLength
, of type unsigned long , readonly -
Returns the value of
[[byte length]]
.
8.2.4. Methods
-
copyTo(destination)
-
When invoked, run these steps:
-
If
[[byte length]]
is greater than the[[byte length]]
of destination , throw aTypeError
. -
Copy the
[[internal data]]
into destination .
-
8.2.5. Serialization
-
The
EncodedVideoChunk
serialization steps (with value , serialized , and forStorage ) are: -
-
If forStorage is
true
, throw aDataCloneError
. -
For each
EncodedVideoChunk
internal slot in value , assign the value of each internal slot to a field in serialized with the same name as the internal slot.
-
-
The
EncodedVideoChunk
deserialization steps (with serialized and value ) are: -
-
For all named fields in serialized , assign the value of each named field to the
EncodedVideoChunk
internal slot in value with the same name as the named field.
-
NOTE: Since EncodedVideoChunks are immutable, User Agents can choose to implement serialization using a reference counting model similar to § 9.4.7 Transfer and Serialization.
9. Raw Media Interfaces
These interfaces represent unencoded (raw) media.
9.1. Memory Model
9.1.1. Background
This section is non-normative.
Decoded media data MAY occupy a large amount of system memory. To minimize the need for expensive copies, this specification defines a scheme for reference counting (clone() and close()).
NOTE: Authors are encouraged to call close() immediately when frames are no longer needed.
9.1.2. Reference Counting
A media resource is storage for the actual pixel data or the audio sample data described by a VideoFrame or AudioData.
The AudioData [[resource reference]] and VideoFrame [[resource reference]] internal slots hold a reference to a media resource.
VideoFrame.clone() and AudioData.clone() return new objects whose [[resource reference]] points to the same media resource as the original object.
VideoFrame.close() and AudioData.close() will clear their [[resource reference]] slot, releasing the reference to their media resource.
A media resource MUST remain alive at least as long as it continues to be referenced by a [[resource reference]].
NOTE: When a media resource is no longer referenced by a [[resource reference]], the resource can be destroyed. User Agents are encouraged to destroy such resources quickly to reduce memory pressure and facilitate resource reuse.
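The reference counting scheme can be modelled with two small classes. This is a sketch for illustration only: `MediaResourceSketch` and `FrameHandleSketch` are hypothetical stand-ins for a media resource and for a VideoFrame or AudioData, and the explicit counter is an assumption about how a User Agent might track references.

```typescript
// Stand-in for a media resource; "destroyed" models the User Agent freeing it.
class MediaResourceSketch {
  private refCount = 0;
  private destroyed = false;

  addRef(): void {
    this.refCount += 1;
  }

  release(): void {
    this.refCount -= 1;
    // The resource can be destroyed once nothing references it.
    if (this.refCount === 0) {
      this.destroyed = true;
    }
  }

  isDestroyed(): boolean {
    return this.destroyed;
  }
}

// Stand-in for a VideoFrame or AudioData holding a [[resource reference]].
class FrameHandleSketch {
  private resource: MediaResourceSketch | null;

  constructor(resource: MediaResourceSketch) {
    resource.addRef();
    this.resource = resource;
  }

  // clone(): a new object that references the same media resource.
  clone(): FrameHandleSketch {
    if (this.resource === null) {
      throw new Error("handle is closed");
    }
    return new FrameHandleSketch(this.resource);
  }

  // close(): clear the reference; closing twice is harmless.
  close(): void {
    if (this.resource === null) {
      return;
    }
    this.resource.release();
    this.resource = null;
  }
}
```

Closing the original handle leaves the resource alive while a clone still references it; the resource is destroyed only when the last handle is closed.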
9.1.3. Transfer and Serialization
This section is non-normative.
AudioData and VideoFrame are both transferable and serializable objects. Their transfer and serialization steps are defined in § 9.2.6 Transfer and Serialization and § 9.4.7 Transfer and Serialization respectively.
Transferring an AudioData or VideoFrame moves its [[resource reference]] to the destination object and closes (as in close()) the source object. Authors MAY use this facility to move an AudioData or VideoFrame between realms without copying the underlying media resource.
Serializing an AudioData or VideoFrame effectively clones (as in clone()) the source object, resulting in two objects that reference the same media resource. Authors MAY use this facility to clone an AudioData or VideoFrame to another realm without copying the underlying media resource.
9.2. AudioData Interface
[Exposed=(Window,DedicatedWorker), Serializable, Transferable]
interface AudioData {
  constructor(AudioDataInit init);

  readonly attribute AudioSampleFormat? format;
  readonly attribute float sampleRate;
  readonly attribute unsigned long numberOfFrames;
  readonly attribute unsigned long numberOfChannels;
  readonly attribute unsigned long long duration;  // microseconds
  readonly attribute long long timestamp;          // microseconds

  unsigned long allocationSize(AudioDataCopyToOptions options);
  undefined copyTo(AllowSharedBufferSource destination, AudioDataCopyToOptions options);
  AudioData clone();
  undefined close();
};

dictionary AudioDataInit {
  required AudioSampleFormat format;
  required float sampleRate;
  [EnforceRange] required unsigned long numberOfFrames;
  [EnforceRange] required unsigned long numberOfChannels;
  [EnforceRange] required long long timestamp;  // microseconds
  required BufferSource data;
  sequence<ArrayBuffer> transfer = [];
};
9.2.1. Internal Slots
-
[[resource reference]]
-
A reference to a media resource that stores the audio sample data for this
AudioData
. -
[[format]]
-
The
AudioSampleFormat
used by thisAudioData
. Will benull
whenever the underlying format does not map to anAudioSampleFormat
or when[[Detached]]
istrue
. -
[[sample rate]]
-
The sample-rate, in Hz, for this
AudioData
. -
[[number of frames]]
-
The number of frames for this AudioData.
-
[[number of channels]]
-
The number of audio channels for this
AudioData
. -
[[timestamp]]
-
The presentation timestamp, in microseconds, for this
AudioData
.
9.2.2. Constructors
AudioData(init)
-
If init is not a valid AudioDataInit , throw a
TypeError
. -
If init .
transfer
contains more than one reference to the sameArrayBuffer
, then throw aDataCloneError
DOMException
. -
For each transferable in init .
transfer
:-
If
[[Detached]]
internal slot istrue
, then throw aDataCloneError
DOMException
.
-
-
Let frame be a new
AudioData
object, initialized as follows:-
Assign
false
to[[Detached]]
. -
Assign init .
format
to[[format]]
. -
Assign init .
sampleRate
to[[sample rate]]
. -
Assign init .
numberOfFrames
to[[number of frames]]
. -
Assign init .
numberOfChannels
to[[number of channels]]
. -
Assign init .
timestamp
to[[timestamp]]
. -
If init .
transfer
contains anArrayBuffer
referenced by init .data
the User Agent MAY choose to:-
Let resource be a new media resource referencing sample data in data .
-
-
Otherwise:
-
Let resource be a media resource containing a copy of init .
data
.
-
-
Let resourceReference be a reference to resource .
-
Assign resourceReference to
[[resource reference]]
.
-
-
For each transferable in init .
transfer
:-
Perform DetachArrayBuffer on transferable
-
-
Return frame .
9.2.3. Attributes
-
format
, of type AudioSampleFormat , readonly, nullable -
The AudioSampleFormat used by this AudioData. Will be null whenever the underlying format does not map to an AudioSampleFormat or when [[Detached]] is true.
The format getter steps are to return [[format]].
-
sampleRate
, of type float , readonly -
The sample-rate, in Hz, for this
AudioData
.The
sampleRate
getter steps are to return[[sample rate]]
. -
numberOfFrames
, of type unsigned long , readonly -
The number of frames for this
AudioData
.The
numberOfFrames
getter steps are to return[[number of frames]]
. -
numberOfChannels
, of type unsigned long , readonly -
The number of audio channels for this
AudioData
.The
numberOfChannels
getter steps are to return[[number of channels]]
. -
timestamp
, of type long long , readonly -
The presentation timestamp, in microseconds, for this AudioData.
The timestamp getter steps are to return [[timestamp]].
-
duration
, of type unsigned long long , readonly -
The duration, in microseconds, for this
AudioData
.The
duration
getter steps are to:-
Let microsecondsPerSecond be
1,000,000
. -
Let durationInSeconds be the result of dividing
[[number of frames]]
by[[sample rate]]
. -
Return the product of durationInSeconds and microsecondsPerSecond .
-
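The duration getter steps translate directly into arithmetic. The sketch below uses a hypothetical function name and plain numbers in place of the internal slots; any rounding applied when the result is exposed as an unsigned long long is left out.

```typescript
function audioDurationMicrosecondsSketch(numberOfFrames: number, sampleRate: number): number {
  const microsecondsPerSecond = 1_000_000;
  // [[number of frames]] divided by [[sample rate]] gives seconds.
  const durationInSeconds = numberOfFrames / sampleRate;
  return durationInSeconds * microsecondsPerSecond;
}
```

For instance, 48000 frames at a sample rate of 48000 Hz last one second, which is 1,000,000 microseconds.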
9.2.4. Methods
-
allocationSize( options )
-
Returns the number of bytes required to hold the samples as described by options .
When invoked, run these steps:
-
If
[[Detached]]
istrue
, throw anInvalidStateError
DOMException
. -
Let copyElementCount be the result of running the Compute Copy Element Count algorithm with options .
-
Let destFormat be the value of
[[format]]
. -
If options .
format
exists , assign options .format
to destFormat . -
Let bytesPerSample be the number of bytes per sample, as defined by the destFormat .
-
Return the product of multiplying bytesPerSample by copyElementCount .
-
-
copyTo( destination , options )
-
Copies the samples from the specified plane of the
AudioData
to the destination buffer.When invoked, run these steps:
-
If
[[Detached]]
istrue
, throw anInvalidStateError
DOMException
. -
Let copyElementCount be the result of running the Compute Copy Element Count algorithm with options .
-
Let destFormat be the value of
[[format]]
. -
If options .
format
exists , assign options .format
to destFormat . -
Let bytesPerSample be the number of bytes per sample, as defined by the destFormat .
-
If the product of multiplying bytesPerSample by copyElementCount is greater than
destination.byteLength
, throw aRangeError
. -
Let resource be the media resource referenced by
[[resource reference]]
. -
Let planeFrames be the region of resource corresponding to options .
planeIndex
. -
Copy elements of planeFrames into destination , starting with the frame positioned at options .
frameOffset
and stopping after copyElementCount samples have been copied. If destFormat does not equal[[format]]
, convert elements to the destFormatAudioSampleFormat
while making the copy.
-
-
clone()
-
Creates a new AudioData with a reference to the same media resource .
When invoked, run these steps:
-
If
[[Detached]]
istrue
, throw anInvalidStateError
DOMException
. -
Return the result of running the Clone AudioData algorithm with this .
-
-
close()
-
Clears all state and releases the reference to the media resource . Close is final.
When invoked, run the Close AudioData algorithm with this .
9.2.5. Algorithms
- Compute Copy Element Count (with options )
-
Run these steps:
-
Let destFormat be the value of
[[format]]
. -
If options .
format
exists , assign options .format
to destFormat . -
If destFormat describes an interleaved
AudioSampleFormat
and options .planeIndex
is greater than0
, throw aRangeError
. -
Otherwise, if destFormat describes a planar AudioSampleFormat and if options.planeIndex is greater than or equal to [[number of channels]], throw a RangeError.
-
If
[[format]]
does not equal destFormat and the User Agent does not support the requestedAudioSampleFormat
conversion, throw aNotSupportedError
DOMException
. Conversion tof32-planar
MUST always be supported. -
Let frameCount be the number of frames in the plane identified by options .
planeIndex
. -
If options .
frameOffset
is greater than or equal to frameCount , throw aRangeError
. -
Let copyFrameCount be the difference of subtracting options .
frameOffset
from frameCount . -
If options .
frameCount
exists :-
If options .
frameCount
is greater than copyFrameCount , throw aRangeError
. -
Otherwise, assign options .
frameCount
to copyFrameCount .
-
-
Let elementCount be copyFrameCount .
-
If destFormat describes an interleaved AudioSampleFormat, multiply elementCount by [[number of channels]].
-
Return elementCount.
-
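The Compute Copy Element Count steps, together with the size calculation used by allocationSize(), can be sketched as below. The interfaces are hypothetical stand-ins for the AudioData slots and for AudioDataCopyToOptions, and the format-conversion support check is skipped: the sketch assumes every conversion is supported.

```typescript
type SampleFormatSketch =
  | "u8" | "s16" | "s32" | "f32"
  | "u8-planar" | "s16-planar" | "s32-planar" | "f32-planar";

interface AudioDataSlotsSketch {
  format: SampleFormatSketch;
  numberOfFrames: number;
  numberOfChannels: number;
}

interface CopyToOptionsSketch {
  planeIndex: number;
  frameOffset?: number;
  frameCount?: number;
  format?: SampleFormatSketch;
}

function isPlanarSketch(format: SampleFormatSketch): boolean {
  return format.indexOf("-planar") !== -1;
}

function bytesPerSampleSketch(format: SampleFormatSketch): number {
  switch (format) {
    case "u8": case "u8-planar": return 1;
    case "s16": case "s16-planar": return 2;
    default: return 4; // s32, f32, and their planar variants
  }
}

function computeCopyElementCountSketch(
  data: AudioDataSlotsSketch,
  options: CopyToOptionsSketch,
): number {
  const destFormat = options.format !== undefined ? options.format : data.format;
  if (!isPlanarSketch(destFormat) && options.planeIndex > 0) {
    throw new RangeError("interleaved data has a single plane");
  } else if (isPlanarSketch(destFormat) && options.planeIndex >= data.numberOfChannels) {
    throw new RangeError("planeIndex exceeds the channel count");
  }
  const frameCount = data.numberOfFrames;
  const frameOffset = options.frameOffset !== undefined ? options.frameOffset : 0;
  if (frameOffset >= frameCount) {
    throw new RangeError("frameOffset is out of range");
  }
  let copyFrameCount = frameCount - frameOffset;
  if (options.frameCount !== undefined) {
    if (options.frameCount > copyFrameCount) {
      throw new RangeError("frameCount is too large");
    }
    copyFrameCount = options.frameCount;
  }
  let elementCount = copyFrameCount;
  if (!isPlanarSketch(destFormat)) {
    // Interleaved: every frame carries one sample per channel.
    elementCount *= data.numberOfChannels;
  }
  return elementCount;
}

// allocationSize(): bytes per sample multiplied by the element count.
function allocationSizeSketch(data: AudioDataSlotsSketch, options: CopyToOptionsSketch): number {
  const destFormat = options.format !== undefined ? options.format : data.format;
  return bytesPerSampleSketch(destFormat) * computeCopyElementCountSketch(data, options);
}
```

With 1024 frames of two-channel f32-planar data, copying plane 1 needs 1024 elements (4096 bytes), while copying 100 frames converted to interleaved s16 needs 200 elements (400 bytes).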
- Clone AudioData (with data )
-
Run these steps:
-
Let clone be a new
AudioData
initialized as follows:-
Let resource be the media resource referenced by data ’s
[[resource reference]]
. -
Let reference be a new reference to resource .
-
Assign reference to
[[resource reference]]
. -
Assign the values of data ’s
[[Detached]]
,[[format]]
,[[sample rate]]
,[[number of frames]]
,[[number of channels]]
, and[[timestamp]]
slots to the corresponding slots in clone .
-
-
Return clone .
-
- Close AudioData (with data )
-
Run these steps:
-
Assign
true
to data ’s[[Detached]]
internal slot. -
Assign
null
to data ’s[[resource reference]]
. -
Assign
0
to data ’s[[sample rate]]
. -
Assign
0
to data ’s[[number of frames]]
. -
Assign
0
to data ’s[[number of channels]]
. -
Assign
null
to data ’s[[format]]
.
-
-
To check if an AudioDataInit is a valid AudioDataInit, run these steps:
-
-
If sampleRate is less than or equal to 0, return false.
-
If
numberOfFrames
=0
, returnfalse
. -
If
numberOfChannels
=0
, returnfalse
. -
Verify
data
has enough data by running the following steps:-
Let totalSamples be the product of multiplying
numberOfFrames
bynumberOfChannels
. -
Let bytesPerSample be the number of bytes per sample, as defined by the
format
. -
Let totalSize be the product of multiplying bytesPerSample with totalSamples .
-
Let dataSize be the size in bytes of
data
. -
If dataSize is less than totalSize , return false.
-
-
Return
true
.
-
NOTE: The memory layout of AudioDataInit’s data is expected to match the planar or interleaved arrangement of its format. There is no real way to verify whether the samples conform to their AudioSampleFormat.
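The valid AudioDataInit check can be sketched as a predicate over a plain object. The `AudioDataInitSketch` shape and the lookup table name are hypothetical; the byte sizes follow from the sample types (8-bit, 16-bit, and 32-bit).

```typescript
// Bytes per sample for each format, derived from the sample type width.
const BYTES_PER_SAMPLE_SKETCH = {
  "u8": 1, "s16": 2, "s32": 4, "f32": 4,
  "u8-planar": 1, "s16-planar": 2, "s32-planar": 4, "f32-planar": 4,
};

interface AudioDataInitSketch {
  format: keyof typeof BYTES_PER_SAMPLE_SKETCH;
  sampleRate: number;
  numberOfFrames: number;
  numberOfChannels: number;
  data: ArrayBuffer | ArrayBufferView;
}

function isValidAudioDataInitSketch(init: AudioDataInitSketch): boolean {
  if (init.sampleRate <= 0) return false;
  if (init.numberOfFrames === 0) return false;
  if (init.numberOfChannels === 0) return false;
  // Verify that data holds at least one sample per frame and channel.
  const totalSamples = init.numberOfFrames * init.numberOfChannels;
  const totalSize = BYTES_PER_SAMPLE_SKETCH[init.format] * totalSamples;
  if (init.data.byteLength < totalSize) return false;
  return true;
}
```

Ten frames of two-channel s16 audio need at least 40 bytes, so a 39-byte buffer fails the check. As noted above, the check cannot tell whether those bytes are laid out as the format expects.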
9.2.6. Transfer and Serialization
-
The
AudioData
transfer steps (with value and dataHolder ) are: -
-
If value ’s
[[Detached]]
istrue
, throw aDataCloneError
DOMException
. -
For all
AudioData
internal slots in value , assign the value of each internal slot to a field in dataHolder with the same name as the internal slot. -
Run the Close AudioData algorithm with value .
-
-
The
AudioData
transfer-receiving steps (with dataHolder and value ) are: -
-
For all named fields in dataHolder , assign the value of each named field to the
AudioData
internal slot in value with the same name as the named field.
-
-
The
AudioData
serialization steps (with value , serialized , and forStorage ) are: -
-
If value ’s
[[Detached]]
istrue
, throw aDataCloneError
DOMException
. -
If forStorage is
true
, throw aDataCloneError
. -
Let resource be the media resource referenced by value ’s
[[resource reference]]
. -
Let newReference be a new reference to resource .
-
Assign newReference to the [[resource reference]] field of serialized.
-
For all remaining
AudioData
internal slots (excluding[[resource reference]]
) in value , assign the value of each internal slot to a field in serialized with the same name as the internal slot.
-
-
The
AudioData
deserialization steps (with serialized and value ) are: -
-
For all named fields in serialized , assign the value of each named field to the
AudioData
internal slot in value with the same name as the named field.
-
9.2.7. AudioDataCopyToOptions
dictionary AudioDataCopyToOptions {
  [EnforceRange] required unsigned long planeIndex;
  [EnforceRange] unsigned long frameOffset = 0;
  [EnforceRange] unsigned long frameCount;
  AudioSampleFormat format;
};
-
planeIndex
, of type unsigned long -
The index identifying the plane to copy from.
-
frameOffset
, of type unsigned long , defaulting to0
-
An offset into the source plane data indicating which frame to begin copying from. Defaults to
0
. -
frameCount
, of type unsigned long -
The number of frames to copy. If not provided, the copy will include all frames in the plane beginning with
frameOffset
. -
format
, of type AudioSampleFormat -
The output AudioSampleFormat for the destination data. If not provided, the resulting copy will use this AudioData’s [[format]]. Invoking copyTo() will throw a NotSupportedError if conversion to the requested format is not supported. Conversion from any AudioSampleFormat to f32-planar MUST always be supported.
NOTE: Authors seeking to integrate with [WEBAUDIO] can request f32-planar and use the resulting copy to create an AudioBuffer or render via AudioWorklet.
9.3. Audio Sample Format
An audio sample format describes the numeric type used to represent a single sample (e.g. 32-bit floating point) and the arrangement of samples from different channels as either interleaved or planar. The audio sample type refers solely to the numeric type and interval used to store the data: u8, s16, s32, or f32 for, respectively, unsigned 8-bit integer, signed 16-bit integer, signed 32-bit integer, and 32-bit floating point numbers. The audio buffer arrangement refers solely to the way the samples are laid out in memory (planar or interleaved).
A sample refers to a single value that is the magnitude of a signal at a particular point in time in a particular channel.
A frame (or sample-frame) refers to a set of values of all channels of a multi-channel signal, that happen at the exact same time.
NOTE: Consequently, if an audio signal is mono (has only one channel), a frame and a sample refer to the same thing.
All audio samples in this specification are using linear pulse-code modulation (Linear PCM): quantization levels are uniform between values.
NOTE: The Web Audio API, that is expected to be used with this specification, also uses Linear PCM.
enum AudioSampleFormat {
  "u8",
  "s16",
  "s32",
  "f32",
  "u8-planar",
  "s16-planar",
  "s32-planar",
  "f32-planar",
};
-
u8
-
8-bit unsigned integer samples with interleaved channel arrangement .
-
s16
-
16-bit signed integer samples with interleaved channel arrangement .
-
s32
-
32-bit signed integer samples with interleaved channel arrangement .
-
f32
-
32-bit float samples with interleaved channel arrangement .
-
u8-planar
-
8-bit unsigned integer samples with planar channel arrangement .
-
s16-planar
-
16-bit signed integer samples with planar channel arrangement .
-
s32-planar
-
32-bit signed integer samples with planar channel arrangement .
-
f32-planar
-
32-bit float samples with planar channel arrangement .
9.3.1. Arrangement of audio buffer
When an AudioData has an AudioSampleFormat that is interleaved, the audio samples from different channels are laid out consecutively in the same buffer, in the order described in the section § 9.3.3 Audio channel ordering. The AudioData has a single plane, that contains a number of elements therefore equal to [[number of frames]] * [[number of channels]].
When an AudioData has an AudioSampleFormat that is planar, the audio samples from different channels are laid out in different buffers, themselves arranged in an order described in the section § 9.3.3 Audio channel ordering. The AudioData has a number of planes equal to the AudioData’s [[number of channels]]. Each plane contains [[number of frames]] elements.
NOTE: The Web Audio API currently uses f32-planar exclusively.
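The two arrangements can be illustrated with a small sketch. The helper names `interleavedIndex` and `deinterleave` are hypothetical and not part of this specification; the sketch only assumes that an interleaved plane holds [[number of frames]] * [[number of channels]] elements and that each planar plane holds [[number of frames]] elements.

```typescript
// Hypothetical helpers for illustration; not part of the WebCodecs API.

/** Index of a sample inside the single plane of an interleaved buffer. */
function interleavedIndex(frame: number, channel: number, numberOfChannels: number): number {
  return frame * numberOfChannels + channel;
}

/** Split one interleaved buffer into one plane per channel (planar layout). */
function deinterleave(interleaved: Float32Array, numberOfChannels: number): Float32Array[] {
  const numberOfFrames = interleaved.length / numberOfChannels;
  const planes: Float32Array[] = [];
  for (let channel = 0; channel < numberOfChannels; channel++) {
    const plane = new Float32Array(numberOfFrames);
    for (let frame = 0; frame < numberOfFrames; frame++) {
      plane[frame] = interleaved[interleavedIndex(frame, channel, numberOfChannels)];
    }
    planes.push(plane);
  }
  return planes;
}

// Stereo, 3 frames, interleaved as L0 R0 L1 R1 L2 R2.
const stereoInterleaved = new Float32Array([0.5, -0.5, 0.25, -0.25, 0.125, -0.125]);
const stereoPlanes = deinterleave(stereoInterleaved, 2);
const leftPlane = stereoPlanes[0];
const rightPlane = stereoPlanes[1];
```

Here `leftPlane` holds the three left-channel samples and `rightPlane` the three right-channel samples, each in frame order.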
9.3.2. Magnitude of the audio samples
The minimum value and maximum value of an audio sample, for a particular audio sample type, are the values below which (respectively above which) audio clipping might occur. They are otherwise regular types, that can hold values outside this interval during intermediate processing.
The bias value for an audio sample type is the value that often corresponds to the middle of the range (but often the range is not symmetrical). An audio buffer comprised only of values equal to the bias value is silent.
Sample type | IDL type | Minimum value | Bias value | Maximum value |
---|---|---|---|---|
u8 | octet | 0 | 128 | +255 |
s16 | short | -32768 | 0 | +32767 |
s32 | long | -2147483648 | 0 | +2147483647 |
f32 | float | -1.0 | 0.0 | +1.0 |
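As a minimal sketch of how the table might be used, the following encodes the minimum, bias, and maximum values and builds a silent buffer by filling it with the bias value. The names `SAMPLE_RANGES`, `createSilence`, and `clampToRange` are hypothetical, and plain number arrays stand in for real sample storage.

```typescript
// Hypothetical helpers for illustration; not part of the WebCodecs API.
type SampleType = "u8" | "s16" | "s32" | "f32";

interface SampleRange {
  min: number;
  bias: number;
  max: number;
}

// Values copied from the table of sample types.
const SAMPLE_RANGES: Record<SampleType, SampleRange> = {
  u8: { min: 0, bias: 128, max: 255 },
  s16: { min: -32768, bias: 0, max: 32767 },
  s32: { min: -2147483648, bias: 0, max: 2147483647 },
  f32: { min: -1.0, bias: 0.0, max: 1.0 },
};

/** A buffer comprised only of the bias value is silent. */
function createSilence(type: SampleType, elementCount: number): number[] {
  return new Array<number>(elementCount).fill(SAMPLE_RANGES[type].bias);
}

/** Clamp a value into the interval outside of which clipping might occur. */
function clampToRange(type: SampleType, value: number): number {
  const range = SAMPLE_RANGES[type];
  return Math.min(range.max, Math.max(range.min, value));
}
```

Note that silence in u8 is a buffer of 128s, not zeros, because u8 is the only listed type whose bias value is non-zero.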
NOTE: There is no data type that can hold 24 bits of information conveniently, but audio content using 24-bit samples is common, so 32-bit integers are commonly used to hold 24-bit content.
AudioData containing 24-bit samples SHOULD store those samples in s32 or f32. When samples are stored in s32, each sample MUST be left-shifted by 8 bits. By virtue of this process, samples outside of the valid 24-bit range ([-8388608, +8388607]) will be clipped. To avoid clipping and ensure lossless transport, samples MAY be converted to f32.
NOTE: While clipping is unavoidable in u8, s16, and s32 samples due to their storage types, implementations SHOULD take care not to clip internally when handling f32 samples.
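The left shift for 24-bit content can be sketched as below. The function names are hypothetical. The f32 conversion divides by 2^23, which is an assumed scaling chosen for illustration; this specification does not mandate a particular conversion formula.

```typescript
// Hypothetical helpers for illustration; not part of the WebCodecs API.

/** Store signed 24-bit samples in s32 by left-shifting each sample by 8 bits. */
function pack24BitIntoS32(samples24: number[]): Int32Array {
  const out = new Int32Array(samples24.length);
  for (let i = 0; i < samples24.length; i++) {
    // Inputs are assumed to already lie in [-8388608, +8388607].
    out[i] = samples24[i] << 8;
  }
  return out;
}

/** Assumed alternative: scale 24-bit samples into f32 by dividing by 2^23. */
function convert24BitToF32(samples24: number[]): Float32Array {
  const out = new Float32Array(samples24.length);
  for (let i = 0; i < samples24.length; i++) {
    out[i] = samples24[i] / 8388608;
  }
  return out;
}
```

After the shift, the 24-bit extremes map onto the s32 minimum and onto a value just below the s32 maximum, so the full s32 interval is used.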
9.3.3. Audio channel ordering
When decoding, the ordering of the audio channels in the resulting AudioData MUST be the same as what is present in the EncodedAudioChunk.
When encoding, the ordering of the audio channels in the resulting EncodedAudioChunk MUST be the same as what is present in the given AudioData.
In other words, no channel reordering is performed when encoding and decoding.
NOTE: The container either implies or specifies the channel mapping: the channel attributed to a particular channel index.
9.4. VideoFrame Interface
NOTE: VideoFrame is a CanvasImageSource. A VideoFrame can be passed to any method accepting a CanvasImageSource, including CanvasDrawImage’s drawImage().
[Exposed=(Window,DedicatedWorker), Serializable, Transferable]
interface VideoFrame {
  constructor(CanvasImageSource image, optional VideoFrameInit init = {});
  constructor(AllowSharedBufferSource data, VideoFrameBufferInit init);

  readonly attribute VideoPixelFormat? format;
  readonly attribute unsigned long codedWidth;
  readonly attribute unsigned long codedHeight;
  readonly attribute DOMRectReadOnly? codedRect;
  readonly attribute DOMRectReadOnly? visibleRect;
  readonly attribute double rotation;
  readonly attribute boolean flip;
  readonly attribute unsigned long displayWidth;
  readonly attribute unsigned long displayHeight;
  readonly attribute unsigned long long? duration;  // microseconds
  readonly attribute long long timestamp;           // microseconds
  readonly attribute VideoColorSpace colorSpace;

  VideoFrameMetadata metadata();

  unsigned long allocationSize(
      optional VideoFrameCopyToOptions options = {});
  Promise<sequence<PlaneLayout>> copyTo(
      AllowSharedBufferSource destination,
      optional VideoFrameCopyToOptions options = {});
  VideoFrame clone();
  undefined close();
};

dictionary VideoFrameInit {
  unsigned long long duration;  // microseconds
  long long timestamp;          // microseconds
  AlphaOption alpha = "keep";

  // Default matches image. May be used to efficiently crop. Will trigger
  // new computation of displayWidth and displayHeight using image's pixel
  // aspect ratio unless an explicit displayWidth and displayHeight are given.
  DOMRectInit visibleRect;

  double rotation = 0;
  boolean flip = false;

  // Default matches image unless visibleRect is provided.
  [EnforceRange] unsigned long displayWidth;
  [EnforceRange] unsigned long displayHeight;

  VideoFrameMetadata metadata;
};

dictionary VideoFrameBufferInit {
  required VideoPixelFormat format;
  required [EnforceRange] unsigned long codedWidth;
  required [EnforceRange] unsigned long codedHeight;
  required [EnforceRange] long long timestamp;  // microseconds
  [EnforceRange] unsigned long long duration;   // microseconds

  // Default layout is tightly-packed.
  sequence<PlaneLayout> layout;

  // Default visible rect is coded size positioned at (0,0)
  DOMRectInit visibleRect;

  double rotation = 0;
  boolean flip = false;

  // Default display dimensions match visibleRect.
  [EnforceRange] unsigned long displayWidth;
  [EnforceRange] unsigned long displayHeight;

  VideoColorSpaceInit colorSpace;

  sequence<ArrayBuffer> transfer = [];

  VideoFrameMetadata metadata;
};

dictionary VideoFrameMetadata {
  // Possible members are recorded in the VideoFrame Metadata Registry.
};
9.4.1. Internal Slots
- [[resource reference]]
-
A reference to the media resource that stores the pixel data for this frame.
- [[format]]
-
A VideoPixelFormat describing the pixel format of the VideoFrame. Will be null whenever the underlying format does not map to a VideoPixelFormat or when [[Detached]] is true.
- [[coded width]]
-
Width of the VideoFrame in pixels, potentially including non-visible padding, and prior to considering potential ratio adjustments.
- [[coded height]]
-
Height of the VideoFrame in pixels, potentially including non-visible padding, and prior to considering potential ratio adjustments.
- [[visible left]]
-
The number of pixels defining the left offset of the visible rectangle.
- [[visible top]]
-
The number of pixels defining the top offset of the visible rectangle.
- [[visible width]]
-
The width of pixels to include in the visible rectangle, starting from [[visible left]].
- [[visible height]]
-
The height of pixels to include in the visible rectangle, starting from [[visible top]].
- [[rotation]]
-
The rotation applied to the VideoFrame when rendered, in degrees clockwise. Rotation applies before flip.
- [[flip]]
-
Whether a horizontal flip is applied to the VideoFrame when rendered. Flip is applied after rotation.
- [[display width]]
-
Width of the VideoFrame when displayed after applying aspect ratio adjustments.
- [[display height]]
-
Height of the VideoFrame when displayed after applying aspect ratio adjustments.
- [[duration]]
-
The presentation duration, given in microseconds. The duration is copied from the EncodedVideoChunk corresponding to this VideoFrame.
- [[timestamp]]
-
The presentation timestamp, given in microseconds. The timestamp is copied from the EncodedVideoChunk corresponding to this VideoFrame.
- [[color space]]
-
The VideoColorSpace associated with this frame.
- [[metadata]]
-
The VideoFrameMetadata associated with this frame. Possible members are recorded in [webcodecs-video-frame-metadata-registry]. By design, all VideoFrameMetadata properties are serializable.
9.4.2. Constructors
VideoFrame(image, init)
- Check the usability of the image argument. If this throws an exception or returns bad, then throw an InvalidStateError DOMException.
- If image is not origin-clean, then throw a SecurityError DOMException.
- Let frame be a new VideoFrame.
- Switch on image:
  NOTE: Authors are encouraged to provide a meaningful timestamp unless it is implicitly provided by the CanvasImageSource at construction. Interfaces that consume VideoFrames can rely on this value for timing decisions. For example, VideoEncoder can use timestamp values to guide rate control (see framerate).
  - HTMLImageElement, SVGImageElement
    - If image’s media data has no natural dimensions (e.g., it’s a vector graphic with no specified content size), then throw an InvalidStateError DOMException.
    - Let resource be a new media resource containing a copy of image’s media data. If this is an animated image, image’s bitmap data MUST only be taken from the default image of the animation (the one that the format defines is to be used when animation is not supported or is disabled), or, if there is no such image, the first frame of the animation.
    - Let codedWidth and codedHeight be the width and height of resource.
    - Let baseRotation and baseFlip describe the rotation and flip of image relative to resource.
    - Let defaultDisplayWidth and defaultDisplayHeight be the natural width and natural height of image.
    - Run the Initialize Frame With Resource algorithm with init, frame, resource, codedWidth, codedHeight, baseRotation, baseFlip, defaultDisplayWidth, and defaultDisplayHeight.
  - HTMLVideoElement
    - If image’s networkState attribute is NETWORK_EMPTY, then throw an InvalidStateError DOMException.
    - Let currentPlaybackFrame be the VideoFrame at the current playback position.
    - If metadata does not exist in init, assign currentPlaybackFrame.[[metadata]] to it.
    - Run the Initialize Frame From Other Frame algorithm with init, frame, and currentPlaybackFrame.
  - HTMLCanvasElement, ImageBitmap, OffscreenCanvas
    - Let resource be a new media resource containing a copy of image’s bitmap data.
      NOTE: Implementers are encouraged to avoid a deep copy by using reference counting where feasible.
    - Let width be image.width and height be image.height.
    - Run the Initialize Frame With Resource algorithm with init, frame, resource, width, height, 0, false, width, and height.
  - VideoFrame
    - Run the Initialize Frame From Other Frame algorithm with init, frame, and image.
- Return frame.
VideoFrame(data, init)
- If init is not a valid VideoFrameBufferInit, throw a TypeError.
- Let defaultRect be «[ "x" → 0, "y" → 0, "width" → init.codedWidth, "height" → init.codedHeight ]».
- Let overrideRect be undefined.
- If init.visibleRect exists, assign its value to overrideRect.
- Let parsedRect be the result of running the Parse Visible Rect algorithm with defaultRect, overrideRect, init.codedWidth, init.codedHeight, and init.format.
- If parsedRect is an exception, throw parsedRect.
- Let optLayout be undefined.
- If init.layout exists, assign its value to optLayout.
- Let combinedLayout be the result of running the Compute Layout and Allocation Size algorithm with parsedRect, init.format, and optLayout.
- If combinedLayout is an exception, throw combinedLayout.
- If data.byteLength is less than combinedLayout’s allocationSize, throw a TypeError.
- If init.transfer contains more than one reference to the same ArrayBuffer, then throw a DataCloneError DOMException.
- For each transferable in init.transfer:
  - If transferable’s [[Detached]] internal slot is true, then throw a DataCloneError DOMException.
- If init.transfer contains an ArrayBuffer referenced by data, the User Agent MAY choose to:
  - Let resource be a new media resource referencing pixel data in data.
- Otherwise:
  - Let resource be a new media resource containing a copy of data. Use visibleRect and layout to determine where in data the pixels for each plane reside.
    The User Agent MAY choose to allocate resource with a larger coded size and plane strides to improve memory alignment. Increases will be reflected by codedWidth and codedHeight. Additionally, the User Agent MAY use visibleRect to copy only the visible rectangle. It MAY also reposition the visible rectangle within resource. The final position will be reflected by visibleRect.
- For each transferable in init.transfer:
  - Perform DetachArrayBuffer on transferable.
- Let resourceCodedWidth be the coded width of resource.
- Let resourceCodedHeight be the coded height of resource.
- Let resourceVisibleLeft be the left offset for the visible rectangle of resource.
- Let resourceVisibleTop be the top offset for the visible rectangle of resource.
  The spec SHOULD provide definitions (and possibly diagrams) for coded size, visible rectangle, and display size. See #166.
- Let frame be a new VideoFrame object initialized as follows:
  - Assign resourceCodedWidth, resourceCodedHeight, resourceVisibleLeft, and resourceVisibleTop to [[coded width]], [[coded height]], [[visible left]], and [[visible top]] respectively.
  - If init.visibleRect exists:
    - Let truncatedVisibleWidth be the value of visibleRect.width after truncating.
    - Assign truncatedVisibleWidth to [[visible width]].
    - Let truncatedVisibleHeight be the value of visibleRect.height after truncating.
    - Assign truncatedVisibleHeight to [[visible height]].
  - Otherwise:
    - Assign [[coded width]] to [[visible width]].
    - Assign [[coded height]] to [[visible height]].
  - Assign the result of running the Parse Rotation algorithm, with init.rotation, to [[rotation]].
  - If displayWidth and displayHeight exist in init, assign them to [[display width]] and [[display height]] respectively.
  - Otherwise:
    - If [[rotation]] is equal to 0 or 180:
      - Assign [[visible width]] to [[display width]].
      - Assign [[visible height]] to [[display height]].
    - Otherwise:
      - Assign [[visible height]] to [[display width]].
      - Assign [[visible width]] to [[display height]].
  - Assign init’s timestamp and duration to [[timestamp]] and [[duration]] respectively.
  - Let colorSpace be undefined.
  - If init.colorSpace exists, assign its value to colorSpace.
  - Assign init’s format to [[format]].
  - Assign the result of running the Pick Color Space algorithm, with colorSpace and [[format]], to [[color space]].
  - Assign the result of calling Copy VideoFrame metadata with init’s metadata to frame.[[metadata]].
- Return frame.
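The display-size defaults used by the VideoFrame(data, init) steps can be sketched as a small pure function. The name `resolveDisplaySize` and the `DisplaySize` shape are hypothetical; the sketch assumes the rotation has already been normalized by Parse Rotation to 0, 90, 180, or 270.

```typescript
// Hypothetical helper for illustration; not part of the WebCodecs API.
interface DisplaySize {
  displayWidth: number;
  displayHeight: number;
}

function resolveDisplaySize(
  visibleWidth: number,
  visibleHeight: number,
  rotation: number, // assumed already normalized to 0, 90, 180, or 270
  explicit?: DisplaySize // init.displayWidth and init.displayHeight, when both exist
): DisplaySize {
  if (explicit !== undefined) {
    return explicit;
  }
  if (rotation === 0 || rotation === 180) {
    return { displayWidth: visibleWidth, displayHeight: visibleHeight };
  }
  // 90 or 270 degrees: the visible width and height swap roles.
  return { displayWidth: visibleHeight, displayHeight: visibleWidth };
}
```

A 640x480 visible rectangle therefore displays as 640x480 at rotation 0 and as 480x640 at rotation 90, unless explicit display dimensions are given.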
9.4.3. Attributes
- format, of type VideoPixelFormat, readonly, nullable
-
Describes the arrangement of bytes in each plane as well as the number and order of the planes. Will be null whenever the underlying format does not map to a VideoPixelFormat or when [[Detached]] is true.
The format getter steps are to return [[format]].
- codedWidth, of type unsigned long, readonly
-
Width of the VideoFrame in pixels, potentially including non-visible padding, and prior to considering potential ratio adjustments.
The codedWidth getter steps are to return [[coded width]].
- codedHeight, of type unsigned long, readonly
-
Height of the VideoFrame in pixels, potentially including non-visible padding, and prior to considering potential ratio adjustments.
The codedHeight getter steps are to return [[coded height]].
- codedRect, of type DOMRectReadOnly, readonly, nullable
-
A DOMRectReadOnly with width and height matching codedWidth and codedHeight and x and y at (0,0). Offered for convenience for use with allocationSize() and copyTo().
The codedRect getter steps are:
  - If [[Detached]] is true, return null.
  - Let rect be a new DOMRectReadOnly, initialized as follows:
    - Assign [[coded width]] and [[coded height]] to width and height respectively.
  - Return rect.
- visibleRect, of type DOMRectReadOnly, readonly, nullable
-
A DOMRectReadOnly describing the visible rectangle of pixels for this VideoFrame.
The visibleRect getter steps are:
  - If [[Detached]] is true, return null.
  - Let rect be a new DOMRectReadOnly, initialized as follows:
    - Assign [[visible left]], [[visible top]], [[visible width]], and [[visible height]] to x, y, width, and height respectively.
  - Return rect.
- rotation, of type double, readonly
-
The rotation applied to the VideoFrame when rendered, in degrees clockwise. Rotation applies before flip.
The rotation getter steps are to return [[rotation]].
- flip, of type boolean, readonly
-
Whether a horizontal flip is applied to the VideoFrame when rendered. Flip applies after rotation.
The flip getter steps are to return [[flip]].
- displayWidth, of type unsigned long, readonly
-
Width of the VideoFrame when displayed after applying rotation and aspect ratio adjustments.
The displayWidth getter steps are to return [[display width]].
- displayHeight, of type unsigned long, readonly
-
Height of the VideoFrame when displayed after applying rotation and aspect ratio adjustments.
The displayHeight getter steps are to return [[display height]].
- timestamp, of type long long, readonly
-
The presentation timestamp, given in microseconds. For decode, timestamp is copied from the EncodedVideoChunk corresponding to this VideoFrame. For encode, timestamp is copied to the EncodedVideoChunks corresponding to this VideoFrame.
The timestamp getter steps are to return [[timestamp]].
- duration, of type unsigned long long, readonly, nullable
-
The presentation duration, given in microseconds. The duration is copied from the EncodedVideoChunk corresponding to this VideoFrame.
The duration getter steps are to return [[duration]].
- colorSpace, of type VideoColorSpace, readonly
-
The VideoColorSpace associated with this frame.
The colorSpace getter steps are to return [[color space]].
9.4.4. Internal Structures
A combined buffer layout is a struct that consists of:
- An allocationSize (an unsigned long)
- A computedLayouts (a list of computed plane layout structs).
A computed plane layout is a struct that consists of:
- A destinationOffset (an unsigned long)
- A destinationStride (an unsigned long)
- A sourceTop (an unsigned long)
- A sourceHeight (an unsigned long)
- A sourceLeftBytes (an unsigned long)
- A sourceWidthBytes (an unsigned long)
9.4.5. Methods
- allocationSize(options)
-
Returns the minimum byte length for a valid destination BufferSource to be used with copyTo() with the given options.
When invoked, run these steps:
  - If [[Detached]] is true, throw an InvalidStateError DOMException.
  - If [[format]] is null, throw a NotSupportedError DOMException.
  - Let combinedLayout be the result of running the Parse VideoFrameCopyToOptions algorithm with options.
  - If combinedLayout is an exception, throw combinedLayout.
  - Return combinedLayout’s allocationSize.
- copyTo(destination, options)
-
Asynchronously copies the planes of this frame into destination according to options. The format of the data is options.format, if it exists, or this VideoFrame’s format otherwise.
NOTE: Promises that are returned by several calls to copyTo() are not guaranteed to resolve in the order they were returned.
When invoked, run these steps:
  - If [[Detached]] is true, return a promise rejected with an InvalidStateError DOMException.
  - If [[format]] is null, return a promise rejected with a NotSupportedError DOMException.
  - Let combinedLayout be the result of running the Parse VideoFrameCopyToOptions algorithm with options.
  - If combinedLayout is an exception, return a promise rejected with combinedLayout.
  - If destination.byteLength is less than combinedLayout’s allocationSize, return a promise rejected with a TypeError.
  - If options.format is equal to one of RGBA, RGBX, BGRA, BGRX then:
    - Let newOptions be the result of running the Clone Configuration algorithm with options.
    - Assign undefined to newOptions.format.
    - Let rgbFrame be the result of running the Convert to RGB frame algorithm with this, options.format, and options.colorSpace.
    - Return the result of calling copyTo() on rgbFrame with destination and newOptions.
  - Let p be a new Promise.
  - Let copyStepsQueue be the result of starting a new parallel queue.
  - Let planeLayouts be a new list.
  - Enqueue the following steps to copyStepsQueue:
    - Let resource be the media resource referenced by [[resource reference]].
    - Let numPlanes be the number of planes as defined by [[format]].
    - Let planeIndex be 0.
    - While planeIndex is less than numPlanes:
      - Let sourceStride be the stride of the plane in resource as identified by planeIndex.
      - Let computedLayout be the computed plane layout in combinedLayout’s computedLayouts at the position of planeIndex.
      - Let sourceOffset be the product of multiplying computedLayout’s sourceTop by sourceStride.
      - Add computedLayout’s sourceLeftBytes to sourceOffset.
      - Let destinationOffset be computedLayout’s destinationOffset.
      - Let rowBytes be computedLayout’s sourceWidthBytes.
      - Let layout be a new PlaneLayout, with offset set to destinationOffset and stride set to rowBytes.
      - Let row be 0.
      - While row is less than computedLayout’s sourceHeight:
        - Copy rowBytes bytes from resource starting at sourceOffset to destination starting at destinationOffset.
        - Increment sourceOffset by sourceStride.
        - Increment destinationOffset by computedLayout’s destinationStride.
        - Increment row by 1.
      - Increment planeIndex by 1.
      - Append layout to planeLayouts.
    - Queue a task to resolve p with planeLayouts.
  - Return p.
- clone()
-
Creates a new VideoFrame with a reference to the same media resource.
When invoked, run these steps:
  - If [[Detached]] is true, throw an InvalidStateError DOMException.
  - Return the result of running the Clone VideoFrame algorithm with this.
- close()
-
Clears all state and releases the reference to the media resource. Close is final.
When invoked, run the Close VideoFrame algorithm with this.
- metadata()
-
Gets the VideoFrameMetadata associated with this frame.
When invoked, run these steps:
  - If [[Detached]] is true, throw an InvalidStateError DOMException.
  - Return the result of calling Copy VideoFrame metadata with [[metadata]].
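The per-plane row copy inside the copyTo() steps can be sketched as follows. The function name `copyPlaneRows` and the TypeScript interface are hypothetical; the field names mirror the computed plane layout struct, and byte arrays stand in for the media resource and the destination buffer.

```typescript
// Hypothetical sketch for illustration; not part of the WebCodecs API.
interface ComputedPlaneLayoutSketch {
  destinationOffset: number;
  destinationStride: number;
  sourceTop: number;
  sourceHeight: number;
  sourceLeftBytes: number;
  sourceWidthBytes: number;
}

/** Copy one plane row by row and return the resulting offset and stride. */
function copyPlaneRows(
  source: Uint8Array,
  sourceStride: number,
  computedLayout: ComputedPlaneLayoutSketch,
  destination: Uint8Array
): { offset: number; stride: number } {
  let sourceOffset = computedLayout.sourceTop * sourceStride + computedLayout.sourceLeftBytes;
  let destinationOffset = computedLayout.destinationOffset;
  const rowBytes = computedLayout.sourceWidthBytes;
  // As in the steps above, the reported layout uses the starting offset and rowBytes.
  const layout = { offset: destinationOffset, stride: rowBytes };
  for (let row = 0; row < computedLayout.sourceHeight; row++) {
    destination.set(source.subarray(sourceOffset, sourceOffset + rowBytes), destinationOffset);
    sourceOffset += sourceStride;
    destinationOffset += computedLayout.destinationStride;
  }
  return layout;
}
```

For example, copying a 2x2 region at offset (1,1) out of a 4-byte-wide, 3-row plane reads two bytes from each of rows 1 and 2 and writes them back to back when the destination stride is 2.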
9.4.6. Algorithms
- Create a VideoFrame (with output, timestamp, duration, displayAspectWidth, displayAspectHeight, colorSpace, rotation, and flip)
-
  - Let frame be a new VideoFrame, constructed as follows:
    - Assign false to [[Detached]].
    - Let resource be the media resource described by output.
    - Let resourceReference be a reference to resource.
    - Assign resourceReference to [[resource reference]].
    - If output uses a recognized VideoPixelFormat, assign that format to [[format]]. Otherwise, assign null to [[format]].
    - Let codedWidth and codedHeight be the coded width and height of the output in pixels.
    - Let visibleLeft, visibleTop, visibleWidth, and visibleHeight be the left, top, width and height for the visible rectangle of output.
    - Let displayWidth and displayHeight be the display size of output in pixels.
    - If displayAspectWidth and displayAspectHeight are provided, increase displayWidth or displayHeight until the ratio of displayWidth to displayHeight matches the ratio of displayAspectWidth to displayAspectHeight.
    - Assign codedWidth, codedHeight, visibleLeft, visibleTop, visibleWidth, visibleHeight, displayWidth, and displayHeight to [[coded width]], [[coded height]], [[visible left]], [[visible top]], [[visible width]], [[visible height]], [[display width]], and [[display height]] respectively.
    - Assign duration and timestamp to [[duration]] and [[timestamp]] respectively.
    - Assign [[color space]] with the result of running the Pick Color Space algorithm, with colorSpace and [[format]].
  - Return frame.
- Pick Color Space (with overrideColorSpace and format)
-
  - If overrideColorSpace is provided, return a new VideoColorSpace constructed with overrideColorSpace.
    User Agents MAY replace null members of the provided overrideColorSpace with guessed values as determined by implementer defined heuristics.
  - Otherwise, if format is an RGB format, return a new instance of the sRGB Color Space.
  - Otherwise, return a new instance of the REC709 Color Space.
- Validate VideoFrameInit (with format, codedWidth, and codedHeight):
-
  - If visibleRect exists:
    - Let validAlignment be the result of running the Verify Rect Offset Alignment algorithm with format and visibleRect.
    - If validAlignment is false, return false.
    - If any attribute of visibleRect is negative or not finite, return false.
    - If visibleRect.width == 0 or visibleRect.height == 0, return false.
    - If visibleRect.y + visibleRect.height > codedHeight, return false.
    - If visibleRect.x + visibleRect.width > codedWidth, return false.
  - If codedWidth == 0 or codedHeight == 0, return false.
  - If only one of displayWidth or displayHeight exists, return false.
  - If displayWidth == 0 or displayHeight == 0, return false.
  - Return true.
- To check if a VideoFrameBufferInit is a valid VideoFrameBufferInit, run these steps:
-
  - If codedWidth == 0 or codedHeight == 0, return false.
  - If any attribute of visibleRect is negative or not finite, return false.
  - If visibleRect.y + visibleRect.height > codedHeight, return false.
  - If visibleRect.x + visibleRect.width > codedWidth, return false.
  - If only one of displayWidth or displayHeight exists, return false.
  - If displayWidth == 0 or displayHeight == 0, return false.
  - Return true.
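The validity check above can be sketched as a predicate over a plain object. The names `BufferInitSketch` and `isValidBufferInit` are hypothetical. The sketch assumes the rectangle checks are skipped when visibleRect is absent and that missing rectangle members default to 0, as they do for DOMRectInit.

```typescript
// Hypothetical sketch for illustration; not part of the WebCodecs API.
interface RectInitSketch {
  x?: number;
  y?: number;
  width?: number;
  height?: number;
}

interface BufferInitSketch {
  codedWidth: number;
  codedHeight: number;
  visibleRect?: RectInitSketch;
  displayWidth?: number;
  displayHeight?: number;
}

function isValidBufferInit(init: BufferInitSketch): boolean {
  if (init.codedWidth === 0 || init.codedHeight === 0) return false;
  const rect = init.visibleRect;
  if (rect !== undefined) {
    // Assumption: absent members take the DOMRectInit default of 0.
    const x = rect.x ?? 0;
    const y = rect.y ?? 0;
    const width = rect.width ?? 0;
    const height = rect.height ?? 0;
    if ([x, y, width, height].some((v) => v < 0 || !Number.isFinite(v))) return false;
    if (y + height > init.codedHeight) return false;
    if (x + width > init.codedWidth) return false;
  }
  const hasDisplayWidth = init.displayWidth !== undefined;
  const hasDisplayHeight = init.displayHeight !== undefined;
  if (hasDisplayWidth !== hasDisplayHeight) return false;
  if (init.displayWidth === 0 || init.displayHeight === 0) return false;
  return true;
}
```

A caller could run such a predicate before constructing a frame to surface a descriptive error earlier than the TypeError thrown by the constructor.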
- Initialize Frame From Other Frame (with init, frame, and otherFrame)
-
  - Let format be otherFrame.format.
  - If init.alpha is discard, assign otherFrame.format’s equivalent opaque format to format.
  - Let validInit be the result of running the Validate VideoFrameInit algorithm with format and otherFrame’s [[coded width]] and [[coded height]].
  - If validInit is false, throw a TypeError.
  - Let resource be the media resource referenced by otherFrame’s [[resource reference]].
  - Assign a new reference for resource to frame’s [[resource reference]].
  - Assign the following attributes from otherFrame to frame: codedWidth, codedHeight, colorSpace.
  - Let defaultVisibleRect be the result of performing the getter steps for visibleRect on otherFrame.
  - Let baseRotation and baseFlip be otherFrame’s [[rotation]] and [[flip]], respectively.
  - Let defaultDisplayWidth and defaultDisplayHeight be otherFrame’s [[display width]] and [[display height]], respectively.
  - Run the Initialize Visible Rect, Orientation, and Display Size algorithm with init, frame, defaultVisibleRect, baseRotation, baseFlip, defaultDisplayWidth, and defaultDisplayHeight.
  - If duration exists in init, assign it to frame’s [[duration]]. Otherwise, assign otherFrame.duration to frame’s [[duration]].
  - If timestamp exists in init, assign it to frame’s [[timestamp]]. Otherwise, assign otherFrame’s timestamp to frame’s [[timestamp]].
  - Assign format to frame.[[format]].
  - Assign the result of calling Copy VideoFrame metadata with init’s metadata to frame.[[metadata]].
- Initialize Frame With Resource (with init, frame, resource, codedWidth, codedHeight, baseRotation, baseFlip, defaultDisplayWidth, and defaultDisplayHeight)
-
  - Let format be null.
  - If resource uses a recognized VideoPixelFormat, assign the VideoPixelFormat of resource to format.
  - Let validInit be the result of running the Validate VideoFrameInit algorithm with format, codedWidth, and codedHeight.
  - If validInit is false, throw a TypeError.
  - Assign a new reference for resource to frame’s [[resource reference]].
  - If init.alpha is discard, assign format’s equivalent opaque format to format.
  - Assign format to [[format]].
  - Assign codedWidth and codedHeight to frame’s [[coded width]] and [[coded height]] respectively.
  - Let defaultVisibleRect be a new DOMRect constructed with «[ "x" → 0, "y" → 0, "width" → codedWidth, "height" → codedHeight ]».
  - Run the Initialize Visible Rect, Orientation, and Display Size algorithm with init, frame, defaultVisibleRect, baseRotation, baseFlip, defaultDisplayWidth, and defaultDisplayHeight.
  - Assign init.duration to frame’s [[duration]].
  - Assign init.timestamp to frame’s [[timestamp]].
  - If resource has a known VideoColorSpace, assign its value to [[color space]].
  - Otherwise, assign a new VideoColorSpace, constructed with an empty VideoColorSpaceInit, to [[color space]].
- Initialize Visible Rect, Orientation, and Display Size (with init, frame, defaultVisibleRect, baseRotation, baseFlip, defaultDisplayWidth and defaultDisplayHeight)
-
  - Let visibleRect be defaultVisibleRect.
  - If init.visibleRect exists, assign it to visibleRect.
  - Assign visibleRect’s x, y, width, and height to frame’s [[visible left]], [[visible top]], [[visible width]], and [[visible height]] respectively.
  - Let rotation be the result of running the Parse Rotation algorithm, with init.rotation.
  - Assign the result of running the Add Rotations algorithm, with baseRotation, baseFlip, and rotation, to frame’s [[rotation]].
  - If baseFlip is equal to init.flip, assign false to frame’s [[flip]]. Otherwise, assign true to frame’s [[flip]].
  - If displayWidth and displayHeight exist in init, assign them to [[display width]] and [[display height]] respectively.
  - Otherwise:
    - If baseRotation is equal to 0 or 180:
    - Otherwise:
    - Let displayWidth be frame’s [[visible width]] * widthScale, rounded to the nearest integer.
    - Let displayHeight be frame’s [[visible height]] * heightScale, rounded to the nearest integer.
    - If rotation is equal to 0 or 180:
      - Assign displayWidth to frame’s [[display width]].
      - Assign displayHeight to frame’s [[display height]].
    - Otherwise:
      - Assign displayHeight to frame’s [[display width]].
      - Assign displayWidth to frame’s [[display height]].
- Clone VideoFrame (with frame)
-
  - Let clone be a new VideoFrame initialized as follows:
    - Let resource be the media resource referenced by frame’s [[resource reference]].
    - Let newReference be a new reference to resource.
    - Assign newReference to clone’s [[resource reference]].
    - Assign all remaining internal slots of frame (excluding [[resource reference]]) to those of the same name in clone.
  - Return clone.
- Close VideoFrame (with frame)
-
  - Assign null to frame’s [[resource reference]].
  - Assign true to frame’s [[Detached]].
  - Assign null to frame’s [[format]].
  - Assign 0 to frame’s [[coded width]], [[coded height]], [[visible left]], [[visible top]], [[visible width]], [[visible height]], [[rotation]], [[display width]], and [[display height]].
  - Assign false to frame’s [[flip]].
  - Assign a new VideoFrameMetadata to frame.[[metadata]].
- Parse Rotation (with rotation)
-
  - Let alignedRotation be the nearest multiple of 90 to rotation, rounding ties towards positive infinity.
  - Let fullTurns be the greatest multiple of 360 less than or equal to alignedRotation.
  - Return alignedRotation - fullTurns.
- Add Rotations (with baseRotation, baseFlip, and rotation)
-
  - If baseFlip is false, let combinedRotation be baseRotation + rotation. Otherwise, let combinedRotation be baseRotation - rotation.
  - Let fullTurns be the greatest multiple of 360 less than or equal to combinedRotation.
  - Return combinedRotation - fullTurns.
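Parse Rotation and Add Rotations translate directly into arithmetic. The following is a minimal sketch with hypothetical function names; it uses `Math.floor(x + 0.5)` to round ties towards positive infinity and `Math.floor` again to find the greatest multiple of 360 that does not exceed the value.

```typescript
// Hypothetical helpers for illustration; not part of the WebCodecs API.

/** Snap to the nearest multiple of 90 (ties towards +infinity), then wrap into [0, 360). */
function parseRotation(rotation: number): number {
  const alignedRotation = Math.floor(rotation / 90 + 0.5) * 90;
  const fullTurns = Math.floor(alignedRotation / 360) * 360;
  return alignedRotation - fullTurns;
}

/** Combine a base rotation with an additional rotation, honoring the base flip. */
function addRotations(baseRotation: number, baseFlip: boolean, rotation: number): number {
  const combinedRotation = baseFlip ? baseRotation - rotation : baseRotation + rotation;
  const fullTurns = Math.floor(combinedRotation / 360) * 360;
  return combinedRotation - fullTurns;
}
```

Because fullTurns is rounded down, negative inputs wrap into the positive range: a rotation of -90 becomes 270.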
- Parse VideoFrameCopyToOptions (with options)
-
  - Let defaultRect be the result of performing the getter steps for visibleRect.
  - Let overrideRect be undefined.
  - If options.rect exists, assign the value of options.rect to overrideRect.
  - Let parsedRect be the result of running the Parse Visible Rect algorithm with defaultRect, overrideRect, [[coded width]], [[coded height]], and [[format]].
  - If parsedRect is an exception, return parsedRect.
  - Let optLayout be undefined.
  - If options.layout exists, assign its value to optLayout.
  - Let format be undefined.
  - If options.format does not exist, assign [[format]] to format.
  - Otherwise, if options.format is equal to one of RGBA, RGBX, BGRA, BGRX, then assign options.format to format; otherwise, return a NotSupportedError DOMException.
  - Let combinedLayout be the result of running the Compute Layout and Allocation Size algorithm with parsedRect, format, and optLayout.
  - Return combinedLayout.
- Verify Rect Offset Alignment (with format and rect)
-
  - If format is null, return true.
  - Let planeIndex be 0.
  - Let numPlanes be the number of planes as defined by format.
  - While planeIndex is less than numPlanes:
    - Let plane be the Plane identified by planeIndex as defined by format.
    - Let sampleWidth be the horizontal sub-sampling factor of each subsample for plane.
    - Let sampleHeight be the vertical sub-sampling factor of each subsample for plane.
    - If rect.x is not a multiple of sampleWidth, return false.
    - If rect.y is not a multiple of sampleHeight, return false.
    - Increment planeIndex by 1.
  - Return true.
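The alignment check can be sketched with a small table of per-plane sub-sampling factors. The table `PLANE_SUBSAMPLING` and the function name are hypothetical, and only two formats are listed for illustration: I420 (chroma planes sub-sampled by 2 in both directions) and RGBA (a single plane with no sub-sampling).

```typescript
// Hypothetical sketch for illustration; not part of the WebCodecs API.
interface PlaneSubsampling {
  sampleWidth: number; // horizontal sub-sampling factor
  sampleHeight: number; // vertical sub-sampling factor
}

// Illustrative subset of formats only.
const PLANE_SUBSAMPLING: Record<string, PlaneSubsampling[]> = {
  I420: [
    { sampleWidth: 1, sampleHeight: 1 }, // Y
    { sampleWidth: 2, sampleHeight: 2 }, // U
    { sampleWidth: 2, sampleHeight: 2 }, // V
  ],
  RGBA: [{ sampleWidth: 1, sampleHeight: 1 }],
};

function verifyRectOffsetAlignment(format: string | null, rect: { x: number; y: number }): boolean {
  if (format === null) return true;
  const planes = PLANE_SUBSAMPLING[format];
  if (planes === undefined) {
    throw new Error("format not covered by this sketch: " + format);
  }
  for (const plane of planes) {
    if (rect.x % plane.sampleWidth !== 0) return false;
    if (rect.y % plane.sampleHeight !== 0) return false;
  }
  return true;
}
```

With these factors, an I420 rectangle has to start on even coordinates, while an RGBA rectangle can start at any integer offset.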
- Parse Visible Rect (with defaultRect , overrideRect , codedWidth , codedHeight , and format )
-
-
Let sourceRect be defaultRect
-
If overrideRect is not
undefined
:-
If either of overrideRect .
width
orheight
is0
, return aTypeError
. -
If the sum of overrideRect .
x
and overrideRect .width
is greater than codedWidth , return aTypeError
. -
If the sum of overrideRect .
y
and overrideRect .height
is greater than codedHeight , return aTypeError
. -
Assign overrideRect to sourceRect .
-
-
Let validAlignment be the result of running the Verify Rect Offset Alignment algorithm with format and sourceRect .
-
If validAlignment is
false
, throw aTypeError
. -
Return sourceRect .
-
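A minimal sketch of the Parse Visible Rect steps follows. To keep it self-contained, the alignment check is passed in as a predicate (`isOffsetAligned`, a hypothetical parameter) instead of calling Verify Rect Offset Alignment directly, and every failure is reported by throwing a `TypeError`.

```typescript
interface RectSketch {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Hypothetical helper; the predicate stands in for Verify Rect Offset Alignment.
function parseVisibleRect(
  defaultRect: RectSketch,
  overrideRect: RectSketch | undefined,
  codedWidth: number,
  codedHeight: number,
  isOffsetAligned: (rect: RectSketch) => boolean,
): RectSketch {
  let sourceRect = defaultRect;
  if (overrideRect !== undefined) {
    if (overrideRect.width === 0 || overrideRect.height === 0) {
      throw new TypeError("rect must not be empty");
    }
    if (overrideRect.x + overrideRect.width > codedWidth) {
      throw new TypeError("rect exceeds the coded width");
    }
    if (overrideRect.y + overrideRect.height > codedHeight) {
      throw new TypeError("rect exceeds the coded height");
    }
    sourceRect = overrideRect;
  }
  // The alignment check applies to the default rect as well as to an override.
  if (!isOffsetAligned(sourceRect)) {
    throw new TypeError("rect offset is not sample-aligned");
  }
  return sourceRect;
}
```

Note that the bounds checks only apply to an override, while the alignment check runs on whichever rect was selected.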
- Compute Layout and Allocation Size (with parsedRect , format , and layout )
-
-
Let numPlanes be the number of planes as defined by format .
-
If layout is not
undefined
and its length does not equal numPlanes , throw aTypeError
. -
Let minAllocationSize be
0
. -
Let computedLayouts be a new list .
-
Let endOffsets be a new list .
-
Let planeIndex be
0
. -
While planeIndex < numPlanes :
-
Let plane be the Plane identified by planeIndex as defined by format .
-
Let sampleBytes be the number of bytes per sample for plane .
-
Let sampleWidth be the horizontal sub-sampling factor of each subsample for plane .
-
Let sampleHeight be the vertical sub-sampling factor of each subsample for plane .
-
Let computedLayout be a new computed plane layout .
-
Set computedLayout ’s sourceTop to the result of the division of truncated parsedRect .
y
by sampleHeight , rounded up to the nearest integer. -
Set computedLayout ’s sourceHeight to the result of the division of truncated parsedRect .
height
by sampleHeight , rounded up to the nearest integer. -
Set computedLayout ’s sourceLeftBytes to the result of the integer division of truncated parsedRect .
x
by sampleWidth , multiplied by sampleBytes . -
Set computedLayout ’s sourceWidthBytes to the result of the integer division of truncated parsedRect .
width
by sampleWidth , multiplied by sampleBytes . -
If layout is not
undefined
:-
Let planeLayout be the
PlaneLayout
in layout at position planeIndex . -
If planeLayout .
stride
is less than computedLayout ’s sourceWidthBytes , return aTypeError
. -
Assign planeLayout .
offset
to computedLayout ’s destinationOffset . -
Assign planeLayout .
stride
to computedLayout ’s destinationStride .
-
-
Otherwise:
NOTE: If an explicit layout was not provided, the following steps default to tight packing.
-
Assign minAllocationSize to computedLayout ’s destinationOffset .
-
Assign computedLayout ’s sourceWidthBytes to computedLayout ’s destinationStride .
-
-
Let planeSize be the product of multiplying computedLayout ’s destinationStride and sourceHeight .
-
Let planeEnd be the sum of planeSize and computedLayout ’s destinationOffset .
-
If planeSize or planeEnd is greater than maximum range of
unsigned long
, return aTypeError
. -
Append planeEnd to endOffsets .
-
Assign the maximum of minAllocationSize and planeEnd to minAllocationSize .
NOTE: The above step uses a maximum to allow for the possibility that user specified plane offsets reorder planes.
-
Let earlierPlaneIndex be
0
. -
While earlierPlaneIndex is less than planeIndex .
-
Let earlierLayout be
computedLayouts[earlierPlaneIndex]
. -
If
endOffsets[planeIndex]
is less than or equal to earlierLayout ’s destinationOffset or ifendOffsets[earlierPlaneIndex]
is less than or equal to computedLayout ’s destinationOffset , continue.
NOTE: If plane A ends before plane B starts, they do not overlap.
-
Otherwise, return a
TypeError
. -
Increment earlierPlaneIndex by
1
.
-
-
Append computedLayout to computedLayouts .
-
Increment planeIndex by
1
.
-
-
Let combinedLayout be a new combined buffer layout , initialized as follows:
-
Assign computedLayouts to computedLayouts .
-
Assign minAllocationSize to allocationSize .
-
-
Return combinedLayout .
-
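The Compute Layout and Allocation Size steps can be sketched as below. This is an illustration under stated assumptions: the caller supplies the per-plane sample size and sub-sampling factors (`PlaneInfo` is a hypothetical type), errors are thrown as `TypeError`, and each earlier plane's end offset is recomputed from its stride and height instead of being kept in a separate endOffsets list.

```typescript
interface PlaneInfo { sampleBytes: number; sampleWidth: number; sampleHeight: number }
interface LayoutEntry { offset: number; stride: number }
interface SourceRect { x: number; y: number; width: number; height: number }
interface ComputedPlaneLayout {
  destinationOffset: number;
  destinationStride: number;
  sourceTop: number;
  sourceHeight: number;
  sourceLeftBytes: number;
  sourceWidthBytes: number;
}

const MAX_UNSIGNED_LONG = 0xffffffff;

function computeLayoutAndAllocationSize(
  rect: SourceRect,
  planes: PlaneInfo[],
  layout?: LayoutEntry[],
): { computedLayouts: ComputedPlaneLayout[]; allocationSize: number } {
  if (layout !== undefined && layout.length !== planes.length) {
    throw new TypeError("layout must describe every plane");
  }
  let minAllocationSize = 0;
  const computedLayouts: ComputedPlaneLayout[] = [];
  let planeIndex = 0;
  for (const plane of planes) {
    const sourceTop = Math.ceil(Math.trunc(rect.y) / plane.sampleHeight);
    const sourceHeight = Math.ceil(Math.trunc(rect.height) / plane.sampleHeight);
    const sourceLeftBytes =
      Math.floor(Math.trunc(rect.x) / plane.sampleWidth) * plane.sampleBytes;
    const sourceWidthBytes =
      Math.floor(Math.trunc(rect.width) / plane.sampleWidth) * plane.sampleBytes;

    // Default to tight packing; an explicit layout overrides offset and stride.
    let destinationOffset = minAllocationSize;
    let destinationStride = sourceWidthBytes;
    const planeLayout = layout === undefined ? undefined : layout[planeIndex];
    if (planeLayout !== undefined) {
      if (planeLayout.stride < sourceWidthBytes) throw new TypeError("stride is too small");
      destinationOffset = planeLayout.offset;
      destinationStride = planeLayout.stride;
    }

    const planeSize = destinationStride * sourceHeight;
    const planeEnd = planeSize + destinationOffset;
    if (planeSize > MAX_UNSIGNED_LONG || planeEnd > MAX_UNSIGNED_LONG) {
      throw new TypeError("plane does not fit in an unsigned long");
    }
    // A maximum is used because explicit offsets may reorder planes.
    minAllocationSize = Math.max(minAllocationSize, planeEnd);

    // Two planes are disjoint only if one ends before the other starts.
    for (const earlier of computedLayouts) {
      const earlierEnd =
        earlier.destinationOffset + earlier.destinationStride * earlier.sourceHeight;
      if (!(planeEnd <= earlier.destinationOffset || earlierEnd <= destinationOffset)) {
        throw new TypeError("planes overlap");
      }
    }
    computedLayouts.push({
      destinationOffset, destinationStride,
      sourceTop, sourceHeight, sourceLeftBytes, sourceWidthBytes,
    });
    planeIndex += 1;
  }
  return { computedLayouts, allocationSize: minAllocationSize };
}
```

With no explicit layout, each plane starts where the previous one ended, so the allocation size is simply the sum of the plane sizes.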
- Convert PredefinedColorSpace to VideoColorSpace (with colorSpace )
-
-
Assert: colorSpace is equal to one of
srgb
ordisplay-p3
. -
If colorSpace is equal to
srgb
return a new instance of the sRGB Color Space -
If colorSpace is equal to
display-p3
return a new instance of the Display P3 Color Space
-
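The conversion above amounts to a two-entry lookup. The sketch below uses a plain object in place of a `VideoColorSpace` instance. The sRGB primaries, transfer, and matrix values come from the sRGB Color Space definition; the Display P3 values and the `fullRange: true` entries are assumptions based on how those color spaces are defined elsewhere in the specification.

```typescript
// Plain-object stand-in for the four VideoColorSpace internal slots.
interface ColorSpaceSlots {
  primaries: string;
  transfer: string;
  matrix: string;
  fullRange: boolean;
}

// Hypothetical helper mirroring Convert PredefinedColorSpace to VideoColorSpace.
function convertPredefinedColorSpace(colorSpace: "srgb" | "display-p3"): ColorSpaceSlots {
  if (colorSpace === "srgb") {
    return { primaries: "bt709", transfer: "iec61966-2-1", matrix: "rgb", fullRange: true };
  }
  // Assumed Display P3 definition: P3 D65 primaries with the sRGB transfer function.
  return { primaries: "smpte432", transfer: "iec61966-2-1", matrix: "rgb", fullRange: true };
}
```

The two color spaces differ only in their primaries, which is why an RGB conversion between them changes gamut but not the transfer curve.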
- Convert to RGB frame (with frame , format and colorSpace )
-
-
This algorithm MUST be called only if format is equal to one of
RGBA
,RGBX
,BGRA
,BGRX
. -
Let convertedFrame be a new
VideoFrame
, constructed as follows:-
Assign
false
to[[Detached]]
. -
Assign format to
[[format]]
. -
Let width be frame ’s
[[visible width]]
. -
Let height be frame ’s
[[visible height]]
. -
Assign width , height , 0, 0, width , height , width , and height to
[[coded width]]
,[[coded height]]
,[[visible left]]
,[[visible top]]
,[[visible width]]
,[[visible height]]
,[[display width]]
, and[[display height]]
respectively. -
Assign frame ’s
[[duration]]
and frame ’s[[timestamp]]
to[[duration]]
and[[timestamp]]
respectively. -
Assign the result of running the Convert PredefinedColorSpace to VideoColorSpace algorithm with colorSpace to
[[color space]]
. -
Let resource be a new media resource containing the result of conversion of media resource referenced by frame ’s
[[resource reference]]
into a color space and pixel format specified by[[color space]]
and[[format]]
respectively. -
Assign the reference to resource to
[[resource reference]]
-
-
Return convertedFrame .
-
- Copy VideoFrame metadata (with metadata )
-
-
Let metadataCopySerialized be StructuredSerialize ( metadata ).
-
Let metadataCopy be StructuredDeserialize ( metadataCopySerialized , the current Realm ).
-
Return metadataCopy .
-
The goal of this algorithm is to ensure that metadata owned by a VideoFrame is immutable.
9.4.7. Transfer and Serialization
-
The
VideoFrame
transfer steps (with value and dataHolder ) are: -
-
If value ’s
[[Detached]]
istrue
, throw aDataCloneError
DOMException
. -
For all
VideoFrame
internal slots in value , assign the value of each internal slot to a field in dataHolder with the same name as the internal slot. -
Run the Close VideoFrame algorithm with value .
-
-
The
VideoFrame
transfer-receiving steps (with dataHolder and value ) are: -
-
For all named fields in dataHolder , assign the value of each named field to the
VideoFrame
internal slot in value with the same name as the named field.
-
-
The
VideoFrame
serialization steps (with value , serialized , and forStorage ) are: -
-
If value ’s
[[Detached]]
istrue
, throw aDataCloneError
DOMException
. -
If forStorage is
true
, throw aDataCloneError
. -
Let resource be the media resource referenced by value ’s
[[resource reference]]
. -
Let newReference be a new reference to resource .
-
Assign newReference to the resource reference field of serialized .
-
For all remaining
VideoFrame
internal slots (excluding[[resource reference]]
) in value , assign the value of each internal slot to a field in serialized with the same name as the internal slot.
-
-
The
VideoFrame
deserialization steps (with serialized and value ) are: -
-
For all named fields in serialized , assign the value of each named field to the
VideoFrame
internal slot in value with the same name as the named field.
-
9.4.8. Rendering
When rendered, for example by CanvasDrawImage drawImage(), a VideoFrame MUST be converted to a color space compatible with the rendering target, unless color conversion is explicitly disabled.
Color space conversion during ImageBitmap construction is controlled by ImageBitmapOptions colorSpaceConversion. Setting this value to "none" disables color space conversion.
The rendering of a VideoFrame is produced from the media resource by applying any necessary color space conversion, cropping to the visibleRect, rotating clockwise by rotation degrees, and flipping horizontally if flip is true.
9.5. VideoFrame CopyTo() Options
Options to specify a rectangle of pixels to copy, their format, and the offset and stride of planes in the destination buffer.
dictionary VideoFrameCopyToOptions {
  DOMRectInit rect;
  sequence<PlaneLayout> layout;
  VideoPixelFormat format;
  PredefinedColorSpace colorSpace;
};
copyTo() or allocationSize() will enforce the following requirements:
-
The coordinates of
rect
are sample-aligned as determined by[[format]]
. -
If
layout
exists , aPlaneLayout
is provided for all planes.
-
rect
, of type DOMRectInit -
A DOMRectInit describing the rectangle of pixels to copy from the VideoFrame. If unspecified, the visibleRect will be used.
NOTE: The coded rectangle can be specified by passing VideoFrame’s codedRect.
NOTE: The default rect does not necessarily meet the sample-alignment requirement and can result in copyTo() or allocationSize() rejecting. -
layout
, of type sequence< PlaneLayout > -
The
PlaneLayout
for each plane inVideoFrame
, affording the option to specify an offset and stride for each plane in the destinationBufferSource
. If unspecified, the planes will be tightly packed. It is invalid to specify planes that overlap. -
format
, of type VideoPixelFormat -
A VideoPixelFormat for the pixel data in the destination BufferSource. Potential values are: RGBA, RGBX, BGRA, BGRX. If it does not exist, the destination BufferSource will be in the same format as the VideoFrame’s format. -
colorSpace
, of type PredefinedColorSpace -
A
PredefinedColorSpace
that MUST be used as a target color space for the pixel data in the destinationBufferSource
, but only ifformat
is one ofRGBA
,RGBX
,BGRA
,BGRX
, otherwise it is ignored. If it does not exist ,srgb
is used.
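The defaulting rules for format and colorSpace can be sketched as a pure function. The names `CopyTargetOptionsSketch` and `resolveCopyTarget` are hypothetical, and a plain `Error` named `NotSupportedError` stands in for the `DOMException` a User Agent would produce.

```typescript
interface CopyTargetOptionsSketch {
  format?: string;
  colorSpace?: "srgb" | "display-p3";
}

const RGB_COPY_FORMATS = ["RGBA", "RGBX", "BGRA", "BGRX"];

// Resolves the destination format and color space for a copy.
// A null colorSpace means the option is ignored (no RGB conversion requested).
function resolveCopyTarget(
  frameFormat: string,
  options: CopyTargetOptionsSketch = {},
): { format: string; colorSpace: string | null } {
  if (options.format === undefined) {
    // No conversion: the destination keeps the frame's own format.
    return { format: frameFormat, colorSpace: null };
  }
  if (RGB_COPY_FORMATS.indexOf(options.format) === -1) {
    const error = new Error("unsupported copy format: " + options.format);
    error.name = "NotSupportedError";
    throw error;
  }
  // colorSpace only matters for RGB destinations and defaults to srgb.
  return { format: options.format, colorSpace: options.colorSpace ?? "srgb" };
}
```

Passing colorSpace without format has no effect in this sketch, which matches the rule that the color space is ignored unless an RGB destination format is requested.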
9.6. DOMRects in VideoFrame
The VideoFrame interface uses DOMRects to specify the position and dimensions for a rectangle of pixels. DOMRectInit is used with copyTo() and allocationSize() to describe the dimensions of the source rectangle. VideoFrame defines codedRect and visibleRect for convenient copying of the coded size and visible region respectively.
NOTE: VideoFrame pixels are only addressable by integer numbers. All floating point values provided to DOMRectInit will be truncated.
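As an illustration of the truncation described in the note, the sketch below drops the fractional part of each rectangle member. The names are hypothetical, and treating missing members as 0 is an assumption based on the usual `DOMRectInit` defaults.

```typescript
interface RectInitSketch {
  x?: number;
  y?: number;
  width?: number;
  height?: number;
}

// Truncates every member toward zero; missing members are assumed to be 0.
function truncateRect(rect: RectInitSketch): { x: number; y: number; width: number; height: number } {
  return {
    x: Math.trunc(rect.x ?? 0),
    y: Math.trunc(rect.y ?? 0),
    width: Math.trunc(rect.width ?? 0),
    height: Math.trunc(rect.height ?? 0),
  };
}
```

Truncation can shrink a rect, so a caller that needs to cover a fractional region would have to round outward before building the rect.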
9.7. Plane Layout
A PlaneLayout is a dictionary specifying the offset and stride of a VideoFrame plane once copied to a BufferSource. A sequence of PlaneLayouts MAY be provided to VideoFrame’s copyTo() to specify how the plane is laid out in the destination BufferSource. Alternatively, callers can inspect copyTo()’s returned sequence of PlaneLayouts to learn the offset and stride for planes as decided by the User Agent.
dictionary PlaneLayout {
  [EnforceRange] required unsigned long offset;
  [EnforceRange] required unsigned long stride;
};
-
offset
, of type unsigned long -
The offset in bytes where the given plane begins within a
BufferSource
. -
stride
, of type unsigned long -
The number of bytes, including padding, used by each row of the plane within a
BufferSource
.
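One reason to supply an explicit layout is to pad each row to an alignment boundary. The sketch below builds such a layout for an 8-bit I420 frame; `paddedI420Layout` and `rowAlignment` are hypothetical names, and the plane dimensions follow the I420 description (chroma planes are half size, rounded up).

```typescript
interface PlaneLayoutSketch {
  offset: number; // byte offset where the plane begins
  stride: number; // bytes per row, including padding
}

// Builds non-overlapping Y, U, V layouts whose strides are multiples of rowAlignment.
function paddedI420Layout(
  codedWidth: number,
  codedHeight: number,
  rowAlignment: number,
): PlaneLayoutSketch[] {
  const alignUp = (n: number): number => Math.ceil(n / rowAlignment) * rowAlignment;
  const chromaWidth = Math.ceil(codedWidth / 2);
  const chromaHeight = Math.ceil(codedHeight / 2);
  const lumaStride = alignUp(codedWidth);
  const chromaStride = alignUp(chromaWidth);
  // Each plane starts where the previous padded plane ends.
  const uOffset = lumaStride * codedHeight;
  const vOffset = uOffset + chromaStride * chromaHeight;
  return [
    { offset: 0, stride: lumaStride },
    { offset: uOffset, stride: chromaStride },
    { offset: vOffset, stride: chromaStride },
  ];
}
```

A sequence built this way could be passed as the layout option when copying the full coded rectangle, with a destination buffer of at least the last plane's offset plus its stride times the chroma height.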
9.8. Pixel Format
Pixel formats describe the arrangement of bytes in each plane as well as the number and order of the planes. Each format is described in its own sub-section.
enum VideoPixelFormat {
  // 4:2:0 Y, U, V
  "I420",
  "I420P10",
  "I420P12",
  // 4:2:0 Y, U, V, A
  "I420A",
  "I420AP10",
  "I420AP12",
  // 4:2:2 Y, U, V
  "I422",
  "I422P10",
  "I422P12",
  // 4:2:2 Y, U, V, A
  "I422A",
  "I422AP10",
  "I422AP12",
  // 4:4:4 Y, U, V
  "I444",
  "I444P10",
  "I444P12",
  // 4:4:4 Y, U, V, A
  "I444A",
  "I444AP10",
  "I444AP12",
  // 4:2:0 Y, UV
  "NV12",
  // 4:4:4 RGBA
  "RGBA",
  // 4:4:4 RGBX (opaque)
  "RGBX",
  // 4:4:4 BGRA
  "BGRA",
  // 4:4:4 BGRX (opaque)
  "BGRX",
};
Sub-sampling is a technique where a single sample contains information for multiple pixels in the final image. Sub-sampling can be horizontal, vertical or both, and has a factor , that is the number of final pixels in the image that are derived from a sub-sampled sample.
If a VideoFrame is in I420 format, then the very first component of the second plane (the U plane) corresponds to four pixels, namely the pixels in the top-left corner of the image. Consequently, the first component of the second row corresponds to the four pixels below those initial four top-left pixels. The sub-sampling factor is 2 in both the horizontal and vertical direction.
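The pixel-to-sample relationship in this example can be expressed as an index calculation. The sketch below assumes a tightly packed chroma plane whose row length is the coded width divided by the horizontal factor, rounded up; `chromaSampleIndex` is a hypothetical name.

```typescript
// Returns the index, within a tightly packed sub-sampled plane, of the sample
// that covers the image pixel at (x, y).
function chromaSampleIndex(
  x: number,
  y: number,
  codedWidth: number,
  factorX: number = 2,
  factorY: number = 2,
): number {
  const samplesPerRow = Math.ceil(codedWidth / factorX);
  return Math.floor(y / factorY) * samplesPerRow + Math.floor(x / factorX);
}
```

For a 16-pixel-wide I420 image, the four top-left pixels all map to sample 0, and the four pixels directly below them map to sample 8, the first sample of the second chroma row.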
If a VideoPixelFormat has an alpha component, the format’s equivalent opaque format is the same VideoPixelFormat, without an alpha component. If a VideoPixelFormat does not have an alpha component, it is its own equivalent opaque format.
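The equivalent opaque format rule can be sketched as a name transformation. This relies on the naming pattern of the formats listed in the enum (an `A` after the sub-sampling digits marks the alpha plane) and is an illustration only; `equivalentOpaqueFormat` is a hypothetical name.

```typescript
// Maps a pixel format name to its equivalent opaque format.
function equivalentOpaqueFormat(format: string): string {
  if (format === "RGBA") return "RGBX";
  if (format === "BGRA") return "BGRX";
  // Planar formats with alpha: I420A, I422AP10, I444AP12, and so on.
  const match = /^(I4(?:20|22|44))A(P1[02])?$/.exec(format);
  if (match !== null) {
    // Drop the "A" and keep the optional bit-depth suffix.
    return match[1] + (match[2] ?? "");
  }
  // Formats without an alpha component are their own opaque equivalent.
  return format;
}
```

The bit depth and sub-sampling are preserved, so only the alpha plane or alpha channel is removed.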
Integer values are unsigned unless otherwise specified.
-
I420
-
This format is composed of three distinct planes, one plane of Luma and two planes of Chroma, denoted Y, U and V, and present in this order. It is also often referred to as Planar YUV 4:2:0.
The U and V planes are sub-sampled horizontally and vertically by a factor of 2 compared to the Y plane.
Each sample in this format is 8 bits.
There are codedWidth * codedHeight samples (and therefore bytes) in the Y plane, arranged starting at the top left of the image, in codedHeight rows of codedWidth samples.
The U and V planes have a number of rows equal to the result of the division of codedHeight by 2, rounded up to the nearest integer. Each row has a number of samples equal to the result of the division of codedWidth by 2, rounded up to the nearest integer. Samples are arranged starting at the top left of the image.
The visible rectangle offset (visibleRect.x and visibleRect.y) MUST be even. -
I420P10
-
This format is composed of three distinct planes, one plane of Luma and two planes of Chroma, denoted Y, U and V, and present in this order.
The U and V planes are sub-sampled horizontally and vertically by a factor of 2 compared to the Y plane.
Each sample in this format is 10 bits, encoded as a 16-bit integer in little-endian byte order.
There are codedWidth * codedHeight samples in the Y plane, arranged starting at the top left of the image, in codedHeight rows of codedWidth samples.
The U and V planes have a number of rows equal to the result of the division of codedHeight by 2, rounded up to the nearest integer. Each row has a number of samples equal to the result of the division of codedWidth by 2, rounded up to the nearest integer. Samples are arranged starting at the top left of the image.
The visible rectangle offset (visibleRect.x and visibleRect.y) MUST be even. -
I420P12
-
This format is composed of three distinct planes, one plane of Luma and two planes of Chroma, denoted Y, U and V, and present in this order.
The U and V planes are sub-sampled horizontally and vertically by a factor of 2 compared to the Y plane.
Each sample in this format is 12 bits, encoded as a 16-bit integer in little-endian byte order.
There are codedWidth * codedHeight samples in the Y plane, arranged starting at the top left of the image, in codedHeight rows of codedWidth samples.
The U and V planes have a number of rows equal to the result of the division of codedHeight by 2, rounded up to the nearest integer. Each row has a number of samples equal to the result of the division of codedWidth by 2, rounded up to the nearest integer. Samples are arranged starting at the top left of the image.
The visible rectangle offset (visibleRect.x and visibleRect.y) MUST be even. -
I420A
-
This format is composed of four distinct planes, one plane of Luma, two planes of Chroma, denoted Y, U and V, and one plane of Alpha values, all present in this order. It is also often referred to as Planar YUV 4:2:0 with an alpha channel.
The U and V planes are sub-sampled horizontally and vertically by a factor of 2 compared to the Y and Alpha planes.
Each sample in this format is 8 bits.
There are codedWidth * codedHeight samples (and therefore bytes) in the Y and Alpha planes, arranged starting at the top left of the image, in codedHeight rows of codedWidth samples.
The U and V planes have a number of rows equal to the result of the division of codedHeight by 2, rounded up to the nearest integer. Each row has a number of samples equal to the result of the division of codedWidth by 2, rounded up to the nearest integer. Samples are arranged starting at the top left of the image.
The visible rectangle offset (visibleRect.x and visibleRect.y) MUST be even.
I420A’s equivalent opaque format is I420. -
I420AP10
-
This format is composed of four distinct planes, one plane of Luma, two planes of Chroma, denoted Y, U and V, and one plane of Alpha values, all present in this order.
The U and V planes are sub-sampled horizontally and vertically by a factor of 2 compared to the Y and Alpha planes.
Each sample in this format is 10 bits, encoded as a 16-bit integer in little-endian byte order.
There are codedWidth * codedHeight samples in the Y and Alpha planes, arranged starting at the top left of the image, in codedHeight rows of codedWidth samples.
The U and V planes have a number of rows equal to the result of the division of codedHeight by 2, rounded up to the nearest integer. Each row has a number of samples equal to the result of the division of codedWidth by 2, rounded up to the nearest integer. Samples are arranged starting at the top left of the image.
The visible rectangle offset (visibleRect.x and visibleRect.y) MUST be even.
I420AP10’s equivalent opaque format is I420P10. -
I420AP12
-
This format is composed of four distinct planes, one plane of Luma, two planes of Chroma, denoted Y, U and V, and one plane of Alpha values, all present in this order.
The U and V planes are sub-sampled horizontally and vertically by a factor of 2 compared to the Y and Alpha planes.
Each sample in this format is 12 bits, encoded as a 16-bit integer in little-endian byte order.
There are codedWidth * codedHeight samples in the Y and Alpha planes, arranged starting at the top left of the image, in codedHeight rows of codedWidth samples.
The U and V planes have a number of rows equal to the result of the division of codedHeight by 2, rounded up to the nearest integer. Each row has a number of samples equal to the result of the division of codedWidth by 2, rounded up to the nearest integer. Samples are arranged starting at the top left of the image.
The visible rectangle offset (visibleRect.x and visibleRect.y) MUST be even.
I420AP12’s equivalent opaque format is I420P12. -
I422
-
This format is composed of three distinct planes, one plane of Luma and two planes of Chroma, denoted Y, U and V, and present in this order. It is also often referred to as Planar YUV 4:2:2.
The U and V planes are sub-sampled horizontally by a factor of 2 compared to the Y plane, and not sub-sampled vertically.
Each sample in this format is 8 bits.
There are codedWidth * codedHeight samples (and therefore bytes) in the Y plane, arranged starting at the top left of the image, in codedHeight rows of codedWidth samples.
The U and V planes have codedHeight rows. Each row has a number of samples equal to the result of the division of codedWidth by 2, rounded up to the nearest integer. Samples are arranged starting at the top left of the image.
The visible rectangle horizontal offset (visibleRect.x) MUST be even. -
I422P10
-
This format is composed of three distinct planes, one plane of Luma and two planes of Chroma, denoted Y, U and V, and present in this order.
The U and V planes are sub-sampled horizontally by a factor of 2 compared to the Y plane, and not sub-sampled vertically.
Each sample in this format is 10 bits, encoded as a 16-bit integer in little-endian byte order.
There are codedWidth * codedHeight samples in the Y plane, arranged starting at the top left of the image, in codedHeight rows of codedWidth samples.
The U and V planes have codedHeight rows. Each row has a number of samples equal to the result of the division of codedWidth by 2, rounded up to the nearest integer. Samples are arranged starting at the top left of the image.
The visible rectangle horizontal offset (visibleRect.x) MUST be even. -
I422P12
-
This format is composed of three distinct planes, one plane of Luma and two planes of Chroma, denoted Y, U and V, and present in this order.
The U and V planes are sub-sampled horizontally by a factor of 2 compared to the Y plane, and not sub-sampled vertically.
Each sample in this format is 12 bits, encoded as a 16-bit integer in little-endian byte order.
There are codedWidth * codedHeight samples in the Y plane, arranged starting at the top left of the image, in codedHeight rows of codedWidth samples.
The U and V planes have codedHeight rows. Each row has a number of samples equal to the result of the division of codedWidth by 2, rounded up to the nearest integer. Samples are arranged starting at the top left of the image.
The visible rectangle horizontal offset (visibleRect.x) MUST be even. -
I422A
-
This format is composed of four distinct planes, one plane of Luma, two planes of Chroma, denoted Y, U and V, and one plane of Alpha values, all present in this order. It is also often referred to as Planar YUV 4:2:2 with an alpha channel.
The U and V planes are sub-sampled horizontally by a factor of 2 compared to the Y and Alpha planes, and not sub-sampled vertically.
Each sample in this format is 8 bits.
There are codedWidth * codedHeight samples (and therefore bytes) in the Y and Alpha planes, arranged starting at the top left of the image, in codedHeight rows of codedWidth samples.
The U and V planes have codedHeight rows. Each row has a number of samples equal to the result of the division of codedWidth by 2, rounded up to the nearest integer. Samples are arranged starting at the top left of the image.
The visible rectangle horizontal offset (visibleRect.x) MUST be even.
I422A’s equivalent opaque format is I422. -
I422AP10
-
This format is composed of four distinct planes, one plane of Luma, two planes of Chroma, denoted Y, U and V, and one plane of Alpha values, all present in this order.
The U and V planes are sub-sampled horizontally by a factor of 2 compared to the Y and Alpha planes, and not sub-sampled vertically.
Each sample in this format is 10 bits, encoded as a 16-bit integer in little-endian byte order.
There are codedWidth * codedHeight samples in the Y and Alpha planes, arranged starting at the top left of the image, in codedHeight rows of codedWidth samples.
The U and V planes have codedHeight rows. Each row has a number of samples equal to the result of the division of codedWidth by 2, rounded up to the nearest integer. Samples are arranged starting at the top left of the image.
The visible rectangle horizontal offset (visibleRect.x) MUST be even.
I422AP10’s equivalent opaque format is I422P10. -
I422AP12
-
This format is composed of four distinct planes, one plane of Luma, two planes of Chroma, denoted Y, U and V, and one plane of Alpha values, all present in this order.
The U and V planes are sub-sampled horizontally by a factor of 2 compared to the Y and Alpha planes, and not sub-sampled vertically.
Each sample in this format is 12 bits, encoded as a 16-bit integer in little-endian byte order.
There are codedWidth * codedHeight samples in the Y and Alpha planes, arranged starting at the top left of the image, in codedHeight rows of codedWidth samples.
The U and V planes have codedHeight rows. Each row has a number of samples equal to the result of the division of codedWidth by 2, rounded up to the nearest integer. Samples are arranged starting at the top left of the image.
The visible rectangle horizontal offset (visibleRect.x) MUST be even.
I422AP12’s equivalent opaque format is I422P12. -
I444
-
This format is composed of three distinct planes, one plane of Luma and two planes of Chroma, denoted Y, U and V, and present in this order. It is also often referred to as Planar YUV 4:4:4.
This format does not use sub-sampling.
Each sample in this format is 8 bits.
There are codedWidth * codedHeight samples (and therefore bytes) in all three planes, arranged starting at the top left of the image, in codedHeight rows of codedWidth samples. -
I444P10
-
This format is composed of three distinct planes, one plane of Luma and two planes of Chroma, denoted Y, U and V, and present in this order.
This format does not use sub-sampling.
Each sample in this format is 10 bits, encoded as a 16-bit integer in little-endian byte order.
There are codedWidth * codedHeight samples in all three planes, arranged starting at the top left of the image, in codedHeight rows of codedWidth samples. -
I444P12
-
This format is composed of three distinct planes, one plane of Luma and two planes of Chroma, denoted Y, U and V, and present in this order.
This format does not use sub-sampling.
Each sample in this format is 12 bits, encoded as a 16-bit integer in little-endian byte order.
There are codedWidth * codedHeight samples in all three planes, arranged starting at the top left of the image, in codedHeight rows of codedWidth samples. -
I444A
-
This format is composed of four distinct planes, one plane of Luma, two planes of Chroma, denoted Y, U and V, and one plane of Alpha values, all present in this order.
This format does not use sub-sampling.
Each sample in this format is 8 bits.
There are codedWidth * codedHeight samples (and therefore bytes) in all four planes, arranged starting at the top left of the image, in codedHeight rows of codedWidth samples.
I444A’s equivalent opaque format is I444. -
I444AP10
-
This format is composed of four distinct planes, one plane of Luma, two planes of Chroma, denoted Y, U and V, and one plane of Alpha values, all present in this order.
This format does not use sub-sampling.
Each sample in this format is 10 bits, encoded as a 16-bit integer in little-endian byte order.
There are codedWidth * codedHeight samples in all four planes, arranged starting at the top left of the image, in codedHeight rows of codedWidth samples.
I444AP10’s equivalent opaque format is I444P10. -
I444AP12
-
This format is composed of four distinct planes, one plane of Luma, two planes of Chroma, denoted Y, U and V, and one plane of Alpha values, all present in this order.
This format does not use sub-sampling.
Each sample in this format is 12 bits, encoded as a 16-bit integer in little-endian byte order.
There are codedWidth * codedHeight samples in all four planes, arranged starting at the top left of the image, in codedHeight rows of codedWidth samples.
I444AP12’s equivalent opaque format is I444P12. -
NV12
-
This format is composed of two distinct planes, one plane of Luma and then another plane for the two Chroma components. The two planes are present in this order, and are referred to as the Y plane and the UV plane, respectively.
The U and V components are sub-sampled horizontally and vertically by a factor of 2 compared to the components in the Y plane.
Each sample in this format is 8 bits.
There are codedWidth * codedHeight samples (and therefore bytes) in the Y plane, arranged starting at the top left of the image, in codedHeight rows of codedWidth samples.
The UV plane is composed of interleaved U and V values, in a number of rows equal to the result of the division of codedHeight by 2, rounded up to the nearest integer. Each row has a number of elements equal to the result of the division of codedWidth by 2, rounded up to the nearest integer. Each element is composed of two Chroma samples, the U and V samples, in that order. Samples are arranged starting at the top left of the image.
The visible rectangle offset (visibleRect.x and visibleRect.y) MUST be even.
An image in the NV12 pixel format that is 16 pixels wide and 10 pixels tall will be arranged like so in memory:
YYYYYYYYYYYYYYYY
YYYYYYYYYYYYYYYY
YYYYYYYYYYYYYYYY
YYYYYYYYYYYYYYYY
YYYYYYYYYYYYYYYY
YYYYYYYYYYYYYYYY
YYYYYYYYYYYYYYYY
YYYYYYYYYYYYYYYY
YYYYYYYYYYYYYYYY
YYYYYYYYYYYYYYYY
UVUVUVUVUVUVUVUV
UVUVUVUVUVUVUVUV
UVUVUVUVUVUVUVUV
UVUVUVUVUVUVUVUV
UVUVUVUVUVUVUVUV
All samples are laid out linearly in memory.
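The NV12 arrangement can be turned into byte offsets for a given pixel. The sketch below assumes both planes are tightly packed back to back in one buffer, as in the diagram above; `nv12Offsets` is a hypothetical helper.

```typescript
// Byte offsets of the Y, U and V samples that describe the pixel at (x, y),
// assuming a tightly packed NV12 buffer with the UV plane right after the Y plane.
function nv12Offsets(
  x: number,
  y: number,
  codedWidth: number,
  codedHeight: number,
): { y: number; u: number; v: number } {
  const lumaOffset = y * codedWidth + x;
  const uvPlaneStart = codedWidth * codedHeight;
  // Each UV row holds ceil(codedWidth / 2) elements of two bytes (U then V).
  const uvRowBytes = Math.ceil(codedWidth / 2) * 2;
  const uOffset = uvPlaneStart + Math.floor(y / 2) * uvRowBytes + Math.floor(x / 2) * 2;
  return { y: lumaOffset, u: uOffset, v: uOffset + 1 };
}
```

In the 16 by 10 example, the UV plane starts at byte 160 and each of its five rows is 16 bytes long.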
-
RGBA
-
This format is composed of a single plane, that encodes four components: Red, Green, Blue, and an alpha value, present in this order.
Each sample in this format is 8 bits, and each pixel is therefore 32 bits.
There are codedWidth * codedHeight * 4 samples (and therefore bytes) in the single plane, arranged starting at the top left of the image, in codedHeight rows of codedWidth pixels (four samples each).
RGBA’s equivalent opaque format is RGBX. -
RGBX
-
This format is composed of a single plane, that encodes four components: Red, Green, Blue, and a padding value, present in this order.
Each sample in this format is 8 bits. The fourth element in each pixel is to be ignored; the image is always fully opaque.
There are codedWidth * codedHeight * 4 samples (and therefore bytes) in the single plane, arranged starting at the top left of the image, in codedHeight rows of codedWidth pixels (four samples each). -
BGRA
-
This format is composed of a single plane, that encodes four components: Blue, Green, Red, and an alpha value, present in this order.
Each sample in this format is 8 bits.
There are codedWidth * codedHeight * 4 samples (and therefore bytes) in the single plane, arranged starting at the top left of the image, in codedHeight rows of codedWidth pixels (four samples each).
BGRA’s equivalent opaque format is BGRX. -
BGRX
-
This format is composed of a single plane, that encodes four components: Blue, Green, Red, and a padding value, present in this order.
Each sample in this format is 8 bits. The fourth element in each pixel is to be ignored; the image is always fully opaque.
There are codedWidth * codedHeight * 4 samples (and therefore bytes) in the single plane, arranged starting at the top left of the image, in codedHeight rows of codedWidth pixels (four samples each).
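The per-format descriptions above can be summarized as a plane size calculation. The sketch below returns the tightly packed size in bytes of each plane, parsing the format name to find the sub-sampling, alpha, and bit-depth variants. `planeByteSizes` is a hypothetical helper, and it assumes 10-bit and 12-bit samples occupy two bytes each, as stated for the P10 and P12 formats.

```typescript
// Returns the tightly packed byte size of each plane, in plane order.
function planeByteSizes(format: string, codedWidth: number, codedHeight: number): number[] {
  if (["RGBA", "RGBX", "BGRA", "BGRX"].indexOf(format) !== -1) {
    // Single plane, four 8-bit samples per pixel.
    return [codedWidth * codedHeight * 4];
  }
  const halfWidth = Math.ceil(codedWidth / 2);
  const halfHeight = Math.ceil(codedHeight / 2);
  if (format === "NV12") {
    // Y plane, then interleaved U and V (two bytes per element).
    return [codedWidth * codedHeight, halfWidth * halfHeight * 2];
  }
  const match = /^I4(20|22|44)(A?)(P1[02])?$/.exec(format);
  if (match === null) throw new Error("unknown pixel format: " + format);
  const bytesPerSample = match[3] === undefined ? 1 : 2;
  // 4:4:4 keeps full chroma width; only 4:2:0 halves the chroma height.
  const chromaWidth = match[1] === "444" ? codedWidth : halfWidth;
  const chromaHeight = match[1] === "420" ? halfHeight : codedHeight;
  const lumaBytes = codedWidth * codedHeight * bytesPerSample;
  const chromaBytes = chromaWidth * chromaHeight * bytesPerSample;
  const sizes = [lumaBytes, chromaBytes, chromaBytes];
  // The alpha plane has the same dimensions as the Y plane.
  if (match[2] === "A") sizes.push(lumaBytes);
  return sizes;
}
```

Summing the returned sizes gives the minimum buffer size for a tightly packed copy of the full coded rectangle in that format.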
9.9. Video Color Space Interface
[Exposed=(Window,DedicatedWorker)]
interface VideoColorSpace {
  constructor(optional VideoColorSpaceInit init = {});

  readonly attribute VideoColorPrimaries? primaries;
  readonly attribute VideoTransferCharacteristics? transfer;
  readonly attribute VideoMatrixCoefficients? matrix;
  readonly attribute boolean? fullRange;

  [Default] VideoColorSpaceInit toJSON();
};

dictionary VideoColorSpaceInit {
  VideoColorPrimaries? primaries = null;
  VideoTransferCharacteristics? transfer = null;
  VideoMatrixCoefficients? matrix = null;
  boolean? fullRange = null;
};
9.9.1. Internal Slots
-
[[primaries]]
-
The color primaries.
-
[[transfer]]
-
The transfer characteristics.
-
[[matrix]]
-
The matrix coefficients.
-
[[full range]]
-
Indicates whether full-range color values are used.
9.9.2. Constructors
VideoColorSpace(init)
-
Let c be a new
VideoColorSpace
object, initialized as follows:-
Assign
init.primaries
to[[primaries]]
. -
Assign
init.transfer
to[[transfer]]
. -
Assign
init.matrix
to[[matrix]]
. -
Assign
init.fullRange
to[[full range]]
.
-
-
Return c .
9.9.3. Attributes
-
primaries
, of type VideoColorPrimaries , readonly, nullable -
The
primaries
getter steps are to return the value of[[primaries]]
. -
transfer
, of type VideoTransferCharacteristics , readonly, nullable -
The
transfer
getter steps are to return the value of[[transfer]]
. -
matrix
, of type VideoMatrixCoefficients , readonly, nullable -
The
matrix
getter steps are to return the value of[[matrix]]
. -
fullRange
, of type boolean , readonly, nullable -
The
fullRange
getter steps are to return the value of[[full range]]
.
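The constructor and getter steps can be modeled with a small class. This is a stand-in for illustration, not the real `VideoColorSpace`: the name `ColorSpaceSketch` is hypothetical, enum values are typed as plain strings, and `toJSON` is assumed to return the four attributes as a dictionary.

```typescript
interface ColorSpaceInitSketch {
  primaries?: string | null;
  transfer?: string | null;
  matrix?: string | null;
  fullRange?: boolean | null;
}

class ColorSpaceSketch {
  // Read-only fields play the role of the internal slots and their getters.
  readonly primaries: string | null;
  readonly transfer: string | null;
  readonly matrix: string | null;
  readonly fullRange: boolean | null;

  constructor(init: ColorSpaceInitSketch = {}) {
    // Every member defaults to null when it is not provided.
    this.primaries = init.primaries ?? null;
    this.transfer = init.transfer ?? null;
    this.matrix = init.matrix ?? null;
    this.fullRange = init.fullRange ?? null;
  }

  toJSON(): { primaries: string | null; transfer: string | null; matrix: string | null; fullRange: boolean | null } {
    return {
      primaries: this.primaries,
      transfer: this.transfer,
      matrix: this.matrix,
      fullRange: this.fullRange,
    };
  }
}
```

Since the fields are set once in the constructor and exposed read-only, an instance cannot change after construction, which mirrors the readonly attributes in the interface.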
9.10. Video Color Primaries
Color primaries describe the color gamut of video samples.
enum VideoColorPrimaries {
  "bt709",
  "bt470bg",
  "smpte170m",
  "bt2020",
  "smpte432",
};
-
bt709
- Color primaries used by BT.709 and sRGB, as described by [H.273] section 8.1 table 2 value 1.
-
bt470bg
- Color primaries used by BT.601 PAL, as described by [H.273] section 8.1 table 2 value 5.
-
smpte170m
- Color primaries used by BT.601 NTSC, as described by [H.273] section 8.1 table 2 value 6.
-
bt2020
- Color primaries used by BT.2020 and BT.2100, as described by [H.273] section 8.1 table 2 value 9.
-
smpte432
- Color primaries used by P3 D65, as described by [H.273] section 8.1 table 2 value 12.
9.11. Video Transfer Characteristics
Transfer characteristics describe the opto-electronic transfer characteristics of video samples.
enum VideoTransferCharacteristics {
  "bt709",
  "smpte170m",
  "iec61966-2-1",
  "linear",
  "pq",
  "hlg",
};
-
bt709
- Transfer characteristics used by BT.709, as described by [H.273] section 8.2 table 3 value 1.
-
smpte170m
- Transfer characteristics used by BT.601, as described by [H.273] section 8.2 table 3 value 6. (Functionally the same as "bt709".)
-
iec61966-2-1
- Transfer characteristics used by sRGB, as described by [H.273] section 8.2 table 3 value 13.
-
linear
- Transfer characteristics used by linear RGB, as described by [H.273] section 8.2 table 3 value 8.
-
pq
- Transfer characteristics used by BT.2100 PQ, as described by [H.273] section 8.2 table 3 value 16.
-
hlg
- Transfer characteristics used by BT.2100 HLG, as described by [H.273] section 8.2 table 3 value 18.
9.12. Video Matrix Coefficients
Matrix coefficients describe the relationship between sample component values and color coordinates.
enum VideoMatrixCoefficients {
  "rgb",
  "bt709",
  "bt470bg",
  "smpte170m",
  "bt2020-ncl",
};
-
rgb
- Matrix coefficients used by sRGB, as described by [H.273] section 8.3 table 4 value 0.
-
bt709
- Matrix coefficients used by BT.709, as described by [H.273] section 8.3 table 4 value 1.
-
bt470bg
- Matrix coefficients used by BT.601 PAL, as described by [H.273] section 8.3 table 4 value 5.
-
smpte170m
- Matrix coefficients used by BT.601 NTSC, as described by [H.273] section 8.3 table 4 value 6. (Functionally the same as "bt470bg".)
-
bt2020-ncl
- Matrix coefficients used by BT.2020 NCL, as described by [H.273] section 8.3 table 4 value 9.
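The enum values in sections 9.10 to 9.12 each name an [H.273] code point. As a non-normative sketch, the lookup tables below restate the table values quoted in those definitions. The `ColorSpaceLike` shape and the `toH273` helper are hypothetical names introduced for illustration only; they are not part of this API.

```typescript
// Code points copied from the [H.273] table values quoted in sections 9.10-9.12.
const PRIMARIES_CODE_POINTS = {
  "bt709": 1,
  "bt470bg": 5,
  "smpte170m": 6,
  "bt2020": 9,
  "smpte432": 12,
} as const;

const TRANSFER_CODE_POINTS = {
  "bt709": 1,
  "smpte170m": 6,
  "iec61966-2-1": 13,
  "linear": 8,
  "pq": 16,
  "hlg": 18,
} as const;

const MATRIX_CODE_POINTS = {
  "rgb": 0,
  "bt709": 1,
  "bt470bg": 5,
  "smpte170m": 6,
  "bt2020-ncl": 9,
} as const;

// Hypothetical plain-object stand-in for the four VideoColorSpace attributes.
interface ColorSpaceLike {
  primaries: keyof typeof PRIMARIES_CODE_POINTS | null;
  transfer: keyof typeof TRANSFER_CODE_POINTS | null;
  matrix: keyof typeof MATRIX_CODE_POINTS | null;
  fullRange: boolean | null;
}

// Hypothetical helper: maps each non-null attribute to its H.273 code point
// and passes null (unspecified) through unchanged.
function toH273(cs: ColorSpaceLike) {
  return {
    colourPrimaries: cs.primaries === null ? null : PRIMARIES_CODE_POINTS[cs.primaries],
    transferCharacteristics: cs.transfer === null ? null : TRANSFER_CODE_POINTS[cs.transfer],
    matrixCoefficients: cs.matrix === null ? null : MATRIX_CODE_POINTS[cs.matrix],
    fullRange: cs.fullRange,
  };
}
```

For the sRGB Color Space defined in section 1 (`bt709`, `iec61966-2-1`, `rgb`), this sketch yields the code points 1, 13, and 0.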
10. Image Decoding
10.1. Background
Image codec definitions are typically accompanied by a definition for a corresponding file format. Hence image decoders often perform both duties of unpacking (demuxing) as well as decoding the encoded image data. The WebCodecs ImageDecoder follows this pattern, which motivates an interface design that is notably different from that of VideoDecoder and AudioDecoder.
In spite of these differences, ImageDecoder uses the same codec processing model as the other codec interfaces. Additionally, ImageDecoder uses the VideoFrame interface to describe decoded outputs.
10.2. ImageDecoder Interface
[Exposed=(Window,DedicatedWorker), SecureContext]
interface ImageDecoder {
  constructor(ImageDecoderInit init);

  readonly attribute DOMString type;
  readonly attribute boolean complete;
  readonly attribute Promise<undefined> completed;
  readonly attribute ImageTrackList tracks;

  Promise<ImageDecodeResult> decode(optional ImageDecodeOptions options = {});
  undefined reset();
  undefined close();

  static Promise<boolean> isTypeSupported(DOMString type);
};
10.2.1. Internal Slots
-
[[control message queue]]
-
A queue of control messages to be performed upon this codec instance. See [[control message queue]] .
-
[[message queue blocked]]
-
A boolean indicating when processing the
[[control message queue]]
is blocked by a pending control message . See [[message queue blocked]] . -
[[codec work queue]]
-
A parallel queue used for running parallel steps that reference the
[[codec implementation]]
. See [[codec work queue]] . -
[[ImageTrackList]]
-
An
ImageTrackList
describing the tracks found in[[encoded data]]
-
[[type]]
-
A string reflecting the value of the MIME
type
given at construction. -
[[complete]]
-
A boolean indicating whether
[[encoded data]]
is completely buffered. -
[[completed promise]]
-
The promise used to signal when
[[complete]]
becomestrue
. -
[[codec implementation]]
-
An underlying image decoder implementation provided by the User Agent. See [[codec implementation]] .
-
[[encoded data]]
-
A byte sequence containing the encoded image data to be decoded.
-
[[prefer animation]]
-
A boolean reflecting the value of
preferAnimation
given at construction. -
[[pending decode promises]]
-
A list of unresolved promises returned by calls to decode().
-
[[internal selected track index]]
-
Identifies the image track within
[[encoded data]]
that is used by decoding algorithms. -
[[tracks established]]
-
A boolean indicating whether the track list has been established in
[[ImageTrackList]]
. -
[[closed]]
-
A boolean indicating that the
ImageDecoder
is in a permanent closed state and can no longer be used. -
[[progressive frame generations]]
-
A mapping of frame indices to Progressive Image Frame Generations . The values represent the Progressive Image Frame Generation for the
VideoFrame
which was most recently output by a call todecode()
with the given frame index.
10.2.2. Constructor
-
ImageDecoder(init)
-
NOTE: Calling
decode()
on the constructedImageDecoder
will trigger aNotSupportedError
if the User Agent does not support type . Authors are encouraged to first check support by callingisTypeSupported()
with type . User Agents don’t have to support any particular type.When invoked, run these steps:
-
If init is not valid ImageDecoderInit , throw a
TypeError
. -
If init .
transfer
contains more than one reference to the sameArrayBuffer
, then throw aDataCloneError
DOMException
. -
For each transferable in init .
transfer
:-
If
[[Detached]]
internal slot istrue
, then throw aDataCloneError
DOMException
.
-
-
Let d be a new
ImageDecoder
object. In the steps below, all mentions ofImageDecoder
members apply to d unless stated otherwise. -
Assign a new queue to
[[control message queue]]
. -
Assign
false
to[[message queue blocked]]
. -
Assign the result of starting a new parallel queue to
[[codec work queue]]
. -
Assign
[[ImageTrackList]]
a newImageTrackList
initialized as follows:-
Assign a new list to
[[track list]]
. -
Assign
-1
to[[selected index]]
.
-
-
Assign
null
to[[codec implementation]]
. -
If
init.preferAnimation
exists , assigninit.preferAnimation
to the[[prefer animation]]
internal slot. Otherwise, assign 'null' to[[prefer animation]]
internal slot. -
Assign a new list to
[[pending decode promises]]
. -
Assign
-1
to[[internal selected track index]]
. -
Assign
false
to[[tracks established]]
. -
Assign
false
to[[closed]]
. -
Assign a new map to
[[progressive frame generations]]
. -
If init ’s
data
member is of typeReadableStream
:-
Assign a new list to
[[encoded data]]
. -
Assign
false
to[[complete]]
-
Queue a control message to configure the image decoder with init .
-
Let reader be the result of getting a reader for
data
. -
In parallel, perform the Fetch Stream Data Loop on d with reader .
-
-
Otherwise:
-
Assert that
init.data
is of typeBufferSource
. -
If init .
transfer
contains anArrayBuffer
referenced by init .data
the User Agent MAY choose to:-
Let
[[encoded data]]
reference bytes in data representing an encoded image.
-
-
Otherwise:
-
Assign a copy of
init.data
to[[encoded data]]
.
-
-
Assign
true
to[[complete]]
. -
Resolve
[[completed promise]]
. -
Queue a control message to configure the image decoder with init .
-
Queue a control message to decode track metadata .
-
-
For each transferable in init .
transfer
:-
Perform DetachArrayBuffer on transferable
-
-
Return d.
Running a control message to configure the image decoder means running these steps:
-
Let supported be the result of running the Check Type Support algorithm with
init.type
. -
If supported is
false
, run the Close ImageDecoder algorithm with aNotSupportedError
DOMException
and return"processed"
. -
Otherwise, assign the
[[codec implementation]]
internal slot with an implementation supportinginit.type
-
Assign
true
to[[message queue blocked]]
. -
Enqueue the following steps to the
[[codec work queue]]
:-
Configure
[[codec implementation]]
in accordance with the values given forcolorSpaceConversion
,desiredWidth
, anddesiredHeight
. -
Assign
false
to[[message queue blocked]]
.
-
-
Return
"processed"
.
Running a control message to decode track metadata means running these steps:
-
Enqueue the following steps to the
[[codec work queue]]
:-
Run the Establish Tracks algorithm.
-
-
10.2.3. Attributes
-
type
, of type DOMString , readonly -
A string reflecting the value of the MIME
type
given at construction. -
complete
, of type boolean , readonly -
Indicates whether
[[encoded data]]
is completely buffered.The
complete
getter steps are to return[[complete]]
. -
completed
, of type Promise< undefined >, readonly -
The promise used to signal when
complete
becomestrue
.The
completed
getter steps are to return[[completed promise]]
. -
tracks
, of type ImageTrackList , readonly -
Returns a live
ImageTrackList
, which provides metadata for the available tracks and a mechanism for selecting a track to decode.The
tracks
getter steps are to return[[ImageTrackList]]
.
10.2.4. Methods
-
decode(options)
-
Enqueues a control message to decode the frame according to options .
When invoked, run these steps:
-
If
[[closed]]
istrue
, return aPromise
rejected with anInvalidStateError
DOMException
. -
If
[[ImageTrackList]]
’s[[selected index]]
is '-1', return aPromise
rejected with anInvalidStateError
DOMException
. -
If options is
undefined
, assign a newImageDecodeOptions
to options . -
Let promise be a new
Promise
. -
Append promise to
[[pending decode promises]]
. -
Queue a control message to decode the image with options , and promise .
-
Return promise .
Running a control message to decode the image means running these steps:
-
Enqueue the following steps to the
[[codec work queue]]
:-
Wait for
[[tracks established]]
to becometrue
. -
If options .
completeFramesOnly
isfalse
and the image is a Progressive Image for which the User Agent supports progressive decoding, run the Decode Progressive Frame algorithm with options .frameIndex
and promise . -
Otherwise, run the Decode Complete Frame algorithm with options .
frameIndex
and promise .
-
-
-
reset()
-
Immediately aborts all pending work.
When invoked, run the Reset ImageDecoder algorithm with an
AbortError
DOMException
. -
close()
-
Immediately aborts all pending work and releases system resources. Close is final.
When invoked, run the Close ImageDecoder algorithm with an
AbortError
DOMException
. -
isTypeSupported(type)
-
Returns a promise indicating whether the provided type is supported by the User Agent.
When invoked, run these steps:
-
If type is not a valid image MIME type , return a
Promise
rejected withTypeError
. -
Let p be a new
Promise
. -
In parallel, resolve p with the result of running the Check Type Support algorithm with type .
-
Return p .
-
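As a non-normative sketch of the support check recommended in the constructor NOTE, the helper below probes a list of candidate MIME types in order. `TypeSupportChecker` and `pickSupportedType` are hypothetical names; the interface is a structural stand-in for the static side of ImageDecoder so that the example does not depend on browser type declarations.

```typescript
// Structural stand-in for anything exposing isTypeSupported(type).
interface TypeSupportChecker {
  isTypeSupported(type: string): Promise<boolean>;
}

// Hypothetical helper: resolves with the first candidate the checker reports
// as supported, or null when none is supported.
async function pickSupportedType(
  checker: TypeSupportChecker,
  candidates: readonly string[],
): Promise<string | null> {
  for (const type of candidates) {
    try {
      if (await checker.isTypeSupported(type)) {
        return type;
      }
    } catch {
      // isTypeSupported() rejects with a TypeError when type is not a
      // valid image MIME type; such a candidate is skipped here.
    }
  }
  return null;
}
```

In a browser that exposes this interface, a call such as `pickSupportedType(ImageDecoder, ["image/avif", "image/webp", "image/png"])` would be one way to choose a type before constructing a decoder.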
10.2.5. Algorithms
- Fetch Stream Data Loop (with reader )
-
Run these steps:
-
Let readRequest be the following read request .
- chunk steps , given chunk
-
-
If
[[closed]]
istrue
, abort these steps. -
If chunk is not a Uint8Array object, queue a task to run the Close ImageDecoder algorithm with a
DataError
DOMException
and abort these steps. -
Let bytes be the byte sequence represented by the Uint8Array object.
-
Append bytes to the
[[encoded data]]
internal slot. -
If
[[tracks established]]
isfalse
, run the Establish Tracks algorithm. -
Otherwise, run the Update Tracks algorithm.
-
Run the Fetch Stream Data Loop algorithm with reader .
-
- close steps
-
-
Assign
true
to[[complete]]
-
Resolve
[[completed promise]]
.
-
- error steps
-
-
Queue a task to run the Close ImageDecoder algorithm with a
NotReadableError
DOMException
-
-
Read a chunk from reader given readRequest .
-
- Establish Tracks
-
Run these steps:
-
Assert
[[tracks established]]
isfalse
. -
If
[[encoded data]]
does not contain enough data to determine the number of tracks:-
If
complete
istrue
, queue a task to run the Close ImageDecoder algorithm with an InvalidStateError
DOMException
. -
Abort these steps.
-
-
If the number of tracks is found to be
0
, queue a task to run the Close ImageDecoder algorithm and abort these steps. -
Let newTrackList be a new list .
-
For each image track found in
[[encoded data]]
:-
Let newTrack be a new
ImageTrack
, initialized as follows:-
Assign this to
[[ImageDecoder]]
. -
Assign
tracks
to[[ImageTrackList]]
. -
If image track is found to be animated, assign
true
to newTrack ’s[[animated]]
internal slot. Otherwise, assignfalse
. -
If image track is found to describe a frame count, assign that count to newTrack ’s
[[frame count]]
internal slot. Otherwise, assign0
.NOTE: If this was constructed with
data
as aReadableStream
, theframeCount
can change as additional bytes are appended to[[encoded data]]
. See the Update Tracks algorithm. -
If image track is found to describe a repetition count, assign that count to
[[repetition count]]
internal slot. Otherwise, assign0
.NOTE: A value of
Infinity
indicates infinite repetitions. -
Assign
false
to newTrack ’s[[selected]]
internal slot.
-
-
Append newTrack to newTrackList .
-
-
Let selectedTrackIndex be the result of running the Get Default Selected Track Index algorithm with newTrackList .
-
Let selectedTrack be the track at position selectedTrackIndex within newTrackList .
-
Assign
true
to selectedTrack ’s[[selected]]
internal slot. -
Assign selectedTrackIndex to
[[internal selected track index]]
. -
Assign
true
to[[tracks established]]
. -
Queue a task to perform the following steps:
-
Assign newTrackList to the
tracks
[[track list]]
internal slot. -
Assign selectedTrackIndex to
tracks
[[selected index]]
. -
Resolve
[[ready promise]]
.
-
-
- Get Default Selected Track Index (with trackList )
-
Run these steps:
-
If
[[encoded data]]
identifies a Primary Image Track :-
Let primaryTrack be the
ImageTrack
from trackList that describes the Primary Image Track . -
Let primaryTrackIndex be position of primaryTrack within trackList .
-
If
[[prefer animation]]
isnull
, return primaryTrackIndex . -
If primaryTrack .
animated
equals[[prefer animation]]
, return primaryTrackIndex .
-
-
If any
ImageTrack
s in trackList haveanimated
equal to[[prefer animation]]
, return the position of the earliest such track in trackList . -
Return
0
.
-
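The Get Default Selected Track Index steps above can be sketched as a pure function. This is a non-normative illustration: `TrackLike` is a hypothetical stand-in for ImageTrack, and `primaryTrackIndex` is an assumed parameter that is `null` when the encoded data identifies no Primary Image Track.

```typescript
// Hypothetical stand-in exposing only the attribute the algorithm reads.
interface TrackLike {
  animated: boolean;
}

function getDefaultSelectedTrackIndex(
  trackList: readonly TrackLike[],
  primaryTrackIndex: number | null, // null: no Primary Image Track identified
  preferAnimation: boolean | null,  // mirrors [[prefer animation]]
): number {
  // Step 1: the primary track wins when there is no preference, or when
  // it already matches the preference.
  if (primaryTrackIndex !== null) {
    if (preferAnimation === null) {
      return primaryTrackIndex;
    }
    if (trackList[primaryTrackIndex]?.animated === preferAnimation) {
      return primaryTrackIndex;
    }
  }
  // Step 2: otherwise the earliest track matching the preference.
  // A null preference never equals a boolean, so nothing matches here.
  for (let i = 0; i < trackList.length; i++) {
    if (trackList[i]?.animated === preferAnimation) {
      return i;
    }
  }
  // Step 3: fall back to the first track.
  return 0;
}
```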
- Update Tracks
-
A track update struct is a struct that consists of a track index (
unsigned long
) and a frame count (unsigned long
).Run these steps:
-
Assert
[[tracks established]]
istrue
. -
Let trackChanges be a new list .
-
Let trackList be a copy of
tracks
'[[track list]]
. -
For each track in trackList :
-
Let trackIndex be the position of track in trackList .
-
Let latestFrameCount be the frame count as indicated by
[[encoded data]]
for the track corresponding to track . -
Assert that latestFrameCount is greater than or equal to
track.frameCount
. -
If latestFrameCount is greater than
track.frameCount
:-
Let change be a track update struct whose track index is trackIndex and frame count is latestFrameCount .
-
Append change to trackChanges .
-
-
-
If trackChanges is empty , abort these steps.
-
Queue a task to perform the following steps:
-
For each update in trackChanges :
-
Let updateTrack be the
ImageTrack
at positionupdate.trackIndex
withintracks
'[[track list]]
. -
Assign
update.frameCount
to updateTrack ’s[[frame count]]
.
-
-
-
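The first half of Update Tracks, which collects track update structs, can be sketched as follows. This is a non-normative illustration under simplifying assumptions: tracks are represented only by their frame counts, `computeTrackChanges` is a hypothetical name, and the spec's assertion is modelled as a thrown `RangeError`.

```typescript
// Mirrors the track update struct: a track index and a frame count.
interface TrackUpdate {
  trackIndex: number;
  frameCount: number;
}

// currentFrameCounts: frameCount of each track in the copied track list.
// latestFrameCounts: frame counts currently indicated by the encoded data.
function computeTrackChanges(
  currentFrameCounts: readonly number[],
  latestFrameCounts: readonly number[],
): TrackUpdate[] {
  const changes: TrackUpdate[] = [];
  currentFrameCounts.forEach((current, trackIndex) => {
    const latest = latestFrameCounts[trackIndex];
    // The algorithm asserts that frame counts never decrease.
    if (latest === undefined || latest < current) {
      throw new RangeError("frame count must not decrease");
    }
    if (latest > current) {
      changes.push({ trackIndex, frameCount: latest });
    }
  });
  return changes;
}
```

An empty result corresponds to the early abort in the algorithm; otherwise each entry is applied to the matching ImageTrack in a queued task.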
- Decode Complete Frame (with frameIndex and promise )
-
-
Assert that
[[tracks established]]
istrue
. -
Assert that
[[internal selected track index]]
is not-1
. -
Let encodedFrame be the encoded frame identified by frameIndex and
[[internal selected track index]]
. -
Wait for any of the following conditions to be true (whichever happens first):
-
[[encoded data]]
contains enough bytes to completely decode encodedFrame . -
[[encoded data]]
is found to be malformed. -
complete
istrue
. -
[[closed]]
istrue
.
-
-
If
[[encoded data]]
is found to be malformed, run the Fatally Reject Bad Data algorithm and abort these steps. -
If
[[encoded data]]
does not contain enough bytes to completely decode encodedFrame , run the Reject Infeasible Decode algorithm with promise and abort these steps. -
Attempt to use
[[codec implementation]]
to decode encodedFrame . -
If decoding produces an error, run the Fatally Reject Bad Data algorithm and abort these steps.
-
If
[[progressive frame generations]]
contains an entry keyed by frameIndex , remove the entry from the map. -
Let output be the decoded image data emitted by
[[codec implementation]]
corresponding to encodedFrame . -
Let decodeResult be a new
ImageDecodeResult
initialized as follows:-
Assign 'true' to
complete
. -
Let duration be the presentation duration for output as described by encodedFrame . If encodedFrame does not have a duration, assign
null
to duration . -
Let timestamp be the presentation timestamp for output as described by encodedFrame . If encodedFrame does not have a timestamp:
-
If encodedFrame is a still image assign
0
to timestamp . -
If encodedFrame is a constant rate animated image and duration is not
null
, assign|frameIndex| * |duration|
to timestamp . -
If a timestamp can otherwise be trivially generated from metadata without further decoding, assign that to timestamp .
-
Otherwise, assign
0
to timestamp .
-
-
If
[[encoded data]]
contains orientation metadata describe it as rotation and flip , otherwise set rotation to 0 and flip to false. -
Assign
image
with the result of running the Create a VideoFrame algorithm with output , timestamp , duration , rotation , and flip .
-
-
Run the Resolve Decode algorithm with promise and decodeResult .
-
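The timestamp fallback in the Decode Complete Frame steps above can be sketched as a small function. This is a non-normative illustration that assumes the sub-steps are tried in order and the first applicable one wins. `EncodedFrameInfo`, its fields, and `resolveTimestamp` are hypothetical names; `metadataTimestamp` stands in for "a timestamp that can be trivially generated from metadata".

```typescript
// Hypothetical description of what the container says about one frame.
interface EncodedFrameInfo {
  timestamp: number | null;         // presentation timestamp, if described
  duration: number | null;          // presentation duration, if described
  isStillImage: boolean;
  isConstantRateAnimation: boolean;
  metadataTimestamp: number | null; // assumption: derivable without decoding
}

function resolveTimestamp(frame: EncodedFrameInfo, frameIndex: number): number {
  // A timestamp described by the encoded frame is used as-is.
  if (frame.timestamp !== null) {
    return frame.timestamp;
  }
  // Still images start at 0.
  if (frame.isStillImage) {
    return 0;
  }
  // Constant rate animation: frameIndex * duration.
  if (frame.isConstantRateAnimation && frame.duration !== null) {
    return frameIndex * frame.duration;
  }
  // Any other cheaply available timestamp from metadata.
  if (frame.metadataTimestamp !== null) {
    return frame.metadataTimestamp;
  }
  // Otherwise fall back to 0.
  return 0;
}
```

For example, frame index 3 of a constant rate animation with a duration of 40000 per frame and no described timestamp resolves to 120000.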
- Decode Progressive Frame (with frameIndex and promise )
-
-
Assert that
[[tracks established]]
istrue
. -
Assert that
[[internal selected track index]]
is not-1
. -
Let encodedFrame be the encoded frame identified by frameIndex and
[[internal selected track index]]
. -
Let lastFrameGeneration be
null
. -
If
[[progressive frame generations]]
contains a map entry with the key frameIndex , assign the value of the map entry to lastFrameGeneration . -
Wait for any of the following conditions to be true (whichever happens first):
-
[[encoded data]]
contains enough bytes to decode encodedFrame to produce an output whose Progressive Image Frame Generation exceeds lastFrameGeneration . -
[[encoded data]]
is found to be malformed. -
complete
istrue
. -
[[closed]]
istrue
.
-
-
If
[[encoded data]]
is found to be malformed, run the Fatally Reject Bad Data algorithm and abort these steps. -
Otherwise, if
[[encoded data]]
does not contain enough bytes to decode encodedFrame to produce an output whose Progressive Image Frame Generation exceeds lastFrameGeneration , run the Reject Infeasible Decode algorithm with promise and abort these steps. -
Attempt to use
[[codec implementation]]
to decode encodedFrame . -
If decoding produces an error, run the Fatally Reject Bad Data algorithm and abort these steps.
-
Let output be the decoded image data emitted by
[[codec implementation]]
corresponding to encodedFrame . -
Let decodeResult be a new
ImageDecodeResult
. -
If output is the final full-detail progressive output corresponding to encodedFrame :
-
Assign
true
to decodeResult ’scomplete
. -
If
[[progressive frame generations]]
contains an entry keyed by frameIndex , remove the entry from the map.
-
-
Otherwise:
-
Assign
false
to decodeResult ’scomplete
. -
Let frameGeneration be the Progressive Image Frame Generation for output .
-
Add a new entry to
[[progressive frame generations]]
with key frameIndex and value frameGeneration .
-
-
Let duration be the presentation duration for output as described by encodedFrame . If encodedFrame does not describe a duration, assign
null
to duration . -
Let timestamp be the presentation timestamp for output as described by encodedFrame . If encodedFrame does not have a timestamp:
-
If encodedFrame is a still image assign
0
to timestamp . -
If encodedFrame is a constant rate animated image and duration is not
null
, assign|frameIndex| * |duration|
to timestamp . -
If a timestamp can otherwise be trivially generated from metadata without further decoding, assign that to timestamp .
-
Otherwise, assign
0
to timestamp .
-
-
If
[[encoded data]]
contains orientation metadata describe it as rotation and flip , otherwise set rotation to 0 and flip to false. -
Assign
image
with the result of running the Create a VideoFrame algorithm with output , timestamp , duration , rotation , and flip . -
Remove promise from
[[pending decode promises]]
. -
Resolve promise with decodeResult .
-
- Resolve Decode (with promise and result )
-
-
Queue a task to perform these steps:
-
If
[[closed]]
, abort these steps. -
Assert that promise is an element of
[[pending decode promises]]
. -
Remove promise from
[[pending decode promises]]
. -
Resolve promise with result .
-
-
- Reject Infeasible Decode (with promise )
-
-
Assert that
complete
istrue
or[[closed]]
istrue
. -
If
complete
istrue
, let exception be aRangeError
. Otherwise, let exception be anInvalidStateError
DOMException
. -
Queue a task to perform these steps:
-
If
[[closed]]
, abort these steps. -
Assert that promise is an element of
[[pending decode promises]]
. -
Remove promise from
[[pending decode promises]]
. -
Reject promise with exception .
-
-
- Fatally Reject Bad Data
-
-
Queue a task to perform these steps:
-
If
[[closed]]
, abort these steps. -
Run the Close ImageDecoder algorithm with an
EncodingError
DOMException
.
-
-
- Check Type Support (with type )
-
-
If the User Agent can provide a codec to support decoding type , return
true
. -
Otherwise, return
false
.
-
- Reset ImageDecoder (with exception )
-
-
Signal
[[codec implementation]]
to abort any active decoding operation. -
For each decodePromise in
[[pending decode promises]]
:-
Reject decodePromise with exception .
-
Remove decodePromise from
[[pending decode promises]]
.
-
-
- Close ImageDecoder (with exception )
-
-
Run the Reset ImageDecoder algorithm with exception .
-
Assign
true
to[[closed]]
. -
Clear
[[codec implementation]]
and release associated system resources . -
If
[[ImageTrackList]]
is empty, reject[[ready promise]]
with exception . Otherwise perform these steps,-
Remove all entries from
[[ImageTrackList]]
. -
Assign
-1
to[[ImageTrackList]]
’s[[selected index]]
.
-
-
If
[[complete]]
is false, reject [[completed promise]]
with exception .
-
10.3. ImageDecoderInit Interface
typedef (AllowSharedBufferSource or ReadableStream) ImageBufferSource;

dictionary ImageDecoderInit {
  required DOMString type;
  required ImageBufferSource data;
  ColorSpaceConversion colorSpaceConversion = "default";
  [EnforceRange] unsigned long desiredWidth;
  [EnforceRange] unsigned long desiredHeight;
  boolean preferAnimation;
  sequence<ArrayBuffer> transfer = [];
};
To determine if an ImageDecoderInit is a valid ImageDecoderInit, run these steps:
-
If type is not a valid image MIME type , return
false
. -
If data is of type
ReadableStream
and the ReadableStream is disturbed or locked , returnfalse
. -
If data is of type
BufferSource
: -
If
desiredWidth
exists anddesiredHeight
does not exist, returnfalse
. -
If
desiredHeight
exists anddesiredWidth
does not exist, returnfalse
. -
Return
true
.
A valid image MIME type is a string that is a valid MIME type string and for which the type, per Section 8.3.1 of [RFC9110], is image.
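The type and dimension checks from the valid ImageDecoderInit steps can be sketched as below. This is a non-normative approximation: the regular expression is a simplified test for the `type/subtype` form with optional parameters, not a full valid MIME type string check, and the checks on `data` are omitted. `InitLike`, `isValidImageMimeType`, and `hasValidTypeAndSize` are hypothetical names.

```typescript
// Hypothetical subset of ImageDecoderInit covering only the fields checked here.
interface InitLike {
  type: string;
  desiredWidth?: number;
  desiredHeight?: number;
}

// Simplified: HTTP token characters for type and subtype, then optional
// parameters that are accepted without being validated.
const TOKEN = "[!#$%&'*+.^_`|~0-9A-Za-z-]+";
const MIME_PATTERN = new RegExp("^(" + TOKEN + ")/(" + TOKEN + ")\\s*(;.*)?$");

function isValidImageMimeType(type: string): boolean {
  const match = MIME_PATTERN.exec(type);
  if (match === null || match[1] === undefined) {
    return false;
  }
  // The top-level type must be "image" (compared case-insensitively).
  return match[1].toLowerCase() === "image";
}

function hasValidTypeAndSize(init: InitLike): boolean {
  if (!isValidImageMimeType(init.type)) {
    return false;
  }
  // desiredWidth and desiredHeight must be given together or not at all.
  const hasWidth = init.desiredWidth !== undefined;
  const hasHeight = init.desiredHeight !== undefined;
  return hasWidth === hasHeight;
}
```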
-
type
, of type DOMString -
String containing the MIME type of the image file to be decoded.
-
data
, of type ImageBufferSource -
BufferSource
orReadableStream
of bytes representing an encoded image file as described bytype
. -
colorSpaceConversion
, of type ColorSpaceConversion , defaulting to"default"
-
Controls whether decoded outputs' color space is converted or ignored, as defined by
colorSpaceConversion
inImageBitmapOptions
. -
desiredWidth
, of type unsigned long -
Indicates a desired width for decoded outputs. Implementation is best effort; decoding to a desired width MAY not be supported by all formats/decoders.
-
desiredHeight
, of type unsigned long -
Indicates a desired height for decoded outputs. Implementation is best effort; decoding to a desired height MAY not be supported by all formats/decoders.
-
preferAnimation
, of type boolean -
For images with multiple tracks, this indicates whether the initial track selection SHOULD prefer an animated track.
NOTE: See the Get Default Selected Track Index algorithm.
10.4. ImageDecodeOptions Interface
dictionary ImageDecodeOptions {
  [EnforceRange] unsigned long frameIndex = 0;
  boolean completeFramesOnly = true;
};
-
frameIndex
, of type unsigned long , defaulting to0
-
The index of the frame to decode.
-
completeFramesOnly
, of type boolean , defaulting totrue
-
For Progressive Images , a value of
false
indicates that the decoder MAY output animage
with reduced detail. Each subsequent call todecode()
for the sameframeIndex
will resolve to produce an image with a higher Progressive Image Frame Generation (more image detail) than the previous call, until finally the full-detail image is produced.If
completeFramesOnly
is assignedtrue
, or if the image is not a Progressive Image , or if the User Agent does not support progressive decoding for the given image type, calls todecode()
will only resolve once the full detail image is decoded.NOTE: For Progressive Images , settingcompleteFramesOnly
tofalse
can be used to offer users a preview an image that is still being buffered from the network (via thedata
ReadableStream
).Upon decoding the full detail image, the
ImageDecodeResult
’scomplete
will be set to true.
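As a non-normative sketch of the progressive decoding pattern described for completeFramesOnly, the loop below keeps calling decode() for one frame index until the result reports `complete`. `ProgressiveDecoderLike`, `DecodeResultLike`, and `decodeWithPreviews` are hypothetical names; the interfaces are structural stand-ins for ImageDecoder and ImageDecodeResult so the example stays self-contained.

```typescript
// Structural stand-ins; F is the image type (VideoFrame in a browser).
interface DecodeResultLike<F> {
  image: F;
  complete: boolean;
}

interface ProgressiveDecoderLike<F> {
  decode(options: {
    frameIndex: number;
    completeFramesOnly: boolean;
  }): Promise<DecodeResultLike<F>>;
}

// Hypothetical helper: reports every intermediate output for frameIndex,
// then stops once the full-detail image has been delivered.
async function decodeWithPreviews<F>(
  decoder: ProgressiveDecoderLike<F>,
  frameIndex: number,
  onImage: (image: F, complete: boolean) => void,
): Promise<void> {
  for (;;) {
    const result = await decoder.decode({ frameIndex, completeFramesOnly: false });
    // With real VideoFrame outputs, the callback is expected to close()
    // frames it no longer needs.
    onImage(result.image, result.complete);
    if (result.complete) {
      return;
    }
  }
}
```

When the image is not a Progressive Image, or progressive decoding is unsupported, the first result is already complete and the loop runs once.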
10.5. ImageDecodeResult Interface
dictionary ImageDecodeResult {
  required VideoFrame image;
  required boolean complete;
};
-
image
, of type VideoFrame -
The decoded image.
-
complete
, of type boolean -
Indicates whether
image
contains the final full-detail output.NOTE:
complete
is alwaystrue
whendecode()
is invoked withcompleteFramesOnly
set totrue
.
10.6. ImageTrackList Interface
[Exposed=(Window,DedicatedWorker)]
interface ImageTrackList {
  getter ImageTrack (unsigned long index);

  readonly attribute Promise<undefined> ready;
  readonly attribute unsigned long length;
  readonly attribute long selectedIndex;
  readonly attribute ImageTrack? selectedTrack;
};
10.6.1. Internal Slots
-
[[ready promise]]
-
The promise used to signal when the
ImageTrackList
has been populated withImageTrack
s.NOTE:
ImageTrack
frameCount
can receive subsequent updates untilcomplete
istrue
. -
[[track list]]
-
The list of
ImageTrack
s described by this ImageTrackList
. -
[[selected index]]
-
The index of the selected track in
[[track list]]
. A value of-1
indicates that no track is selected. The initial value is-1
.
10.6.2. Attributes
-
ready
, of type Promise< undefined >, readonly -
The
ready
getter steps are to return the[[ready promise]]
. -
length
, of type unsigned long , readonly -
The
length
getter steps are to return the length of[[track list]]
. -
selectedIndex
, of type long , readonly -
The
selectedIndex
getter steps are to return[[selected index]]
. -
selectedTrack
, of type ImageTrack , readonly, nullable -
The
selectedTrack
getter steps are:-
If
[[selected index]]
is-1
, returnnull
. -
Otherwise, return the ImageTrack from
[[track list]]
at the position indicated by[[selected index]]
.
-
10.7. ImageTrack Interface
[Exposed=(Window,DedicatedWorker)]
interface ImageTrack {
  readonly attribute boolean animated;
  readonly attribute unsigned long frameCount;
  readonly attribute unrestricted float repetitionCount;
  attribute boolean selected;
};
10.7.1. Internal Slots
-
[[ImageDecoder]]
-
The
ImageDecoder
instance that constructed thisImageTrack
. -
[[ImageTrackList]]
-
The
ImageTrackList
instance that lists thisImageTrack
. -
[[animated]]
-
Indicates whether this track contains an animated image with multiple frames.
-
[[frame count]]
-
The number of frames in this track.
-
[[repetition count]]
-
The number of times the animation is intended to repeat.
-
[[selected]]
-
Indicates whether this track is selected for decoding.
10.7.2. Attributes
-
animated
, of type boolean , readonly -
The
animated
getter steps are to return the value of[[animated]]
.NOTE: This attribute provides an early indication that
frameCount
will ultimately exceed 0 for images where theframeCount
starts at0
and later increments as new chunks of theReadableStream
data
arrive. -
frameCount
, of type unsigned long , readonly -
The
frameCount
getter steps are to return the value of[[frame count]]
. -
repetitionCount
, of type unrestricted float , readonly -
The
repetitionCount
getter steps are to return the value of[[repetition count]]
. -
selected
, of type boolean -
The
selected
getter steps are to return the value of[[selected]]
.The
selected
setter steps are:-
If
[[ImageDecoder]]
’s[[closed]]
slot istrue
, abort these steps. -
Let newValue be the given value .
-
If newValue equals
[[selected]]
, abort these steps. -
Assign newValue to
[[selected]]
. -
Let parentTrackList be
[[ImageTrackList]]
-
Let oldSelectedIndex be the value of parentTrackList
[[selected index]]
. -
If oldSelectedIndex is not
-1
:-
Let oldSelectedTrack be the
ImageTrack
in parentTrackList[[track list]]
at the position of oldSelectedIndex . -
Assign
false
to oldSelectedTrack[[selected]]
-
-
If newValue is
true
, let selectedIndex be the index of thisImageTrack
within parentTrackList ’s[[track list]]
. Otherwise, let selectedIndex be-1
. -
Assign selectedIndex to parentTrackList
[[selected index]]
. -
Run the Reset ImageDecoder algorithm on
[[ImageDecoder]]
. -
Queue a control message to
[[ImageDecoder]]
’s control message queue to update the internal selected track index with selectedIndex . -
Process the control message queue belonging to
[[ImageDecoder]]
.
Running a control message to update the internal selected track index means running these steps:
-
Enqueue the following steps to
[[ImageDecoder]]
’s[[codec work queue]]
:-
Assign selectedIndex to
[[internal selected track index]]
. -
Remove all entries from
[[progressive frame generations]]
.
-
-
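The bookkeeping performed by the selected setter can be sketched with a small model. This is a non-normative illustration covering only the [[selected]] and [[selected index]] updates; the closed check, the Reset ImageDecoder call, and the control message are omitted. `TrackListModel` is a hypothetical name.

```typescript
// Hypothetical model: one boolean per track plus the list's selected index.
class TrackListModel {
  selectedIndex = -1;
  readonly selected: boolean[] = [];

  constructor(trackCount: number) {
    for (let i = 0; i < trackCount; i++) {
      this.selected.push(false);
    }
  }

  // Mirrors assigning newValue to the selected attribute of track `index`.
  setSelected(index: number, newValue: boolean): void {
    if (index < 0 || index >= this.selected.length) {
      throw new RangeError("no such track");
    }
    // No change requested: abort, as the setter does.
    if (this.selected[index] === newValue) {
      return;
    }
    this.selected[index] = newValue;
    // Deselect whichever track was selected before.
    if (this.selectedIndex !== -1) {
      this.selected[this.selectedIndex] = false;
    }
    // Selecting records this track; deselecting leaves no track selected.
    this.selectedIndex = newValue ? index : -1;
  }
}
```

The model makes the invariant visible: at most one track is selected at a time, and deselecting the current track leaves the selected index at -1, after which decode() rejects with an InvalidStateError.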
11. Resource Reclamation
When resources are constrained, a User Agent MAY proactively reclaim codecs. This is particularly true in the case where hardware codecs are limited, and shared across web pages or platform apps.
To reclaim a codec, a User Agent MUST run the appropriate close algorithm (amongst Close AudioDecoder, Close AudioEncoder, Close VideoDecoder and Close VideoEncoder) with a QuotaExceededError DOMException.
The rules governing when a codec may be reclaimed depend on whether the codec is an active or inactive codec and/or a background codec.
An active codec is a codec that has made progress on the [[codec work queue]] in the past 10 seconds.
NOTE: A reliable sign of the working queue’s progress is a call to the output() callback.
An inactive codec is any codec that does not meet the definition of an active codec .
A background codec is a codec whose ownerDocument (or owner set’s Document, for codecs in workers) has a hidden attribute equal to true.
A User Agent MUST only reclaim a codec that is either an inactive codec , a background codec , or both. A User Agent MUST NOT reclaim a codec that is both active and in the foreground, i.e. not a background codec .
Additionally, User Agents MUST NOT reclaim an active background codec if it is:
-
An encoder, e.g. an
AudioEncoder
orVideoEncoder
.NOTE: This prevents long running encode tasks from being interrupted.
-
An
AudioDecoder
orVideoDecoder
, when there is, respectively, an activeAudioEncoder
orVideoEncoder
in the same global object. NOTE: This prevents breaking long running transcoding tasks.
-
An
AudioDecoder
, when its tab is audibly playing audio.
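The reclamation rules above can be condensed into a single predicate. This is a non-normative sketch: `CodecState`, its fields, and `mayReclaim` are hypothetical names, and treating exactly 10 seconds without progress as inactive is an assumption about the boundary.

```typescript
// Hypothetical snapshot of the facts the reclamation rules depend on.
interface CodecState {
  kind: "AudioDecoder" | "AudioEncoder" | "VideoDecoder" | "VideoEncoder";
  msSinceLastProgress: number;       // time since the codec work queue progressed
  inBackground: boolean;             // background codec per the definition above
  hasActiveMatchingEncoder: boolean; // active encoder of the same media kind in the same global object
  tabIsAudible: boolean;             // the tab is audibly playing audio
}

const ACTIVE_WINDOW_MS = 10000; // "in the past 10 seconds"

function mayReclaim(codec: CodecState): boolean {
  const active = codec.msSinceLastProgress < ACTIVE_WINDOW_MS;
  // Inactive codecs are eligible whether foreground or background.
  if (!active) {
    return true;
  }
  // Active and in the foreground: MUST NOT be reclaimed.
  if (!codec.inBackground) {
    return false;
  }
  // Active background codecs: the three additional exemptions.
  if (codec.kind === "AudioEncoder" || codec.kind === "VideoEncoder") {
    return false; // long running encode tasks
  }
  if (codec.hasActiveMatchingEncoder) {
    return false; // decoder feeding a transcode
  }
  if (codec.kind === "AudioDecoder" && codec.tabIsAudible) {
    return false; // audible playback
  }
  return true;
}
```

A `true` result only means reclamation is permitted; whether to reclaim remains a User Agent decision.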
12. Security Considerations
The primary security impact is that features of this API make it easier for an attacker to exploit vulnerabilities in the underlying platform codecs. Additionally, new abilities to configure and control the codecs can allow for new exploits that rely on a specific configuration and/or sequence of control operations.
Platform codecs are historically an internal detail of APIs like HTMLMediaElement, [WEBAUDIO], and [WebRTC]. In this way, it has always been possible to attack the underlying codecs by using malformed media files/streams and invoking the various API control methods.
For example, you can send any stream to a decoder by first wrapping that stream in a media container (e.g. mp4) and setting that as the src of an HTMLMediaElement. You can then cause the underlying video decoder to be reset() by setting a new value for <video>.currentTime.
WebCodecs makes such attacks easier by exposing low level control when inputs are provided and direct access to invoke the codec control methods. This also affords attackers the ability to invoke sequences of control methods that were not previously possible via the higher level APIs.
The Working Group expects User Agents to mitigate this risk by extensively fuzzing their implementation with random inputs and control method invocations. Additionally, User Agents are encouraged to isolate their underlying codecs in processes with restricted privileges (sandbox) as a barrier against successful exploits being able to read user data.
An additional concern is exposing the underlying codecs to input mutation race conditions, such as allowing a site to mutate a codec input or output while the underlying codec is still operating on that data. This concern is mitigated by ensuring that input and output interfaces are immutable.
13. Privacy Considerations
The primary privacy impact is an increased ability to fingerprint users by querying for different codec capabilities to establish a codec feature profile. Much of this profile is already exposed by existing APIs. Such profiles are very unlikely to be uniquely identifying, but can be used with other metrics to create a fingerprint.
An attacker can accumulate a codec feature profile by calling isConfigSupported() methods with a number of different configuration dictionaries. Similarly, an attacker can attempt to configure() a codec with different configuration dictionaries and observe which configurations are accepted.
Attackers can also use existing APIs to establish much of the codec feature profile. For example, the [media-capabilities] decodingInfo() API describes what types of decoders are supported and its powerEfficient attribute can signal when a decoder uses hardware acceleration. Similarly, the [WebRTC] getCapabilities() API can be used to determine what types of encoders are supported and the getStats() API can be used to determine when an encoder uses hardware acceleration. WebCodecs will expose some additional information in the form of low level codec features.
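To make the accumulation of a codec feature profile concrete, the following non-normative sketch queries one codec class with several configurations and records the answers. The helper name `collectSupport` and the shape of the returned object are assumptions for illustration; the only API relied upon is the static `isConfigSupported()` method, whose promise resolves to a dictionary with a `supported` member and rejects with a TypeError for invalid configurations.

```javascript
// Non-normative sketch. `codecClass` is any object exposing a static
// isConfigSupported() method, e.g. VideoDecoder in a supporting User Agent.
// The helper name and the returned object shape are hypothetical.
async function collectSupport(codecClass, configs) {
  const profile = {};
  for (const config of configs) {
    try {
      const { supported } = await codecClass.isConfigSupported(config);
      profile[config.codec] = supported;
    } catch (e) {
      // Invalid configurations reject instead of reporting false.
      profile[config.codec] = false;
    }
  }
  return profile;
}
```

In a User Agent that implements WebCodecs, a call such as `collectSupport(VideoDecoder, [{ codec: 'vp8' }, { codec: 'avc1.42001E' }])` would yield one boolean per codec string. Each additional configuration probed adds a little more detail to the profile.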
A codec feature profile alone is unlikely to be uniquely identifying. Underlying codecs are often implemented entirely in software (be it part of the User Agent binary or part of the operating system), such that all users who run that software will have a common set of capabilities. Additionally, underlying codecs are often implemented with hardware acceleration, but such hardware is mass produced and devices of a particular class and manufacture date (e.g. flagship phones manufactured in 2020) will often have common capabilities. There will be outliers (some users can be running outdated versions of software codecs or use a rare mix of custom assembled hardware), but most of the time a given codec feature profile is shared by a large group of users.
Segmenting groups of users by codec feature profile still amounts to a bit of entropy that can be combined with other metrics to uniquely identify a user. User Agents MAY partially mitigate this by returning an error whenever a site attempts to exhaustively probe for codec capabilities. Additionally, User Agents MAY implement a "privacy budget", which depletes as authors use WebCodecs and other identifying APIs. Upon exhaustion of the privacy budget, codec capabilities could be reduced to a common baseline, or the User Agent could prompt for user approval.
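One way to picture the "privacy budget" idea is as a counter that is charged for each new capability query. The sketch below is non-normative and purely illustrative: the class name `CapabilityBudget`, the cost of one unit per distinct codec string, and the notion of a fixed baseline list are all assumptions, and a real User Agent could account for identifying APIs quite differently.

```javascript
// Non-normative sketch of User Agent-side bookkeeping. The class name, the
// one-unit cost per distinct query, and the baseline list are assumptions.
class CapabilityBudget {
  constructor(limit, baselineCodecs) {
    this.remaining = limit;
    this.baseline = new Set(baselineCodecs);
    this.seen = new Map(); // repeating an earlier query reveals nothing new
  }

  // `platformSupports` is the answer the underlying platform would give.
  query(codec, platformSupports) {
    if (this.seen.has(codec)) return this.seen.get(codec);

    let answer;
    if (this.remaining > 0) {
      this.remaining -= 1;
      answer = platformSupports;
    } else {
      // Budget exhausted: only codecs in the common baseline are reported,
      // so further probing does not refine the profile.
      answer = this.baseline.has(codec) && platformSupports;
    }
    this.seen.set(codec, answer);
    return answer;
  }
}
```

This sketch reduces answers to a baseline once the budget is spent; prompting for user approval at that point would be an alternative.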
14. Best Practices for Authors Using WebCodecs
While WebCodecs internally operates on background threads, authors working with realtime media or in contended main thread environments are encouraged to ensure their media pipelines operate in worker contexts entirely independent of the main thread where possible. For example, realtime media processing of VideoFrames is generally to be done in a worker context.
The main thread has significant potential for high contention and jank that can go unnoticed in development, yet degrade inconsistently across devices and User Agents in the field -- potentially dramatically impacting the end user experience. Ensuring the media pipeline is decoupled from the main thread helps provide a smooth experience for end users.
Authors using the main thread for their media pipeline ought to be sure of their target frame rates, main thread workload, how their application will be embedded, and the class of devices their users will be using.
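As a non-normative illustration of keeping frame processing off the main thread, the sketch below shows the worker side of a pipeline that receives frames by message, processes them, and releases them promptly. The file name, the message shape (`{ frame }`), and the `handleFrame` helper are assumptions for illustration; the sketch relies only on VideoFrame being transferable and on its `close()` method.

```javascript
// worker.js (hypothetical file name). Non-normative sketch.
//
// The main thread would hand frames over without copying, for example:
//   const worker = new Worker('worker.js');
//   worker.postMessage({ frame }, [frame]); // transfers the VideoFrame

// Process one frame, then release its resources even if processing throws.
function handleFrame(frame, process) {
  try {
    process(frame);
  } finally {
    frame.close();
  }
}

// Only install the message handler when running inside a worker.
if (typeof WorkerGlobalScope !== 'undefined' && self instanceof WorkerGlobalScope) {
  self.onmessage = (event) => {
    handleFrame(event.data.frame, (frame) => {
      // Application-specific work goes here, e.g. passing the frame to a
      // VideoEncoder that was created in this worker.
    });
  };
}
```

Closing each frame as soon as it has been processed keeps codec system resources from accumulating while the pipeline runs independently of main thread contention.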
15. Acknowledgements
The editors would like to thank Alex Russell, Chris Needham, Dale Curtis, Dan Sanders, Eugene Zemtsov, Francois Daoust, Guido Urdaneta, Harald Alvestrand, Jan-Ivar Bruaroey, Jer Noble, Mark Foltz, Peter Thatcher, Steve Anton, Matt Wolenetz, Rijubrata Bhaumik, Thomas Guilbert, Tuukka Toivonen, and Youenn Fablet for their contributions to this specification. Thank you also to the many others who contributed to the specification, including through their participation on the mailing list and in the issues.
The Working Group dedicates this specification to our colleague Bernard Aboba.