Copyright © 2023 World Wide Web Consortium. W3C® liability, trademark and permissive document license rules apply.
This document defines a set of JavaScript APIs that let a Web application manage how audio is rendered on the user's audio output devices.
This section describes the status of this document at the time of its publication. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
The WebRTC and Device and Sensors Working Group intend to publish this specification as a Candidate Recommendation soon. Consequently, this is a Request for wide review of this document.
This document was published by the Web Real-Time Communications Working Group as an Editor's Draft.
Publication as an Editor's Draft does not imply endorsement by W3C and its Members.
This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 2 November 2021 W3C Process Document.
This section is non-normative.
This proposal allows JavaScript to direct the audio output of a media element to permitted devices other than the system or user agent default. This can be helpful in a variety of real-time communication scenarios as well as general media applications. For example, an application can use this API to programmatically direct output to a device such as a Bluetooth headset or speakerphone.
HTMLMediaElement Extensions
This section specifies additions to the HTMLMediaElement [HTML] when the Audio Output Devices API is supported.
When the HTMLMediaElement constructor is invoked, the user agent MUST add the following initializing step:
Let the element have a [[SinkId]] internal slot, initialized to "".
WebIDL
partial interface HTMLMediaElement {
  [SecureContext] readonly attribute DOMString sinkId;
  [SecureContext] Promise<undefined> setSinkId (DOMString sinkId);
};
sinkId of type DOMString, readonly
This attribute contains the ID of the audio device through which output is being delivered, or the empty string if output is delivered through the user-agent default device. If nonempty, this ID should be equal to the deviceId attribute of one of the MediaDeviceInfo values returned from enumerateDevices().
On getting, the attribute MUST return the value of the [[SinkId]] slot.
setSinkId
Sets the ID of the audio device through which audio output should be rendered if the application is permitted to play out of a given device.
When this method is invoked, the user agent must run the following steps:
Let document be this's relevant global object's associated Document.
If document is not allowed to use the feature identified by "speaker-selection", return a promise rejected with a new DOMException whose name is NotAllowedError.
Let element be the HTMLMediaElement object on which this method was invoked.
Let sinkId be the method's first argument.
If sinkId is equal to element's [[SinkId]], return a promise resolved with undefined.
Let p be a new promise.
Run the following substeps in parallel:
If sinkId is not the empty string and does not match any audio output device identified by the result that would be provided by enumerateDevices(), reject p with a new DOMException whose name is NotFoundError and abort these substeps.
If sinkId is not the empty string, and the application would not be permitted to play audio through the device identified by sinkId if it weren't the current user agent default device, reject p with a new DOMException whose name is NotAllowedError and abort these substeps.
Switch the underlying audio output device for element to the audio device identified by sinkId.
If the preceding substep failed, reject p with a new DOMException whose name is AbortError, and abort these substeps.
Queue a task that runs the following steps:
Set element's [[SinkId]] to sinkId.
Resolve p.
Return p.
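The following non-normative sketch shows how an application might call setSinkId and tell apart the rejection names produced by the steps above. The helper name routeAudio and the shape of its return value are illustrative assumptions and are not defined by this specification.

```javascript
// Hypothetical helper: routeAudio and its result shape are illustrative only.
// `element` is an HTMLMediaElement (or any object exposing setSinkId/sinkId).
async function routeAudio(element, deviceId) {
  if (typeof element.setSinkId !== "function") {
    return { ok: false, reason: "unsupported" };
  }
  try {
    await element.setSinkId(deviceId);
    // On success, the sinkId attribute reflects the new [[SinkId]] value.
    return { ok: true, sinkId: element.sinkId };
  } catch (err) {
    // Rejection names used by the setSinkId steps.
    switch (err.name) {
      case "NotFoundError":
        return { ok: false, reason: "no-such-device" };
      case "NotAllowedError":
        return { ok: false, reason: "not-permitted" };
      case "AbortError":
        return { ok: false, reason: "switch-failed" };
      default:
        throw err;
    }
  }
}
```

Passing the empty string as deviceId requests the user agent default device.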
New audio devices may become available to the user agent, or an audio device (identified by a media element's sinkId attribute) that had previously become unavailable may become available again, for example, if it is unplugged and later plugged back in.
In this scenario, the user agent must run the following steps:
Let sinkId be the identifier for the newly available device.
For each media element whose sinkId attribute is equal to sinkId:
The following paragraph is non-normative.
If the application wishes to react to the device change, the application can listen to the devicechange event and query enumerateDevices() for the list of updated devices.
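As a non-normative illustration, an application could react to device changes roughly as follows. The helper name watchAudioOutputs and the onChange callback are assumptions made for this sketch; only the devicechange event and enumerateDevices() come from the specifications.

```javascript
// Hypothetical helper: watchAudioOutputs and onChange are illustrative names.
// `mediaDevices` is expected to behave like navigator.mediaDevices.
function watchAudioOutputs(mediaDevices, onChange) {
  const handler = async () => {
    const devices = await mediaDevices.enumerateDevices();
    // Report only the audio output devices currently exposed to the page.
    onChange(devices.filter((d) => d.kind === "audiooutput"));
  };
  mediaDevices.addEventListener("devicechange", handler);
  // Return a function that stops watching.
  return () => mediaDevices.removeEventListener("devicechange", handler);
}
```

In a page this might be used as `const stop = watchAudioOutputs(navigator.mediaDevices, (outputs) => console.log(outputs));`.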
MediaDevices Extensions
This section specifies additions to the MediaDevices when the Audio Output Devices API is supported.
WebIDL
partial interface MediaDevices {
  Promise<MediaDeviceInfo> selectAudioOutput(optional AudioOutputOptions options = {});
};
selectAudioOutput
Prompts the user to select a specific audio output device.
When the selectAudioOutput method is called, the user agent MUST run the following steps:
If this's relevant global object does not have transient activation, return a promise rejected with a DOMException object whose name attribute has the value InvalidStateError.
Let options be the method's first argument.
Let deviceId be options.deviceId.
Let p be a new promise.
Run the following steps in parallel:
Let descriptor be a PermissionDescriptor with its name set to "speaker-selection".
If descriptor's permission state is "denied", reject p with a new DOMException whose name attribute has the value NotAllowedError, and abort these steps.
Probe the user agent for available audio output devices.
If there is no audio output device, reject p with a new DOMException whose name attribute has the value NotFoundError and abort these steps.
If deviceId is not "" and matches an id previously exposed by selectAudioOutput in an earlier browsing session, the user agent MAY decide, based on its previous decision of whether to persist this id or not for this set of origins, to run the following substeps:
Let device be the device identified by deviceId, if available.
If device is available, resolve p with either deviceId or a freshly rotated device id for device, and abort the in-parallel steps.
Prompt the user to choose an audio output device, with descriptor.
If the result of the request is "denied", reject p with a new DOMException whose name attribute has the value NotAllowedError and abort these steps.
Let deviceInfo be a new MediaDeviceInfo object to represent the selected audio output device.
Add deviceInfo.deviceId to [[explicitlyGrantedAudioOutputDevices]].
Resolve p with deviceInfo.
Return p.
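The following non-normative sketch shows one way an application might combine selectAudioOutput with setSinkId. The helper name chooseAndRoute is an assumption made for illustration. Because the algorithm above requires transient activation, the helper is meant to be called from a user-gesture handler such as a click listener.

```javascript
// Hypothetical helper: chooseAndRoute is an illustrative name.
// `mediaDevices` is expected to behave like navigator.mediaDevices and
// `element` like an HTMLMediaElement.
async function chooseAndRoute(mediaDevices, element) {
  try {
    const deviceInfo = await mediaDevices.selectAudioOutput();
    await element.setSinkId(deviceInfo.deviceId);
    return deviceInfo;
  } catch (err) {
    // InvalidStateError: no transient activation.
    // NotAllowedError: the request was denied.
    // NotFoundError: no audio output device is available.
    console.warn(`Audio output selection failed: ${err.name}`);
    return null;
  }
}
```

For example, `button.addEventListener("click", () => chooseAndRoute(navigator.mediaDevices, audioElement));`, where button and audioElement are elements supplied by the page.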
Once a device is exposed after a call to selectAudioOutput, it MUST be listed by enumerateDevices() for the current browsing context.
If the promise returned by selectAudioOutput is resolved, then the user agent MUST ensure the document is both immediately allowed to play media in an HTMLMediaElement, and immediately allowed to start an AudioContext, without needing any additional user gesture.
This is imprecise due to the current lack of standardization of autoplay in browsers.
This dictionary describes the options that can be used to obtain access to an audio output device.
WebIDL
dictionary AudioOutputOptions {
  DOMString deviceId = "";
};
deviceId of type DOMString, defaulting to ""
When the value of this dictionary member is not "", and matches the id previously exposed by selectAudioOutput in an earlier session, the user agent MAY opt to skip prompting the user in favor of resolving with this id or a new rotated id for the same device, assuming that device is currently available.
Applications that wish to rely on user agents supporting persisted device ids must pass these through selectAudioOutput successfully before they will work with setSinkId.
The reason for this requirement is that accepting persisted ids directly would expose fingerprinting information. The cost is that the user may be prompted if the device is not available or the user agent decides not to honor the device id.
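A non-normative sketch of how an application might reuse a persisted device id follows. The helper name selectWithPersistedId, the storage key, and the use of a localStorage-like object are assumptions made for illustration. Whether the prompt is skipped is entirely the user agent's decision.

```javascript
// Hypothetical helper: the function name and storage key are assumptions.
// `storage` is expected to behave like window.localStorage.
async function selectWithPersistedId(mediaDevices, storage, key = "preferredAudioOutput") {
  const savedId = storage.getItem(key) ?? "";
  // Passing a previously exposed id lets the user agent skip the prompt,
  // but it MAY still prompt, or resolve with a rotated id for the same device.
  const deviceInfo = await mediaDevices.selectAudioOutput({ deviceId: savedId });
  // Store the id that was actually returned, since it may have been rotated.
  storage.setItem(key, deviceInfo.deviceId);
  return deviceInfo;
}
```

The returned deviceInfo.deviceId, not the saved one, is the value to pass to setSinkId afterwards.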
This document extends the Web platform with the ability to direct audio output to non-default devices, when user permission is given. User permission is necessary because playing audio out of a non-default device may be unexpected behavior to the user, and may cause a nuisance. For example, suppose a user is in a library or other quiet public place where she is using a laptop with system audio directed to a USB headset. Her expectation is that the laptop’s audio is private and she will not disturb others. If any Web application can direct audio output through arbitrary output devices, a mischievous website may play loud audio out of the laptop’s external speakers without the user’s consent.
To prevent these kinds of nuisance scenarios, the user agent must acquire the user’s consent to access non-default audio output devices. This would prevent the library example outlined earlier, because the application would not be permitted to play out audio from the system speakers.
The specification adds no permission requirement to the default audio output device.
The user agent may explicitly obtain user consent to play audio out of non-default output devices using selectAudioOutput. Implementations MUST also support implicit consent via the getUserMedia() permission prompt; when an audio input device is permitted and opened via getUserMedia(), this also permits access to any associated audio output devices (i.e., those with the same groupId). This conveniently handles the common case of wanting to route both input and output audio through a headset or speakerphone device.
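As a non-normative illustration of the groupId association, an application that has opened a microphone could look up the matching output devices in the enumerateDevices() result. The helper name outputsSharingGroupWith is an assumption made for this sketch; the user agent, not the application, performs the actual consent decision.

```javascript
// Hypothetical helper: outputsSharingGroupWith is an illustrative name.
// `devices` is an array shaped like the result of enumerateDevices().
function outputsSharingGroupWith(devices, microphoneDeviceId) {
  const mic = devices.find(
    (d) => d.kind === "audioinput" && d.deviceId === microphoneDeviceId
  );
  if (!mic) return [];
  // Output devices that belong to the same physical device as the microphone.
  return devices.filter(
    (d) => d.kind === "audiooutput" && d.groupId === mic.groupId
  );
}
```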
Upon creation of a MediaDevices object, mediaDevices, initialize mediaDevices with an additional internal slot: [[explicitlyGrantedAudioOutputDevices]], used to store devices that the user grants explicitly through selectAudioOutput, initialized to an empty set.
This specification defines the exposure decision algorithm for devices other than camera and microphone. The algorithm runs as follows, with device, microphoneList, cameraList and mediaDevices as input:
Let document be mediaDevices's relevant global object's associated Document.
Let deviceInfo be a new MediaDeviceInfo object to represent the device.
If document is not allowed to use the feature identified by "speaker-selection", or deviceInfo.kind is not "audiooutput", return false.
If deviceInfo.deviceId is in [[explicitlyGrantedAudioOutputDevices]], return true.
If deviceInfo.groupId is the same as the groupId of any microphone in microphoneList, return true.
Return false.
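The decision steps above can be modelled, non-normatively, as a pure function. The function name exposeAudioOutput, the state object, and its property names are assumptions made for this sketch: allowedByPolicy stands in for the "speaker-selection" permissions policy check and explicitlyGranted stands in for the [[explicitlyGrantedAudioOutputDevices]] slot.

```javascript
// Non-normative model; the function name and object shapes are assumptions.
function exposeAudioOutput(deviceInfo, microphoneList, state) {
  // state.allowedByPolicy models the "speaker-selection" policy check.
  if (!state.allowedByPolicy || deviceInfo.kind !== "audiooutput") return false;
  // state.explicitlyGranted models [[explicitlyGrantedAudioOutputDevices]].
  if (state.explicitlyGranted.has(deviceInfo.deviceId)) return true;
  // Implicit consent: the device shares a groupId with a listed microphone.
  if (microphoneList.some((mic) => mic.groupId === deviceInfo.groupId)) return true;
  return false;
}
```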
The Audio Output Devices API is a powerful feature that is identified by the name "speaker-selection".
It defines the following types and algorithms:
A permission covers access to the device given in the associated DevicePermissionDescriptor descriptor.
If the descriptor does not have a deviceId, its semantics are that it queries for access to all devices of that class. Thus, if a query for the "speaker-selection" powerful feature with no deviceId returns "granted", the client knows that there will not be a permission prompt for any audio output device known to it, if requested using the deviceId option to selectAudioOutput, and if "denied" is returned, it knows that no selectAudioOutput request for an audio output device will succeed.
If a permission state is present for access to some, but not all, audio output devices, a query without the deviceId will return "prompt".
deviceId values for the devices the user has made a non-default decision on access to.
status.state to permissionDesc's permission state and terminate these steps.
deviceId member removed.
status.state to global's permission state.
name and deviceId as arguments. If the descriptor does not have a deviceId, then undefined is passed in place of deviceId.
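A non-normative sketch of querying the "speaker-selection" permission, with or without a deviceId, follows. The helper name speakerSelectionState and the "unknown" fallback value are assumptions made for illustration. Not every user agent recognizes this permission name, in which case query() rejects.

```javascript
// Hypothetical helper: `permissions` is expected to behave like navigator.permissions.
async function speakerSelectionState(permissions, deviceId) {
  const descriptor = { name: "speaker-selection" };
  // Omitting deviceId queries for access to all audio output devices.
  if (deviceId !== undefined) descriptor.deviceId = deviceId;
  try {
    const status = await permissions.query(descriptor);
    return status.state; // "granted", "denied" or "prompt"
  } catch {
    // Assumption: treat an unrecognized permission name as "unknown".
    return "unknown";
  }
}
```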
This specification defines one policy-controlled feature identified by the string "speaker-selection". It has a default allowlist of "self".
A document's permissions policy determines whether any content in that document is allowed to use selectAudioOutput to prompt the user for an audio output device, or allowed to use setSinkId to change the device through which audio output should be rendered, to a non-system-default user-permitted device. For selectAudioOutput this is enforced by the prompt the user to choose algorithm.
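Because the default allowlist is "self", a cross-origin embedded document cannot use the feature unless the embedder delegates it. The following non-normative sketch delegates "speaker-selection" through the iframe allow attribute. The helper name embedWithSpeakerSelection is an assumption made for illustration.

```javascript
// Hypothetical helper: embedWithSpeakerSelection is an illustrative name.
// `doc` is expected to behave like a DOM Document.
function embedWithSpeakerSelection(doc, src) {
  const frame = doc.createElement("iframe");
  // Delegate the "speaker-selection" feature to the embedded document.
  // Set this before src so the policy applies to the first navigation.
  frame.allow = "speaker-selection";
  frame.src = src;
  return frame;
}
```

The caller is responsible for inserting the returned element into the page.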
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MAY and MUST in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
This specification defines conformance criteria that apply to a single product: the user agent that implements the interfaces that it contains.
Conformance requirements phrased as algorithms or specific steps may be implemented in any manner, so long as the end result is equivalent. (In particular, the algorithms defined in this specification are intended to be easy to follow, and not intended to be performant.)
Implementations that use ECMAScript to implement the APIs defined in this specification must implement them in a manner consistent with the ECMAScript Bindings defined in the Web IDL specification [WEBIDL], as this specification uses that specification and terminology.
The following people have contributed directly to the development of this specification: Harald Alvestrand, Rick Byers, Dominique Hazael-Massieux (via the HTML5Apps project), Philip Jägenstedt, Victoria Kirst, Shijun Sun, Martin Thomson, Chris Wilson.