1. Introduction
As Virtual Reality and Augmented Reality become more prevalent, new features are being introduced by the native APIs that enable access to more detailed information about the environment in which the user is located. The Depth Sensing API brings one such capability to the WebXR Device API, allowing authors of WebXR-powered experiences to obtain information about the distance from the user’s device to the real world geometry in the user’s environment.
This document assumes readers' familiarity with the WebXR Device API and WebXR Augmented Reality Module specifications, as it builds on top of them to provide additional features to XRSessions.
1.1. Terminology
This document uses the acronyms AR to signify Augmented Reality, and VR to signify Virtual Reality.
This document uses terms like "depth buffer", "depth buffer data" and "depth data" interchangeably when referring to an array of bytes containing depth information, either returned by the XR device, or returned by the API itself. More information about the specific contents of the depth buffer can be found in the data and texture entries of the specification.
This document uses the term normalized view coordinates when referring to a coordinate system that has an origin in the top left corner of the view, with X axis growing to the right, and Y axis growing downward.
2. Initialization
2.1. Feature descriptor
Applications can request that depth sensing be enabled on an XRSession by passing an appropriate feature descriptor. This module introduces a new string, depth-sensing, as a new valid feature descriptor for the depth sensing feature.
A device is capable of supporting the depth sensing feature if the device exposes native depth sensing capability. The inline XR device MUST NOT be treated as capable of supporting the depth sensing feature.
The depth sensing feature is subject to feature policy and requires the "xr-spatial-tracking" policy to be allowed on the requesting document’s origin.
2.2. Intended depth type, data usage, and data formats
enum XRDepthType { "raw", "smooth" };
- A type of "raw" indicates that no additional processing of the depth data should be done.
- A type of "smooth" indicates that the runtime should perform additional processing on the depth texture to remove potential noise.
enum XRDepthUsage { "cpu-optimized", "gpu-optimized" };
- A usage of "cpu-optimized" indicates that the depth data is intended to be used on the CPU, by interacting with the XRCPUDepthInformation interface.
- A usage of "gpu-optimized" indicates that the depth data is intended to be used on the GPU, by interacting with the XRWebGLDepthInformation interface.
enum XRDepthDataFormat { "luminance-alpha", "float32", "unsigned-short" };
- A data format of "luminance-alpha" or "unsigned-short" indicates that items in the depth data buffers obtained from the API are 16 bit unsigned integer values.
- A data format of "float32" indicates that items in the depth data buffers obtained from the API are 32 bit floating point values.
The following table summarizes the ways various data formats can be consumed:
| Data format | GLenum value equivalent | Size of depth buffer entry | Usage on CPU | Usage on GPU |
|---|---|---|---|---|
| "luminance-alpha" | LUMINANCE_ALPHA | 2 times 8 bit | Interpret data as Uint16Array | Inspect Luminance and Alpha channels to reassemble single value. |
| "float32" | R32F | 32 bit | Interpret data as Float32Array | Inspect Red channel and use the value. |
| "unsigned-short" | R16UI | 16 bit | Interpret data as Uint16Array | Inspect Red channel and use the value. |
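For illustration, the following non-normative sketch shows how an application might interpret the CPU-accessible buffer for each data format. It assumes a session configured with "cpu-optimized" usage and uses the data and rawValueToMeters attributes of XRCPUDepthInformation, defined later in this document:

// Non-normative sketch: convert the raw CPU depth buffer to meters,
// interpreting it according to the session's depthDataFormat.
function depthValuesInMeters(session, depthInfo) {
  const scale = depthInfo.rawValueToMeters;
  switch (session.depthDataFormat) {
    case "luminance-alpha":
    case "unsigned-short":
      // Each entry is a 16 bit unsigned integer.
      return Array.from(new Uint16Array(depthInfo.data), (v) => v * scale);
    case "float32":
      // Each entry is a 32 bit floating point value.
      return Array.from(new Float32Array(depthInfo.data), (v) => v * scale);
  }
}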
2.3. Session configuration
dictionary XRDepthStateInit {
  required sequence<XRDepthUsage> usagePreference;
  required sequence<XRDepthDataFormat> dataFormatPreference;
  sequence<XRDepthType> depthTypeRequest;
  boolean matchDepthView = true;
};
The usagePreference is an ordered sequence of XRDepthUsage values, used to describe the desired depth sensing usage for the session.
The dataFormatPreference is an ordered sequence of XRDepthDataFormat values, used to describe the desired depth sensing data format for the session.
The depthTypeRequest is an ordered sequence of XRDepthType values, used to describe the desired depth sensing type for the session. This request MAY be ignored by the user agent.
The matchDepthView requests that the view of the depth information MUST be aligned with the XRView. If this is true, the XRSystem SHOULD return depth information that reflects the current frame. If this is false, the XRSystem MAY return depth information that was captured at an earlier point in time.
NOTE: If matchDepthView is false, the author SHOULD do the reprojection using the view from XRDepthInformation.
The XRSessionInit dictionary is expanded by adding a new depthSensing key. The key is optional in XRSessionInit, but it MUST be provided when depth-sensing is included in either requiredFeatures or optionalFeatures.
partial dictionary XRSessionInit {
  XRDepthStateInit depthSensing;
};
If the depth sensing feature is a required feature but the application did not supply a depthSensing key, the user agent MUST treat this as an unresolved required feature and reject the requestSession(mode, options) promise with a NotSupportedError. If it was requested as an optional feature, the user agent MUST ignore the feature request and not enable depth sensing on the newly created session.
If the depth sensing feature is a required feature but the result of the finding supported configuration combination algorithm invoked with XRDepthStateInit is null, the user agent MUST treat this as an unresolved required feature and reject the requestSession(mode, options) promise with a NotSupportedError. If it was requested as an optional feature, the user agent MUST ignore the feature request and not enable depth sensing on the newly created session.
When an XRSession is created with depth sensing enabled, the depthUsage, depthDataFormat, and depthType attributes MUST be set to the result of the finding supported configuration combination algorithm invoked with XRDepthStateInit.
Note: The intention of the algorithm is to process preferences from most restrictive to least restrictive. Thus, we begin processing items where only a single item is indicated, then multiple, and finally where no preference is indicated.
- Let depthTypeRequest be the value contained by the depthTypeRequest key in depthStateInit if it is set, and an empty sequence otherwise.
- Let selectedType be null.
- Let usagePreference be the value contained by the usagePreference key in depthStateInit.
- Let selectedUsage be null.
- Let dataFormatPreference be the value contained by the dataFormatPreference key in depthStateInit.
- Let selectedDataFormat be null.
- Let processingOrder be the sequence of (preferences, selection) pairs, where selection is a reference to one of the variables introduced in the previous steps: [(depthTypeRequest, selectedType), (usagePreference, selectedUsage), (dataFormatPreference, selectedDataFormat)].
- For each (preferences, selection) in processingOrder perform the following steps:
  - If preferences contains only a single value, set selection to that value.
- For each (preferences, selection) in processingOrder perform the following steps:
  - If selection is not null, continue to the next entry.
  - If the preferences sequence is empty, continue to the next entry.
  - For each preference in preferences, perform the following steps:
    - If preference with the other values of selectedType, selectedUsage, selectedDataFormat is not considered a supported depth sensing configuration by the native depth sensing capabilities of the device, continue to the next entry.
    - Set selection to preference and abort these nested steps.
- For each (preferences, selection) in processingOrder perform the following steps:
  - If selection is not null, continue to the next entry.
  - Set selection to the value determined by the preferred native depth sensing capability with the other values of selectedType, selectedUsage, selectedDataFormat.
- If any of selectedType, selectedUsage, selectedDataFormat are null, return null and abort these steps.
- If selectedType, selectedUsage, selectedDataFormat is considered a supported depth sensing configuration by the native depth sensing capabilities of the device, return the depth sensing configuration of selectedType, selectedUsage, selectedDataFormat and abort these steps.
- If depthTypeRequest is not an empty list, set it to an empty list and repeat these steps.
- Return null and abort these steps.
Note: user agents are not required to support all existing combinations of usages and data formats. This is intended to allow them to provide data in an efficient way, and depends on the underlying platforms. This decision places additional burden on the application developers - it could be mitigated by creation of libraries hiding the API complexity, possibly sacrificing performance.
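As a rough, non-normative illustration of the algorithm above, the selection logic could be sketched in JavaScript as follows. The helpers isSupportedConfiguration() and devicePreferred() are hypothetical stand-ins for the device’s native depth sensing capabilities (with null entries meaning "not yet constrained"), not part of the API:

// Non-normative sketch of the "finding supported configuration combination"
// algorithm. isSupportedConfiguration(type, usage, format) and
// devicePreferred(kind) are hypothetical helpers, not part of the API.
function findSupportedConfiguration(depthStateInit) {
  const attempt = (typePreferences) => {
    const prefs = [typePreferences,
                   depthStateInit.usagePreference,
                   depthStateInit.dataFormatPreference];
    const selection = [null, null, null]; // [type, usage, dataFormat]
    // Pass 1: preferences containing a single value are fixed immediately.
    prefs.forEach((p, i) => { if (p.length === 1) selection[i] = p[0]; });
    // Pass 2: pick the first preference supported alongside the other selections.
    prefs.forEach((p, i) => {
      if (selection[i] !== null) return;
      selection[i] = p.find((value) => {
        const candidate = selection.slice();
        candidate[i] = value;
        return isSupportedConfiguration(...candidate);
      }) ?? null;
    });
    // Pass 3: fall back to the device's preferred native capability.
    ["type", "usage", "dataFormat"].forEach((kind, i) => {
      if (selection[i] === null) selection[i] = devicePreferred(kind);
    });
    if (selection.includes(null)) return null;
    return isSupportedConfiguration(...selection) ? selection : null;
  };
  const typeRequest = depthStateInit.depthTypeRequest ?? [];
  // If no configuration was found, retry once ignoring the depth type request.
  return attempt(typeRequest) ?? (typeRequest.length ? attempt([]) : null);
}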
The user agent that is capable of supporting the depth sensing API MUST support at least one XRDepthUsage mode.

The user agent that is capable of supporting the depth sensing API MUST support the "luminance-alpha" data format, and MAY support other formats.
const session = await navigator.xr.requestSession("immersive-ar", {
  requiredFeatures: ["depth-sensing"],
  depthSensing: {
    usagePreference: ["cpu-optimized", "gpu-optimized"],
    dataFormatPreference: ["luminance-alpha", "float32"],
  },
});
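The optional depthTypeRequest and matchDepthView members can be supplied in the same dictionary. The following is a hedged sketch; the specific preferences are illustrative, not recommendations:

const session = await navigator.xr.requestSession("immersive-ar", {
  requiredFeatures: ["depth-sensing"],
  depthSensing: {
    usagePreference: ["gpu-optimized"],
    dataFormatPreference: ["float32", "luminance-alpha"],
    depthTypeRequest: ["smooth", "raw"], // may be ignored by the user agent
    matchDepthView: false,               // app will reproject using the depth view
  },
});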
partial interface XRSession {
  readonly attribute XRDepthUsage depthUsage;
  readonly attribute XRDepthDataFormat depthDataFormat;
  readonly attribute XRDepthType? depthType;
};
The depthUsage describes the depth sensing usage with which the session was configured. If this attribute is accessed on a session that does not have depth sensing enabled, the user agent MUST throw an InvalidStateError.

The depthDataFormat describes the depth sensing data format with which the session was configured. If this attribute is accessed on a session that does not have depth sensing enabled, the user agent MUST throw an InvalidStateError.
The depthType describes the depth sensing type with which the session was configured. If this attribute is accessed on a session that does not have depth sensing enabled, the user agent MUST throw an InvalidStateError. If the runtime only supports a single XRDepthType, or otherwise ignored the depthTypeRequest, this attribute may be null.
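For example, an application might branch on the negotiated configuration before choosing its depth consumption path; a minimal sketch, assuming session was created with the depth sensing feature enabled:

// Minimal sketch: inspect the configuration the session was granted.
if (session.depthUsage === "cpu-optimized") {
  console.log("Depth will be read via XRCPUDepthInformation;",
              "format:", session.depthDataFormat,
              "type:", session.depthType ?? "not reported");
} else {
  console.log("Depth will be sampled via XRWebGLDepthInformation;",
              "format:", session.depthDataFormat);
}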
3. Obtaining depth data
3.1. XRDepthInformation
[SecureContext, Exposed=Window]
interface XRDepthInformation {
  readonly attribute unsigned long width;
  readonly attribute unsigned long height;
  [SameObject] readonly attribute XRRigidTransform normDepthBufferFromNormView;
  readonly attribute float rawValueToMeters;
};

XRDepthInformation includes XRViewGeometry;
The width attribute contains the width of the depth buffer (i.e. number of columns).

The height attribute contains the height of the depth buffer (i.e. number of rows).
The normDepthBufferFromNormView attribute contains an XRRigidTransform that needs to be applied when indexing into the depth buffer. The transformation that the matrix represents changes the coordinate system from normalized view coordinates to normalized depth buffer coordinates that can then be scaled by the depth buffer’s width and height to obtain the absolute depth buffer coordinates.
Note: if the applications intend to use the resulting depth buffer for texturing a mesh, care must be taken to ensure that the texture coordinates of the mesh vertices are expressed in normalized view coordinates, or that the appropriate coordinate system change is performed in a shader.
The rawValueToMeters attribute contains the scale factor by which the raw depth values from a depth buffer must be multiplied in order to get the depth in meters.
The transform is given in the view’s reference space. If the sensor is aligned with the associated view, the transform MUST be equivalent to the identity transform. The XRViewGeometry attributes (transform and projectionMatrix) MAY otherwise be used by experiences to better align the depth data with the real world.

Each XRDepthInformation has an associated view, which is the XRView closest to the sensor.

Each XRDepthInformation has an associated sensor, which is the sensor from which the depth information was obtained.
Each XRDepthInformation has an associated depth buffer that contains depth buffer data. Different XRDepthInformations may store objects of different concrete types in the depth buffer.
When attempting to access the depth buffer of XRDepthInformation or any interface that inherits from it, the user agent MUST run the following steps:
- Let depthInformation be the instance whose member is accessed.
- Let view be the depthInformation’s view.
- Let frame be the view’s frame.
- If frame is not active, throw InvalidStateError and abort these steps.
- If frame is not an animationFrame, throw InvalidStateError and abort these steps.
- Proceed with normal steps required to access the member of depthInformation.
3.2. XRCPUDepthInformation
[Exposed=Window]
interface XRCPUDepthInformation : XRDepthInformation {
  [SameObject] readonly attribute ArrayBuffer data;

  float getDepthInMeters(float x, float y);
};
The data attribute contains depth buffer information in raw format, suitable for uploading to a WebGL texture if needed. The data is stored in row-major format, without padding, with each entry corresponding to the distance from the sensor’s near plane to the users' environment, in unspecified units. The size and type of each data entry are determined by depthDataFormat. The values can be converted from unspecified units to meters by multiplying them by rawValueToMeters. The normDepthBufferFromNormView can be used to transform from normalized view coordinates into the depth buffer’s coordinate system. When accessed, the algorithm to access the depth buffer MUST be run.
Note: Applications SHOULD NOT attempt to change the contents of the data array, as this can lead to incorrect results returned by the getDepthInMeters(x, y) method.
The getDepthInMeters(x, y) method can be used to obtain depth at coordinates. When invoked, the algorithm to access the depth buffer MUST be run.

When the getDepthInMeters(x, y) method is invoked on an XRCPUDepthInformation depthInformation with x, y, the user agent MUST obtain depth at coordinates by running the following steps:
- Let view be the depthInformation’s view, frame be view’s frame, and session be the frame’s session.
- If x is greater than 1.0 or less than 0.0, throw RangeError and abort these steps.
- If y is greater than 1.0 or less than 0.0, throw RangeError and abort these steps.
- Let normalizedViewCoordinates be a vector representing a 3-dimensional point in space, with x coordinate set to x, y coordinate set to y, z coordinate set to 0.0, and w coordinate set to 1.0.
- Let normalizedDepthCoordinates be a result of premultiplying the normalizedViewCoordinates vector from the left by depthInformation’s normDepthBufferFromNormView.
- Let depthCoordinates be a result of scaling normalizedDepthCoordinates, with x coordinate multiplied by depthInformation’s width and y coordinate multiplied by depthInformation’s height.
- Let column be the value of depthCoordinates’ x coordinate, truncated to an integer, and clamped to the [0, width-1] integer range.
- Let row be the value of depthCoordinates’ y coordinate, truncated to an integer, and clamped to the [0, height-1] integer range.
- Let index be equal to row multiplied by width and added to column.
- Let byteIndex be equal to index multiplied by the size of the depth data format.
- Let rawDepth be equal to the value found at index byteIndex in data, interpreted as a number according to session’s depthDataFormat.
- Let rawValueToMeters be equal to depthInformation’s rawValueToMeters.
- Return rawDepth multiplied by rawValueToMeters.
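The steps above roughly correspond to the following non-normative JavaScript, shown only to illustrate the indexing math; a real application would simply call getDepthInMeters():

// Non-normative sketch mirroring the getDepthInMeters() steps, assuming a
// session configured with "cpu-optimized" usage.
function depthInMeters(session, depthInformation, x, y) {
  if (x < 0.0 || x > 1.0 || y < 0.0 || y > 1.0) throw new RangeError();
  // Premultiply (x, y, 0, 1) by the column-major normDepthBufferFromNormView matrix.
  const m = depthInformation.normDepthBufferFromNormView.matrix;
  const bufX = m[0] * x + m[4] * y + m[12];
  const bufY = m[1] * x + m[5] * y + m[13];
  // Scale to absolute depth buffer coordinates, truncate and clamp.
  const column = Math.min(Math.max(Math.trunc(bufX * depthInformation.width), 0),
                          depthInformation.width - 1);
  const row = Math.min(Math.max(Math.trunc(bufY * depthInformation.height), 0),
                       depthInformation.height - 1);
  const index = row * depthInformation.width + column;
  // Read the raw value according to the session's depthDataFormat.
  const raw = (session.depthDataFormat === "float32")
      ? new Float32Array(depthInformation.data, index * 4, 1)[0]
      : new Uint16Array(depthInformation.data, index * 2, 1)[0];
  return raw * depthInformation.rawValueToMeters;
}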
partial interface XRFrame {
  XRCPUDepthInformation? getDepthInformation(XRView view);
};
The getDepthInformation(view) method, when invoked on an XRFrame, signals that the application wants to obtain CPU depth information relevant for the frame.

When the getDepthInformation(view) method is invoked on an XRFrame frame with an XRView view, the user agent MUST obtain CPU depth information by running the following steps:
- Let session be frame’s session.
- If the depth-sensing feature descriptor is not contained in the session’s XR device’s list of enabled features for session’s mode, throw a NotSupportedError and abort these steps.
- If frame’s active boolean is false, throw an InvalidStateError and abort these steps.
- If frame’s animationFrame boolean is false, throw an InvalidStateError and abort these steps.
- If frame does not match view’s frame, throw an InvalidStateError and abort these steps.
- If the session’s depthUsage is not "cpu-optimized", throw an InvalidStateError and abort these steps.
- Let depthInformation be a result of creating a CPU depth information instance given frame and view.
- Return depthInformation.
In order to create a CPU depth information instance given XRFrame frame and XRView view, the user agent MUST run the following steps:
- Let result be a new instance of XRCPUDepthInformation.
- Let time be frame’s time.
- Let session be frame’s session.
- Let device be the session’s XR device.
- Let nativeDepthInformation be a result of querying device for the depth information valid as of time, for specified view, taking into account session’s depthType, depthUsage, and depthDataFormat.
- If nativeDepthInformation is null, return null and abort these steps.
- If the depth buffer present in nativeDepthInformation meets the user agent’s criteria to block access to the depth data, return null and abort these steps.
- If the depth buffer present in nativeDepthInformation meets the user agent’s criteria to limit the amount of information available in the depth buffer, adjust the depth buffer accordingly.
- Initialize result’s width to the width of the depth buffer returned in nativeDepthInformation.
- Initialize result’s height to the height of the depth buffer returned in nativeDepthInformation.
- Initialize result’s normDepthBufferFromNormView to a new XRRigidTransform, based on nativeDepthInformation’s depth coordinates transformation matrix.
- Initialize result’s data to the raw depth buffer returned in nativeDepthInformation.
- Initialize result’s view to view.
- Initialize result’s transform to the sensor’s pose at time in view’s reference space.
- Return result.
The code below demonstrates how CPU depth information can be obtained within an XRFrameRequestCallback. It is assumed that the session has depth sensing enabled, with usage set to "cpu-optimized" and data format set to "luminance-alpha":
const session = ...; // Session created with depth sensing enabled.
const referenceSpace = ...; // Reference space created from the session.

function requestAnimationFrameCallback(t, frame) {
  session.requestAnimationFrame(requestAnimationFrameCallback);

  const pose = frame.getViewerPose(referenceSpace);
  if (pose) {
    for (const view of pose.views) {
      const depthInformation = frame.getDepthInformation(view);
      if (depthInformation) {
        useCpuDepthInformation(view, depthInformation);
      }
    }
  }
}
Once the XRCPUDepthInformation is obtained, it can be used to discover the distance from the view plane to the user’s environment (see § 4 Interpreting the results section for details). The below code demonstrates obtaining the depth at normalized view coordinates of (0.25, 0.75):
function useCpuDepthInformation(view, depthInformation) {
  const depthInMeters = depthInformation.getDepthInMeters(0.25, 0.75);

  console.log("Depth at normalized view coordinates (0.25, 0.75) is:",
              depthInMeters);
}
3.3. XRWebGLDepthInformation
[Exposed=Window]
interface XRWebGLDepthInformation : XRDepthInformation {
  [SameObject] readonly attribute WebGLTexture texture;
  readonly attribute XRTextureType textureType;
  readonly attribute unsigned long? imageIndex;
};
The texture attribute contains depth buffer information as an opaque texture. Each texel corresponds to the distance from the sensor’s near plane to the users' environment, in unspecified units. The size and type of each data entry are determined by depthDataFormat. The values can be converted from unspecified units to meters by multiplying them by rawValueToMeters. The normDepthBufferFromNormView can be used to transform from normalized view coordinates into the depth buffer’s coordinate system. When accessed, the algorithm to access the depth buffer of XRDepthInformation MUST be run.
The textureType attribute describes if the texture is of type TEXTURE_2D or TEXTURE_2D_ARRAY.

The imageIndex attribute returns the offset into the texture array. It MUST be defined when textureType is equal to TEXTURE_2D_ARRAY and MUST be undefined if it’s TEXTURE_2D.
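For example, a renderer might bind the texture differently depending on textureType; a minimal sketch, where the uniform names are illustrative and not part of this specification:

// Sketch: bind the opaque depth texture according to its texture type.
function bindDepthTexture(gl, uniforms, depthInformation) {
  gl.activeTexture(gl.TEXTURE0);
  if (depthInformation.textureType === "texture-array") {
    // WebGL 2 texture array; the layer to sample is given by imageIndex
    // (e.g. via a sampler2DArray uniform in the shader).
    gl.bindTexture(gl.TEXTURE_2D_ARRAY, depthInformation.texture);
    gl.uniform1i(uniforms.depthLayer, depthInformation.imageIndex);
  } else {
    gl.bindTexture(gl.TEXTURE_2D, depthInformation.texture);
  }
  gl.uniform1i(uniforms.depthTexture, 0);
}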
partial interface XRWebGLBinding {
  XRWebGLDepthInformation? getDepthInformation(XRView view);
};
The getDepthInformation(view) method, when invoked on an XRWebGLBinding, signals that the application wants to obtain WebGL depth information relevant for the frame.

When the getDepthInformation(view) method is invoked on an XRWebGLBinding binding with an XRView view, the user agent MUST obtain WebGL depth information by running the following steps:
- Let session be binding’s session.
- Let frame be view’s frame.
- If session does not match frame’s session, throw an InvalidStateError and abort these steps.
- If the depth-sensing feature descriptor is not contained in the session’s XR device’s list of enabled features for session’s mode, throw a NotSupportedError and abort these steps.
- If the session’s depthUsage is not "gpu-optimized", throw an InvalidStateError and abort these steps.
- If frame’s active boolean is false, throw an InvalidStateError and abort these steps.
- If frame’s animationFrame boolean is false, throw an InvalidStateError and abort these steps.
- Let depthInformation be a result of creating a WebGL depth information instance given frame and view.
- Return depthInformation.
In order to create a WebGL depth information instance given XRFrame frame and XRView view, the user agent MUST run the following steps:
- Let result be a new instance of XRWebGLDepthInformation.
- Initialize time as follows:
  - If the XRSession was created with matchDepthView set to true, let time be frame’s time.
  - Otherwise, let time be the time at which the device captured the depth information.
- Let session be frame’s session.
- Let device be the session’s XR device.
- Let nativeDepthInformation be a result of querying device’s native depth sensing for the depth information valid as of time, for specified view, taking into account session’s depthType, depthUsage, and depthDataFormat.
- If nativeDepthInformation is null, return null and abort these steps.
- If the depth buffer present in nativeDepthInformation meets the user agent’s criteria to block access to the depth data, return null and abort these steps.
- If the depth buffer present in nativeDepthInformation meets the user agent’s criteria to limit the amount of information available in the depth buffer, adjust the depth buffer accordingly.
- Initialize result’s width to the width of the depth buffer returned in nativeDepthInformation.
- Initialize result’s height to the height of the depth buffer returned in nativeDepthInformation.
- Initialize result’s normDepthBufferFromNormView to a new XRRigidTransform, based on nativeDepthInformation’s depth coordinates transformation matrix.
- Initialize result’s texture to an opaque texture containing the depth buffer returned in nativeDepthInformation.
- Initialize result’s view to view.
- Initialize result’s transform to the sensor’s pose at time in view’s reference space.
- Initialize result’s textureType as follows:
  - If the result’s texture was created with a textureType of texture-array, initialize result’s textureType to "texture-array".
  - Otherwise, initialize result’s textureType to "texture".
- Initialize result’s imageIndex as follows:
  - If textureType is "texture", initialize result’s imageIndex to null.
  - Else, if view’s eye is "right", initialize result’s imageIndex to 1.
  - Otherwise, initialize result’s imageIndex to 0.
- Return result.
The code below demonstrates how WebGL depth information can be obtained within an XRFrameRequestCallback. It is assumed that the session has depth sensing enabled, with usage set to "gpu-optimized" and data format set to "luminance-alpha":
const session = ...; // Session created with depth sensing enabled.
const referenceSpace = ...; // Reference space created from the session.
const glBinding = ...; // XRWebGLBinding created from the session.

function requestAnimationFrameCallback(t, frame) {
  session.requestAnimationFrame(requestAnimationFrameCallback);

  const pose = frame.getViewerPose(referenceSpace);
  if (pose) {
    for (const view of pose.views) {
      const depthInformation = glBinding.getDepthInformation(view);
      if (depthInformation) {
        useGpuDepthInformation(view, depthInformation);
      }
    }
  }
}
Once the XRWebGLDepthInformation is obtained, it can be used to discover the distance from the view plane to the user’s environment (see § 4 Interpreting the results section for details). The below code demonstrates how the data can be transferred to the shader:
const gl = ...; // GL context to use.
const shaderProgram = ...; // Linked WebGLProgram.

const programInfo = {
  uniformLocations: {
    depthTexture: gl.getUniformLocation(shaderProgram, 'uDepthTexture'),
    uvTransform: gl.getUniformLocation(shaderProgram, 'uUvTransform'),
    rawValueToMeters: gl.getUniformLocation(shaderProgram, 'uRawValueToMeters'),
  }
};

function useGpuDepthInformation(view, depthInformation) {
  // ...

  gl.activeTexture(gl.TEXTURE0);
  gl.bindTexture(gl.TEXTURE_2D, depthInformation.texture);
  gl.uniform1i(programInfo.uniformLocations.depthTexture, 0);

  gl.uniformMatrix4fv(programInfo.uniformLocations.uvTransform, false,
                      depthInformation.normDepthBufferFromNormView.matrix);

  gl.uniform1f(programInfo.uniformLocations.rawValueToMeters,
               depthInformation.rawValueToMeters);

  // ...
}
The fragment shader that makes use of the depth buffer can be for example:
precision mediump float;

uniform sampler2D uDepthTexture;
uniform mat4 uUvTransform;
uniform float uRawValueToMeters;

varying vec2 vTexCoord;

float DepthGetMeters(in sampler2D depth_texture, in vec2 depth_uv) {
  // Depth is packed into the luminance and alpha components of its texture.
  // The texture is a normalized format, storing millimeters.
  vec2 packedDepth = texture2D(depth_texture, depth_uv).ra;
  return dot(packedDepth, vec2(255.0, 256.0 * 255.0)) * uRawValueToMeters;
}

void main(void) {
  vec2 texCoord = (uUvTransform * vec4(vTexCoord.xy, 0, 1)).xy;
  float depthInMeters = DepthGetMeters(uDepthTexture, texCoord);

  gl_FragColor = ...;
}
4. Interpreting the results
If a given pixel is determined to have invalid depth data or the depth data cannot otherwise be determined, the user agent MUST return a depth value of 0.
The values stored in data and texture represent the distance from the camera plane to the real-world geometry (as understood by the XR system). In the below example, the depth value at point a = (x, y) corresponds to the distance of point A to the camera plane. Specifically, the depth value does NOT represent the length of the aA vector.
The above image corresponds to the following code:
// depthInfo is of type XRCPUDepthInformation:
const depthInMeters = depthInfo.getDepthInMeters(x, y);
5. Native device concepts
5.1. Native depth sensing
The depth sensing specification assumes that the native device on top of which the depth sensing API is implemented provides a way to query the device’s native depth sensing capabilities. The device is said to support querying device’s native depth sensing capabilities if it exposes a way of obtaining depth buffer data.
The depth buffer data MUST contain buffer dimensions, the information about the units used for the values stored in the buffer, and a depth coordinates transformation matrix that performs a coordinate system change from normalized view coordinates to normalized depth buffer coordinates. This transform should leave the z coordinate of the transformed 3D vector unaffected. The device must also provide some mechanism of exposing the projection matrix and transform of the sensor.
The device can support depth sensing type in 2 ways. If the device simply returns estimated depth values with minimal post-processing, it is said to support the "raw" depth type. If the device or runtime can apply additional processing to "smooth" out noise from this data (e.g. into larger regions of the same depth value), it is said to support the "smooth" depth type.
Note: "Raw" depth data is often accompanied by confidence values. The UA can choose to treat depth data with a low confidence value as invalid depth data when returning such data to the page.
The device can support depth sensing usage in 2 ways. If the device is primarily capable of returning the depth data through CPU-accessible memory, it is said to support "cpu-optimized" usage. If the device is primarily capable of returning the depth data through GPU-accessible memory, it is said to support "gpu-optimized" usage.
Note: The user agent can choose to support both usage modes (e.g. when the device is capable of providing both CPU- and GPU-accessible data, or by performing the transfer between CPU- and GPU-accessible data manually).
The device can support depth sensing data format given depth sensing usage and type in the following ways. If, given the depth sensing usage and type, the device is able to return depth data as a buffer containing 16 bit unsigned integers, it is said to support the "luminance-alpha" and "unsigned-short" data formats. If, given the depth sensing usage and type, the device is able to return depth data as a buffer containing 32 bit floating point values, it is said to support the "float32" data format.
A depth sensing configuration is represented by a combination of one XRDepthType, one XRDepthUsage, and one XRDepthDataFormat.
The device is said to support the depth sensing configuration if it supports the depth sensing type , supports depth sensing usage , and supports depth sensing data format given the specified configuration.
Note: the support of depth sensing API is not limited only to hardware classified as AR-capable, although it is expected that the feature will be more common in such devices. VR devices that contain appropriate sensors and/or use other techniques to provide depth buffer should also be able to provide the data necessary to implement depth sensing API.
For each of depthTypeRequest, usagePreference, and dataFormatPreference, the device MUST have a preferred native depth sensing capability that MUST be used if the corresponding array is empty. The type, usage, and format SHOULD reflect the most efficient ones of the device, though they may be dependent upon each other.
6. Privacy & Security Considerations
The depth sensing API provides websites with additional information about the users' environment, in the form of a depth buffer. Given a depth buffer with sufficiently high resolution and sufficiently high precision, websites could potentially learn more detailed information than the users are comfortable with. Depending on the underlying technology used, the depth data may be created based on camera images and IMU sensors.
In order to mitigate privacy risks to the users, user agents should seek user consent prior to enabling the depth sensing API on a session. In addition, as depth sensing technologies and hardware improve, the user agents should consider limiting the amount of information exposed through the API, or blocking access to the data returned from the API if it is not feasible to introduce such limitations. To limit the amount of information, the user agents could for example reduce the resolution of the resulting depth buffer, or reduce the precision of values present in the depth buffer (for example by quantization). User agents that decide to limit the amount of data in such a way will still be considered as implementing this specification.
In case the user agent is capable of providing depth buffers that are detailed enough that they become equivalent to information provided by the device’s cameras, it must first obtain user consent that is equivalent to consent needed to obtain camera access.
Changes
Changes from the First Public Working Draft 31 August 2021
7. Acknowledgements
The following individuals have contributed to the design of the WebXR Depth Sensing specification: