1. Introduction
This API exposes the poses of each of the user's hand skeleton joints. This can be used for gesture detection or to render a hand model in VR scenarios.
2. Initialization
If an application wants to view articulated hand pose information during a session, the session MUST be requested with an appropriate feature descriptor. The string "hand-tracking" is introduced by this module as a new valid feature descriptor for articulated hand tracking.
The "hand-tracking" feature descriptor should only be granted for an XRSession when its XR device has physical hand input sources that support hand tracking.
The user agent MAY gate support for hand-based XRInputSources based upon this feature descriptor.
NOTE: This means that if an XRSession does not request the "hand-tracking" feature descriptor, the user agent may choose not to support input controllers that are hand-based.
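As a minimal sketch of how an application might ask for this feature descriptor, the helper below builds the init dictionary that is passed to `requestSession()`. The helper name `handTrackingSessionInit` and the `SessionInitLike` type are hypothetical and exist only for illustration; whether to list the descriptor as required or optional is an application choice.

```typescript
// Minimal local stand-in for the XRSessionInit dictionary.
interface SessionInitLike {
  requiredFeatures?: string[];
  optionalFeatures?: string[];
}

// Hypothetical helper: builds the init dictionary for a session that
// asks for articulated hand tracking.
function handTrackingSessionInit(required: boolean): SessionInitLike {
  const feature = "hand-tracking";
  return required
    ? { requiredFeatures: [feature] }
    : { optionalFeatures: [feature] };
}

// In a browser, the result would be passed to the WebXR entry point, e.g.:
//   navigator.xr.requestSession("immersive-vr", handTrackingSessionInit(false));
```

Listing the descriptor as optional lets the session start on devices without hand tracking; the application then has to cope with input sources that expose no hand data.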
3. Physical Hand Input Sources
An XRInputSource is a physical hand input source if it tracks a physical hand. A physical hand input source supports hand tracking if it supports reporting the poses of one or more skeleton joints defined in this specification. Physical hand input sources MUST include the input profile name of "generic-hand-select" in their profiles.
For many physical hand input sources, there can be overlap between the gestures used for the primary action and the squeeze action. For example, a pinch gesture may indicate both a "select" and "squeeze" event, depending on whether you are interacting with nearby or far away objects. Since content may assume that these are independent events, user agents MAY, instead of surfacing the squeeze action as the primary squeeze action, surface it as an additional "grasp button", using an input profile derived from the "generic-hand-select-grasp" profile.
3.1. XRInputSource
partial interface XRInputSource {
    readonly attribute XRHand? hand;
};
The hand attribute on a physical hand input source that supports hand tracking will be an XRHand object giving access to the underlying hand-tracking capabilities. hand will have its input source set to this.
If the XRInputSource belongs to an XRSession that has not been requested with the "hand-tracking" feature descriptor, hand MUST be null.
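Because hand is null for input sources without hand tracking, content has to check it before use. The sketch below collects the non-null hands from a list of input sources, keyed by handedness. `InputSourceLike`, `HandLike` and `trackedHands` are hypothetical names; the local interfaces model only the attributes used here.

```typescript
// Local stand-ins for the parts of XRHand and XRInputSource used below.
interface HandLike {
  readonly length: number;
}
interface InputSourceLike {
  readonly handedness: string;
  readonly hand: HandLike | null;
}

// Hypothetical helper: keeps only input sources that expose a hand.
function trackedHands(sources: Iterable<InputSourceLike>): Map<string, HandLike> {
  const hands = new Map<string, HandLike>();
  for (const source of sources) {
    // hand is null when the source is not a tracked hand or when the
    // session was not requested with "hand-tracking".
    if (source.hand !== null) {
      hands.set(source.handedness, source.hand);
    }
  }
  return hands;
}
```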
3.2. Skeleton Joints
A physical hand input source is made up of many skeleton joints.
A skeleton joint for a given hand can be uniquely identified by a skeleton joint index, which is a nonnegative integer.
A skeleton joint may have an associated bone that it is named after and that is used to orient its -Z axis. The associated bone of a skeleton joint is the bone that comes after the joint when moving towards the fingertips. The tip and wrist joints have no associated bones.
A skeleton joint has a radius which is the radius of a sphere placed at its center so that it roughly touches the skin on both sides of the hand.
This specification defines the following skeleton joints:
Skeleton joint | | Skeleton joint index |
---|---|---|
Wrist | | 0 |
Thumb | Metacarpal | 1 |
Thumb | Proximal Phalanx | 2 |
Thumb | Distal Phalanx | 3 |
Thumb | Tip | 4 |
Index finger | Metacarpal | 5 |
Index finger | Proximal Phalanx | 6 |
Index finger | Intermediate Phalanx | 7 |
Index finger | Distal Phalanx | 8 |
Index finger | Tip | 9 |
Middle finger | Metacarpal | 10 |
Middle finger | Proximal Phalanx | 11 |
Middle finger | Intermediate Phalanx | 12 |
Middle finger | Distal Phalanx | 13 |
Middle finger | Tip | 14 |
Ring finger | Metacarpal | 15 |
Ring finger | Proximal Phalanx | 16 |
Ring finger | Intermediate Phalanx | 17 |
Ring finger | Distal Phalanx | 18 |
Ring finger | Tip | 19 |
Little finger | Metacarpal | 20 |
Little finger | Proximal Phalanx | 21 |
Little finger | Intermediate Phalanx | 22 |
Little finger | Distal Phalanx | 23 |
Little finger | Tip | 24 |
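The index layout is regular: the wrist is 0, the thumb occupies four consecutive indices starting at 1, and each remaining finger occupies five consecutive indices starting at 5. The following sketch computes an index from a finger and a joint name. The names `Finger`, `Segment` and `skeletonJointIndex` are hypothetical and not part of this specification.

```typescript
const WRIST_INDEX = 0;
const FINGERS = ["thumb", "index", "middle", "ring", "little"] as const;
type Finger = (typeof FINGERS)[number];
type Segment = "metacarpal" | "proximal" | "intermediate" | "distal" | "tip";

// Hypothetical helper: maps (finger, segment) to a skeleton joint index.
function skeletonJointIndex(finger: Finger, segment: Segment): number {
  if (finger === "thumb") {
    // The thumb has no intermediate phalanx, so it has only four joints.
    const thumbOrder: Segment[] = ["metacarpal", "proximal", "distal", "tip"];
    const position = thumbOrder.indexOf(segment);
    if (position < 0) {
      throw new RangeError("the thumb has no intermediate phalanx joint");
    }
    return 1 + position;
  }
  const order: Segment[] = ["metacarpal", "proximal", "intermediate", "distal", "tip"];
  // The index finger starts at 5; every later finger starts 5 further on.
  return 5 + 5 * (FINGERS.indexOf(finger) - 1) + order.indexOf(segment);
}
```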
3.3. XRHand
[Exposed=Window]
interface XRHand {
    iterable<XRJointSpace>;
    readonly attribute unsigned long length;
    getter XRJointSpace joint(unsigned long jointIndex);
    const unsigned long WRIST = 0;
    const unsigned long THUMB_METACARPAL = 1;
    const unsigned long THUMB_PHALANX_PROXIMAL = 2;
    const unsigned long THUMB_PHALANX_DISTAL = 3;
    const unsigned long THUMB_PHALANX_TIP = 4;
    const unsigned long INDEX_METACARPAL = 5;
    const unsigned long INDEX_PHALANX_PROXIMAL = 6;
    const unsigned long INDEX_PHALANX_INTERMEDIATE = 7;
    const unsigned long INDEX_PHALANX_DISTAL = 8;
    const unsigned long INDEX_PHALANX_TIP = 9;
    const unsigned long MIDDLE_METACARPAL = 10;
    const unsigned long MIDDLE_PHALANX_PROXIMAL = 11;
    const unsigned long MIDDLE_PHALANX_INTERMEDIATE = 12;
    const unsigned long MIDDLE_PHALANX_DISTAL = 13;
    const unsigned long MIDDLE_PHALANX_TIP = 14;
    const unsigned long RING_METACARPAL = 15;
    const unsigned long RING_PHALANX_PROXIMAL = 16;
    const unsigned long RING_PHALANX_INTERMEDIATE = 17;
    const unsigned long RING_PHALANX_DISTAL = 18;
    const unsigned long RING_PHALANX_TIP = 19;
    const unsigned long LITTLE_METACARPAL = 20;
    const unsigned long LITTLE_PHALANX_PROXIMAL = 21;
    const unsigned long LITTLE_PHALANX_INTERMEDIATE = 22;
    const unsigned long LITTLE_PHALANX_DISTAL = 23;
    const unsigned long LITTLE_PHALANX_TIP = 24;
};
Every XRHand has an associated input source, which is the physical hand input source that it tracks.
Each XRHand has a list of joint spaces, which is a list of XRJointSpaces corresponding to each skeleton joint defined in this specification. These all will have their hand set to this.
If an individual device does not support a joint defined in this specification, it MUST emulate it instead.
The list of joint spaces MUST NOT change over the course of a session.
The length attribute MUST return the number 25.
The joint(jointIndex) getter, when invoked, runs the following steps:
- Look for an XRJointSpace in this's list of joint spaces with joint index corresponding to jointIndex.
- Handle the result of the search as follows:
  - If found: Return the XRJointSpace.
  - Otherwise: Return null.
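The getter steps above can be sketched as a plain lookup over the list of joint spaces. `HandSketch` and `JointSpaceLike` are hypothetical stand-ins used only to illustrate the search; they are not how a user agent is required to implement XRHand.

```typescript
// Local stand-in for an XRJointSpace: only its joint index matters here.
interface JointSpaceLike {
  readonly jointIndex: number;
}

// Hypothetical model of an XRHand's length attribute and joint() getter.
class HandSketch {
  private readonly jointSpaces: readonly JointSpaceLike[];

  constructor(jointSpaces: readonly JointSpaceLike[]) {
    this.jointSpaces = jointSpaces;
  }

  // The specification fixes the number of joints at 25.
  get length(): number {
    return 25;
  }

  // Returns the joint space whose joint index matches, or null if none does.
  joint(jointIndex: number): JointSpaceLike | null {
    const found = this.jointSpaces.find((space) => space.jointIndex === jointIndex);
    return found ?? null;
  }
}
```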
3.4. XRJointSpace
[Exposed=Window]
interface XRJointSpace : XRSpace {};
The native origin of an XRJointSpace is the position and orientation of the underlying joint.
The native origin of the XRJointSpace may only be reported when native origins of all other XRJointSpaces on the same hand are being reported. When a hand is partially obscured, the user agent MAY emulate the obscured joints, or it MAY report null poses for all of the joints.
Note: This means that when fetching poses you will either get an entire hand or none of it.
By default, this precludes faithfully exposing polydactyl/oligodactyl hands; however, because of fingerprinting concerns, that will likely need to be a separate opt-in anyway. See Issue 11 for more details.
The native origin has its -Y direction pointing perpendicular to the skin, outwards from the palm, and its -Z direction pointing along its associated bone, away from the wrist.
For tip skeleton joints, where there is no associated bone, the -Z direction is the same as that for the associated distal joint, i.e. the direction is along that of the previous bone. For wrist skeleton joints, the -Z direction SHOULD point roughly towards the center of the palm.
Every XRJointSpace has an associated hand, which is the XRHand that created it.
Every XRJointSpace has an associated joint index, which is the joint index corresponding to the joint it tracks.
Every XRJointSpace has an associated joint, which is the skeleton joint corresponding to its joint index.
4. Frame Loop
4.1. XRFrame
partial interface XRFrame {
    XRJointPose? getJointPose(XRJointSpace joint, XRSpace baseSpace);
    boolean fillJointRadii(sequence<XRJointSpace> jointSpaces, Float32Array radii);
    boolean fillPoses(sequence<XRSpace> spaces, XRSpace baseSpace, Float32Array transforms);
};
The getJointPose(XRJointSpace joint, XRSpace baseSpace) method provides the pose of joint relative to baseSpace as an XRJointPose, at the XRFrame's time.
When this method is invoked, the user agent MUST run the following steps:
- Let frame be this.
- Let session be frame’s session object.
- If frame’s active boolean is false, throw an InvalidStateError and abort these steps.
- If baseSpace’s session or joint’s session is different from session, throw an InvalidStateError and abort these steps.
- Let pose be a new XRJointPose object in the relevant realm of session.
- Populate the pose of joint in baseSpace at the time represented by frame into pose, with force emulation set to false.
- If pose is null, return null.
- Set pose’s radius to the radius of joint, emulating it if necessary.
- Return pose.
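The steps above can be sketched with plain objects standing in for the platform types. Everything named here is hypothetical: `FrameLike.locate` is an assumed hook replacing the "populate the pose" algorithm, and a plain Error with its name set to "InvalidStateError" stands in for the DOMException a user agent would throw.

```typescript
interface SpaceLike { session: object; }
interface JointLike extends SpaceLike { radius: number; }
interface JointPoseLike { transform: Float32Array; radius: number; }
interface FrameLike {
  session: object;
  active: boolean;
  // Assumed hook: returns a 4x4 matrix, or null if the pose is unavailable.
  locate(space: SpaceLike, baseSpace: SpaceLike): Float32Array | null;
}

// Stand-in for an InvalidStateError DOMException.
function invalidState(message: string): Error {
  const error = new Error(message);
  error.name = "InvalidStateError";
  return error;
}

function getJointPoseSketch(
  frame: FrameLike,
  joint: JointLike,
  baseSpace: SpaceLike,
): JointPoseLike | null {
  const session = frame.session;
  if (!frame.active) {
    throw invalidState("frame is not active");
  }
  if (baseSpace.session !== session || joint.session !== session) {
    throw invalidState("space belongs to a different session");
  }
  const transform = frame.locate(joint, baseSpace);
  if (transform === null) {
    return null;
  }
  // The radius is attached only once a pose is known to exist.
  return { transform, radius: joint.radius };
}
```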
The fillJointRadii(sequence<XRJointSpace> jointSpaces, Float32Array radii) method populates radii with the radii of the jointSpaces, and returns a boolean indicating whether all of the spaces have a valid pose.
When this method is invoked on an XRFrame frame, the user agent MUST run the following steps:
- Let frame be this.
- Let session be frame’s session object.
- If frame’s active boolean is false, throw an InvalidStateError and abort these steps.
- For each joint in the jointSpaces:
  - If joint’s session is different from session, throw an InvalidStateError and abort these steps.
- If the length of jointSpaces is larger than the number of elements in radii, throw a TypeError and abort these steps.
- Let offset be a new number with the initial value of 0.
- Let allValid be true.
- For each joint in the jointSpaces:
  - Set the float value of radii at offset to NaN.
  - If the user agent can determine the pose of joint, set the float value of radii at offset to that radius.
  - If the user agent cannot determine the pose of joint, set allValid to false.
  - Increase offset by 1.
- Return allValid.
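A minimal sketch of these steps follows, using plain objects in place of XRFrame and XRJointSpace. The types and the function name are hypothetical; a radius of null is an assumed way to model "the user agent cannot determine the pose of joint".

```typescript
interface RadiusJointLike {
  session: object;
  radius: number | null; // null models a joint whose pose cannot be determined
}

function fillJointRadiiSketch(
  session: object,
  active: boolean,
  jointSpaces: RadiusJointLike[],
  radii: Float32Array,
): boolean {
  if (!active) {
    const error = new Error("frame is not active");
    error.name = "InvalidStateError"; // stand-in for the DOMException
    throw error;
  }
  for (const joint of jointSpaces) {
    if (joint.session !== session) {
      const error = new Error("joint belongs to a different session");
      error.name = "InvalidStateError";
      throw error;
    }
  }
  if (jointSpaces.length > radii.length) {
    throw new TypeError("radii is too small for jointSpaces");
  }
  let offset = 0;
  let allValid = true;
  for (const joint of jointSpaces) {
    radii[offset] = NaN; // default when the pose is unknown
    if (joint.radius !== null) {
      radii[offset] = joint.radius;
    } else {
      allValid = false;
    }
    offset += 1;
  }
  return allValid;
}
```

Elements of radii beyond the number of joint spaces are left untouched, so a caller can reuse one buffer across frames.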
The fillPoses(sequence<XRSpace> spaces, XRSpace baseSpace, Float32Array transforms) method populates transforms with the matrices of the poses of the spaces relative to the baseSpace, and returns a boolean indicating whether all of the spaces have a valid pose.
When this method is invoked on an XRFrame frame, the user agent MUST run the following steps:
- Let frame be this.
- Let session be frame’s session object.
- If frame’s active boolean is false, throw an InvalidStateError and abort these steps.
- For each space in the spaces sequence:
  - If space’s session is different from session, throw an InvalidStateError and abort these steps.
- If baseSpace’s session is different from session, throw an InvalidStateError and abort these steps.
- If the length of spaces multiplied by 16 is larger than the number of elements in transforms, throw a TypeError and abort these steps.
- Let offset be a new number with the initial value of 0.
- Initialize pose as follows:
  - If fillPoses() was called previously, the user agent MAY:
    - Let pose be the same object as used by an earlier call.
  - Otherwise:
    - Let pose be a new XRPose object in the relevant realm of session.
- Let allValid be true.
- For each space in the spaces sequence:
  - Populate the pose of space in baseSpace at the time represented by frame into pose.
  - If pose is null, perform the following steps:
    - Set 16 consecutive elements of the transforms array starting at offset to NaN.
    - Set allValid to false.
  - If pose is not null, copy all elements from pose’s matrix member to the transforms array starting at offset.
  - Increase offset by 16.
- Return allValid.
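The packing part of these steps can be sketched as follows. The session and active checks are omitted for brevity, and the `locate` callback is an assumed stand-in for populating the pose of a space in baseSpace; `fillPosesSketch` is a hypothetical name.

```typescript
// Assumed hook: returns the 16-element matrix of a space's pose, or null.
type Locate<S> = (space: S) => Float32Array | null;

function fillPosesSketch<S>(
  spaces: S[],
  locate: Locate<S>,
  transforms: Float32Array,
): boolean {
  if (spaces.length * 16 > transforms.length) {
    throw new TypeError("transforms is too small for spaces");
  }
  let offset = 0;
  let allValid = true;
  for (const space of spaces) {
    const matrix = locate(space);
    if (matrix === null) {
      // Mark the whole 4x4 slot as unavailable.
      transforms.fill(NaN, offset, offset + 16);
      allValid = false;
    } else {
      // Copy the 16 matrix elements into this space's slot.
      transforms.set(matrix.subarray(0, 16), offset);
    }
    offset += 16;
  }
  return allValid;
}
```

Each space always occupies a fixed 16-element slot, so the matrix for the n-th space starts at element 16 * n regardless of which poses were available.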
4.2. XRJointPose
An XRJointPose is an XRPose with additional information about the size of the skeleton joint it represents.
[Exposed=Window]
interface XRJointPose : XRPose {
    readonly attribute float radius;
};
The radius attribute returns the radius of the skeleton joint in meters.
The user agent MUST set radius to an emulated value if the XR device does not have the capability of determining this value, either in general or in the current animation frame (e.g. when the skeleton joint is partially obscured).
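As an illustration of how content might use radius for gesture detection, the sketch below treats two joints as touching when the distance between their centers is no more than the sum of their radii plus some slack. This heuristic, the `JointSample` type, the `isPinching` name and the default slack of 5 mm are all assumptions for illustration; the specification does not define any gesture.

```typescript
// Position in meters (as read from a pose's transform) plus the joint radius.
interface JointSample {
  position: [number, number, number];
  radius: number;
}

// Hypothetical heuristic: thumb tip and index tip spheres (nearly) touch.
function isPinching(
  thumbTip: JointSample,
  indexTip: JointSample,
  slackMeters: number = 0.005,
): boolean {
  const dx = thumbTip.position[0] - indexTip.position[0];
  const dy = thumbTip.position[1] - indexTip.position[1];
  const dz = thumbTip.position[2] - indexTip.position[2];
  const distance = Math.hypot(dx, dy, dz);
  return distance <= thumbTip.radius + indexTip.radius + slackMeters;
}
```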
5. Privacy & Security Considerations
The WebXR Hand Input API is a powerful feature that carries significant privacy risks. Since this feature returns new sensor data, the User Agent MUST ask for explicit consent from the user at session creation time.
Data returned from this API MUST NOT be so specific that one can detect individual users. If the underlying hardware returns data that is too precise, the User Agent MUST anonymize this data (e.g. by adding noise or rounding) before revealing it through the WebXR Hand Input API.
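Rounding, one of the anonymization options mentioned above, can be sketched as snapping each value to a fixed grid. The function name `quantize` and the choice of step size are assumptions; the specification does not prescribe a particular technique or precision.

```typescript
// Hypothetical helper: rounds every value to the nearest multiple of stepMeters.
function quantize(values: Float32Array, stepMeters: number): Float32Array {
  if (!(stepMeters > 0)) {
    throw new RangeError("stepMeters must be a positive number");
  }
  // Float32Array.prototype.map returns a new Float32Array; the input is unchanged.
  return values.map((value) => Math.round(value / stepMeters) * stepMeters);
}
```

A coarser step hides more user-specific detail but also reduces tracking fidelity, so the step size is a trade-off the user agent has to choose.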
This API is only supported in XRSessions created with XRSessionMode of "immersive-vr" or "immersive-ar". "inline" sessions MUST NOT support this API.