1. Introduction
The Web Neural Network API defines a web-friendly hardware-agnostic abstraction layer that makes use of Machine Learning capabilities of operating systems and underlying hardware platforms without being tied to platform-specific capabilities. The abstraction layer addresses the requirements of key Machine Learning JavaScript frameworks and also allows web developers familiar with the ML domain to write custom code without the help of libraries.
For an illustrated introduction, please see the explainer.
2. Use cases
2.1. Application Use Cases
This section illustrates application-level use cases for neural network inference hardware acceleration. All applications in these use cases can be built on top of pre-trained deep neural network (DNN) [models].
Note: Please be aware that some of the use cases described here are, by their very nature, privacy-invasive. Developers who are planning to use the API for such use cases should ensure that the API is being used to benefit users, for purposes that users understand and approve. They should apply the Ethical Principles for Web Machine Learning [webmachinelearning-ethics] and implement appropriate privacy risk mitigations such as transparency, data minimisation, and user controls.
2.1.1. Person Detection
A user opens a web-based video conferencing application, but she temporarily leaves her room. The application checks whether she is in front of her PC by using object detection (for example, approaches such as [SSD] or [YOLO] that use a single DNN) to detect regions in a camera input frame that include persons.
When she comes back, the application automatically detects her and notifies other online users that she is active now.
2.1.2. Semantic Segmentation
A user joins a teleconference via a web-based video conferencing application at her desk since no meeting room in her office is available. During the teleconference, she does not want her room or the people in the background to be visible. To protect the privacy of the other people and the surroundings, the application runs a machine learning model such as [DeepLabv3+], [MaskR-CNN] or [SegAny] to semantically split an image into segments and replaces the segments that represent other people and the background with another picture.
2.1.3. Skeleton Detection
A web-based video conferencing application tracks the pose of a user’s skeleton by running a machine learning model that allows for real-time human pose estimation, such as [PoseNet], to recognize her gestures and body language. When she raises her hand, her microphone is automatically unmuted and she can start speaking on the teleconference.
2.1.4. Face Recognition
There are multiple people in the conference room and they join an online meeting using a web-based video conferencing application. The application detects the faces of participants by using object detection (for example, approaches such as [SSD]) and checks whether each face was present at the previous meeting by running a machine learning model such as [FaceNet], which verifies whether two faces are identical.
2.1.5. Facial Landmark Detection
A user wants to find new glasses that fit her beautifully in an online glasses store. The online store offers a web-based try-on simulator that runs a machine learning model such as Face Alignment Network [FAN] to detect facial landmarks like eyes, nose, mouth, etc. When she chooses a pair of glasses, the simulator renders the selected glasses at the detected position of the eyes on her facial image.
2.1.6. Style Transfer
A user is looking for cosmetics in an online store and wondering which color may suit her face. The online store shows sample facial makeup images of cosmetics, and offers a makeup simulator that runs a machine learning model like [ContextualLoss] or [PairedCycleGAN] to transfer the makeup style of the sample makeup image to her facial image. She can use the simulator to check how the selected makeup looks on her face.
2.1.7. Super Resolution
A web-based video conferencing application is receiving a video stream from its peer, but the resolution of the video drops due to network congestion. To prevent degradation of the perceived video quality, the application runs a machine learning model for super-resolution such as [SRGAN] to generate higher-resolution video frames.
2.1.8. Image Captioning
For better accessibility, a web-based presentation application provides automatic image captioning by running a machine learning model such as [im2txt] which predicts explanatory words of the presentation slides.
2.1.9. Text-to-image
Images are a core part of modern web experiences. An ability to generate images based on text input in a privacy-preserving manner enables visual personalization and adaptation of web applications and content. For example, a web application can use as an input a natural language description on the web page or a description provided by the user within a text prompt to produce an image matching the text description. This text-to-image use case enabled by latent diffusion model architecture [LDM] forms the basis for additional text-to-image use cases. For example, inpainting where a portion of an existing image on the web page is selectively modified using the newly generated content, or the converse, outpainting, where an original image is extended beyond its original dimensions filling the empty space with generated content.
2.1.10. Machine Translation
Multiple people from various countries are talking via a web-based real-time text chat application. The application translates their conversation by using a machine learning model such as [GNMT] or [OpenNMT], which translates each message into a different language.
2.1.11. Emotion Analysis
A user is talking to her friend via a web-based real-time text chat application, and she is wondering how the friend feels because she cannot see the friend’s face. The application analyses the friend’s emotion by using a machine learning model such as [DeepMoji], which infers emotion from input texts, and displays an emoji that represents the estimated emotion.
2.1.12. Video Summarization
A web-based video conferencing application records received video streams, and it needs to reduce the amount of recorded video data to be stored. The application generates a short version of the recorded video by using a machine learning model for video summarization such as [Video-Summarization-with-LSTM].
2.1.13. Noise Suppression
A web-based video conferencing application records received audio streams, but background noise is usually present. The application applies real-time noise suppression using a recurrent neural network such as [RNNoise] to suppress dynamic background noise, like a baby crying or a dog barking, and improve the audio experience in video conferences.
2.1.14. Speech Recognition
Speech recognition, also known as speech to text, enables recognition and translation of spoken language into text. Example applications of speech recognition include transcription, automatic translation, multimodal interaction, real-time captioning and virtual assistants. Speech recognition improves accessibility of auditory content and makes it possible to interact with such content in a privacy-preserving manner in a textual form. Examples of common use cases include watching videos or participating in online meetings using real-time captioning. Models such as [Whisper] approach humans in their accuracy and robustness and are well positioned to improve accessibility of such use cases.
2.1.15. Text Generation
Various text generation use cases are enabled by large language models (LLM) that are able to perform tasks where a general ability to predict the next item in a text sequence is required. This class of models can translate texts, answer questions based on a text input, summarize a larger body of text, or generate text output based on a textual input. LLMs enable better performance compared to older models based on RNN, CNN, or LSTM architectures and further improve the performance of many other use cases discussed in this section. Examples of LLMs include [t5-small], [m2m100_418M], [gpt2], and [llama-2-7b].
2.1.16. Detecting fake video
A user is exposed to realistic fake videos generated by ‘deepfake’ techniques on the web. A fake video can swap the speaker’s face with the president’s face to incite a user politically or to manipulate the user’s opinion. Deepfake detection applications such as [FaceForensics++] analyze videos and protect the user against fake videos or images. When she watches a fake video on the web, the detection application alerts her to the fraudulent video in real time.
2.2. Framework Use Cases
This section collects framework-level use cases for a dedicated low-level API for neural network inference hardware acceleration. It is expected that Machine Learning frameworks will be key consumers of the Web Neural Network API (WebNN API) and the low-level details exposed through the WebNN API are abstracted out from typical web developers. However, it is also expected that web developers with specific interest and competence in Machine Learning will want to interface with the WebNN API directly instead of a higher-level ML framework.
2.2.1. Custom Layer
A web application developer wants to run a DNN model on the WebNN API. However, she has found that some activation functions like [LeakyReLU], [ELU], etc. are not included in the WebNN API. To address this issue, she constructs custom layers for the additional activation functions on top of the WebNN API. Note that the scope of custom layers may include convolution, normalization, etc. as well as activation.
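As a minimal sketch of the custom layer idea, a LeakyReLU activation can be composed from element-wise primitives as max(x, alpha * x). The helper name is hypothetical, and the constant(), mul() and max() builder methods with the signatures used here are assumptions about the MLGraphBuilder interface that are not defined in this section.

```javascript
// Hypothetical helper: builds LeakyReLU(x) = max(x, alpha * x) from primitives.
// `builder` is assumed to behave like an MLGraphBuilder and `x` like a
// float32 MLOperand; constant(), mul() and max() are assumed method names.
function leakyReluFromPrimitives(builder, x, alpha = 0.01) {
  // The max() formulation only equals LeakyReLU when 0 <= alpha <= 1.
  if (!(alpha >= 0 && alpha <= 1)) {
    throw new RangeError('alpha must be within [0, 1] for the max() formulation');
  }
  // A scalar constant (empty shape) that is assumed to broadcast against x.
  const alphaOperand = builder.constant(
    { dataType: 'float32', shape: [] },
    new Float32Array([alpha])
  );
  // For x >= 0 the result is x; for x < 0 the result is alpha * x.
  return builder.max(x, builder.mul(x, alphaOperand));
}
```

Because the helper only receives a builder and an operand, the same pattern could be reused for other activations that decompose into element-wise operations.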
2.2.2. Network Concatenation
A web application uses a DNN model, and its model data of upper convolutional layers and lower fully-connected layers are stored in separate files, since model data of the fully-connected layers are periodically updated due to fine tuning at the server side.
Therefore, the application first downloads both partial model files and concatenates them into a single model. When the model is updated, the application downloads the fine-tuned part of the model and replaces only the fully-connected layers with it.
2.2.3. Performance Adaptation
A web application developer has a concern about the performance of her DNN model on mobile devices. She has confirmed that it may run too slowly on mobile devices which do not have GPU acceleration. To address this issue, her web application refers to the WebNN API to confirm whether acceleration is available or not, so that the application can display a warning for devices without acceleration.
After several weeks, she has developed a tiny DNN model that can even run on CPU. In order to accommodate CPU execution, she modifies the application so that the application loads the tiny model in the case of CPU-only devices.
2.2.4. Operation Level Execution
A JavaScript ML framework is responsible for loading, interpreting and executing an ML model. During the model execution phase, the framework iterates through the operations of the model and executes each operation on a hardware device, like a CPU, GPU or ML accelerator. To avoid unnecessary data copying across devices, the framework selects the same device to execute the operations. For a compute intensive operation, such as convolution 2D or matrix multiplication, the framework uses the WebNN API to execute it with the ML-specific acceleration available on that selected device.
2.2.5. Integration with real-time video processing
The user experience of WebRTC-based video conferencing is enhanced using real-time video processing. For example, background blur implemented using a § 2.1.2 Semantic Segmentation model blurs the background in the user’s live camera feed. To satisfy the performance requirements of this use case, the WebNN API integrates with primitives from other Web APIs that make up the media pipeline to allow WebNN API-based transformation of real-time video streams.
3. Security Considerations
This specification defines a low-level API for neural network inference hardware acceleration. This API is considered a powerful feature [POWERFUL-FEATURES] because it grants low-level access to a user’s computer. To meet the authentication and confidentiality expectations of a powerful feature and to prevent man-in-the-middle attacks, all interfaces defined by this specification are only available in a secure context.
This API is disabled by default in all cross-origin frames using the § 6.5 Permissions Policy Integration. This prevents third-party content from using this API unless the embedding page explicitly sets a policy that grants permission.
This API allows creation of an MLContext from a GPUDevice defined by WebGPU specification. See WebGPU Security Considerations for more information regarding security characteristics of this context.
This API provides an abstraction across GPU, CPU, and dedicated ML accelerator hardware. When using a GPU, denial of service considerations similar to WebGPU apply. When using a CPU or a dedicated ML accelerator, the types of potential resource contention are different and mitigations will be implementation and configuration dependent. Implementations should use whatever mechanisms are available from the platform to prevent sites from using an unfair amount of system resources. These compute units are shared resources, and the use of any compute API will affect overall performance on a fully-loaded system.
Once the graph is fully constructed and compiled, the input shapes into each of the operations in the graph are inferred and finalized. Bounds checking occurs when the method that executes the graph against the actual data is invoked. No actual data is bound to the compiled graph before this stage. It is the implementation’s responsibility to make sure proper bounds checking occurs against the shapes of the data already inferred by that time.
Document operations susceptible to out-of-bounds access as a guidance to implementers.
Implementations must defend against control-flow attacks based on changes to data considered to be constant. For example, optimizations in the underlying platform may assume that a weight remains unchanged throughout a computation. If the API allowed the contents of buffers holding weights to change during a computation then those optimization assumptions would be invalidated, causing undefined behavior in the underlying platform. The API mitigates this category of attacks from script by always copying or transferring buffers, but implementations should consider additional defenses such as process isolation of data assumed to be constant.
As a future-proofing measure, the API design allows certain operations that can be generically emulated to be deprecated for security, performance, or other reasons without breaking compatibility. This is made possible by high-level functions that are defined in terms of smaller primitive operations defined in this specification. This enables a native implementation of a high-level function to be replaced with a polyfill implementation.
Investigate side channel attack feasibility considering the current state where CPU is shared between processes running renderers.
In order to not allow an attacker to target a specific implementation that may contain a flaw, the § 6.2 Device Selection mechanism is a hint only, and the concrete device selection is left to the implementation - a user agent could for instance choose never to run a model on a device with known vulnerabilities. As a further mitigation, no device enumeration mechanism is defined.
Hinting partially mitigates the concern. Investigate additional mitigations.
The API design minimizes the attack surface for the compiled computational graph. The MLGraphBuilder interface that hosts the various operations is a data definition API and as such doesn’t execute anything, only constructs data. What follows, is that the potential for an attack is limited to when binding the data to the graph before executing it by invoking the MLContext.dispatch() method. This enables implementers to focus on hardening the MLContext.dispatch() method. For example, by making sure it honors the boundary of data and fails appropriately when the bounds are not respected.
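The kind of bounds check described above can be illustrated with a small sketch. This is not the user agent's algorithm: the helper names and the byte-size table are assumptions made for illustration, and the descriptor shape with dataType and shape members is assumed from the wider specification.

```javascript
// Byte sizes for a few operand data types. This table is an assumption made
// for illustration and is not taken from this section.
const BYTES_PER_ELEMENT = {
  float32: 4, float16: 2, int32: 4, uint32: 4, int8: 1, uint8: 1,
};

// Hypothetical helper: number of bytes a tensor with this descriptor occupies.
function expectedByteLength({ dataType, shape }) {
  const size = BYTES_PER_ELEMENT[dataType];
  if (size === undefined) {
    throw new TypeError(`Unsupported data type: ${dataType}`);
  }
  // An empty shape describes a scalar, so the element count starts at 1.
  const elements = shape.reduce((count, dim) => count * dim, 1);
  return elements * size;
}

// Hypothetical helper: rejects data whose size does not match the descriptor,
// so that execution never reads or writes past the end of the supplied buffer.
function assertWithinBounds(descriptor, data) {
  const expected = expectedByteLength(descriptor);
  if (data.byteLength !== expected) {
    throw new RangeError(
      `Expected ${expected} bytes but received ${data.byteLength}`
    );
  }
}
```

Failing with an exception before any data is touched matches the "fails appropriately" behavior the paragraph asks implementers to verify.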
Purpose-built Web APIs for measuring high-resolution time mitigate against timing attacks using techniques such as resolution reduction, adding jitter, detection of abuse and API call throttling [hr-time-3]. Practical deployments of WebNN implementations are likely to introduce enough jitter to make timing attacks impractical (e.g. because they would use IPC), but implementers are advised to consider and test their implementations against timing attacks.
Note: Security risks related to Unicode sequences are discussed in context of the label USVString definition.
3.1. Guidelines for new operations
This section is non-normative.
To ensure operations defined in this specification are shaped in a way they can be implemented securely, this section includes guidelines on how operations are expected to be defined to reduce potential for implementation problems. These guidelines are expected to evolve over time to align with industry best practices:
- Prefer simplicity of arguments
- Don’t use parsers for complex data formats
- If an operation can be decomposed to low level primitives:
  - Add an informative emulation path
  - Prefer primitives over new high level operations but consider performance consequences
- Follow a consistent style for operation inputs and attributes
- Share API shape and options for operation families such as pooling and reduction
- Formalize failure cases into test cases whenever possible
- When in doubt, leave it out: keep the API surface as small as possible to satisfy the use cases, but no smaller
- Try to keep the API free of implementation details that might inhibit future evolution, do not overspecify
- Fail fast: the sooner the web developer is informed of an issue, the better
In general, always consider the security and privacy implications as documented in [security-privacy-questionnaire] by the Technical Architecture Group and the Privacy Interest Group when adding new features.
4. Privacy Considerations
This API enhances privacy compared to cloud-based inference, since input data such as locally sourced images or video streams stay within the browser’s sandbox.
This API exposes the minimum amount of information necessary to address the identified § 2 Use cases for the best performance and reliability of results.
No information from the underlying platform is exposed directly. An execution time analysis may reveal indirectly the performance of the underlying platform’s neural network hardware acceleration capabilities relative to another underlying platform.
Note: The group is soliciting further input on the proposed execution time analysis fingerprinting vector and will augment this section with more information and mitigations to inform the implementers of this API.
Unlike WebGPU, this API does not intrinsically support custom shader authoring; and as a result is not prone to timing attacks that rely on shader caches, or other persistent data. The API builds upon pre-existing shaders and lower level primitives of the browser or the underlying OS. Web developers who interface with GPUDevice are expected to be aware of WebGPU compilation cache considerations.
The WebGPU API identifies machine-specific artifacts as a privacy consideration. Similarly, the WebNN API’s compute unit scheduling may under certain circumstances introduce a fingerprint. However, similarly to WebGPU, such fingerprints are identical across most or all of the devices of each vendor, mitigating the concern. Furthermore, software implementations can be used to further eliminate such artifacts.
The WebNN API defines developer-settable preferences to help inform § 6.2 Device Selection and allow the implementation to better select the underlying execution device for the workload. An MLPowerPreference indicates preference as related to the desired low power consumption or high performance, is considered a hint only and as such does not increase entropy of the fingerprint.
MLContextOptions is under active development, and the design is expected to change, informed by further implementation experience and new use cases from the wider web community. [Issue #623]
If a future version of this specification introduces support for a new MLContextOptions member for supporting only a subset of MLOperandDataTypes, that could introduce a new fingerprint.
In general, implementers of this API are expected to apply WebGPU Privacy Considerations to their implementations where applicable.
5. Ethical Considerations
The Working Group has started documenting ethical issues associated with using Machine Learning on the Web, to help identify what mitigations its normative specifications should take into account. The Working Group publishes and maintains an Ethical Principles for Web Machine Learning document [webmachinelearning-ethics] open to contributions from the wider community via a dedicated GitHub repository.
6. Programming Model
6.1. Overview
At the heart of neural networks is a computational graph of mathematical operations. These operations are the building blocks of modern machine learning technologies in computer vision, natural language processing, and robotics. The WebNN API is a specification for constructing, compiling, and executing computational graphs of neural networks.
The MLGraph interface represents a compiled computational graph that is immutable (that is, a model).
The MLGraphBuilder interface serves as a builder (factory) to construct a computational graph (its graph) that is then compiled to create an MLGraph.
In WebNN, a computational graph is composed of operators which act on data, and are the nodes of the graph. MLOperands are a representation of data that flows within the computational graph, and are the edges of the graph. MLOperands include a computational graph’s input values for inference, constants (including trained weights) used for inference, intermediate values (often referred to as activations) computed during inference, as well as the output values of inference. An operator’s input is one or more MLOperands. An operator’s output is one or more MLOperands. Operators have operator-specific parameters that control their behavior, which can include zero or more activation functions.
A key part of the MLGraphBuilder interface is its set of methods, such as gemm() and relu(), each of which creates an operator representing the actual operation to perform on the input data when the computation is run, and returns a new MLOperand holding the operator. Methods that create an MLOperand connect any inputs and activations to the operator. Each method invocation returns a distinct new value, without changing the value of any other MLOperand.
An operator has a label, a string which may be included in diagnostics such as exception messages. When an operator is created, its label is initialized in an implementation-defined manner and may include the passed label.
Consider adding a mechanism for reporting errors during dispatch(). [Issue #778]
At inference time, every MLOperand will be bound to a tensor (the actual data). Tensors are essentially multidimensional arrays. The representation of the tensors is implementation dependent, but it typically includes the array data stored in some buffer (memory) and some metadata describing the array data (such as its shape).
Operations within the computational graph have functional semantics. This allows the implementation to potentially share the array data between multiple tensors. For example, the implementation of operations such as reshape, or slice may return a view of its input tensor that shares the same buffer as the input tensor. (In the case of reshape, the entire data is shared, while in the case of slice, a part of the input data is shared.) The implementation may use views, as above, for intermediate values.
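The buffer sharing described above can be sketched with typed arrays. This is only an illustration of the idea, not how any implementation represents tensors: the tensor object and all function names are hypothetical.

```javascript
// Hypothetical tensor representation: a typed-array view plus shape metadata.
function makeTensor(data, shape) {
  return { data, shape };
}

// reshape: same elements, new shape. The entire buffer is shared.
function reshapeView(tensor, newShape) {
  const count = (shape) => shape.reduce((n, d) => n * d, 1);
  if (count(newShape) !== count(tensor.shape)) {
    throw new RangeError('reshape must preserve the number of elements');
  }
  return makeTensor(tensor.data, newShape);
}

// slice along the first (outermost) dimension: part of the buffer is shared.
function sliceRowsView(tensor, start, size) {
  const [rows, ...rest] = tensor.shape;
  if (start < 0 || size < 0 || start + size > rows) {
    throw new RangeError('slice is out of bounds');
  }
  const rowLength = rest.reduce((n, d) => n * d, 1);
  // subarray() creates a new view over the same ArrayBuffer without copying.
  const data = tensor.data.subarray(
    start * rowLength,
    (start + size) * rowLength
  );
  return makeTensor(data, [size, ...rest]);
}
```

Sharing is safe here only because operations have functional semantics: no operation mutates its input, so a view can never observe a change made through another tensor.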
Before the execution, the computation graph that is used to compute one or more specified outputs needs to be converted, compiled, and optimized. The key purpose of the compilation step is to enable optimizations that span two or more operations, such as operation or loop fusion. The user agent may also perform these optimizations during graph conversion.
The MLGraphBuilder.build() method compiles the graph in the background without blocking the calling thread, and returns a Promise that resolves to an MLGraph. Each MLGraphBuilder can build at most one MLGraph.
The MLGraph underlying implementation will be composed of platform-specific representations of operators and operands which correspond to the MLGraphBuilder’s operators and MLOperands, but which are not script-visible and may be compositions or decompositions of the graph as constructed by script.
Once the MLGraph is constructed, the MLContext.dispatch() method performs the execution of the graph asynchronously either on a parallel timeline in a separate worker thread for the CPU execution or on a GPU timeline in a GPU command queue. This method returns immediately without blocking the calling thread while the actual execution is offloaded to a different timeline. The caller supplies the input values using MLNamedTensors, binding the input MLOperands to their values. The caller also supplies MLNamedTensors for output MLOperands which will contain the result of graph execution, if successful, which may be read back to script using the MLContext.readTensor(tensor) method. This type of execution supports CPU, GPU, and NPU devices.
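The build, dispatch and read-back flow described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, and the input() and add() builder methods, the record passed to build(), and the readable and writable descriptor members are assumed from the wider specification rather than defined in this section.

```javascript
// Sketch only: `context` is assumed to be an MLContext and `GraphBuilder` the
// MLGraphBuilder constructor. input(), add() and the `readable`/`writable`
// descriptor members are assumptions not defined in this section.
async function addVectors(context, GraphBuilder, a, b) {
  const descriptor = { dataType: 'float32', shape: [a.length] };

  // 1. Describe the graph: two inputs feeding one add operator.
  const builder = new GraphBuilder(context);
  const sum = builder.add(
    builder.input('a', descriptor),
    builder.input('b', descriptor)
  );

  // 2. Compile. Each builder can build at most one MLGraph.
  const graph = await builder.build({ sum });

  // 3. Create tensors and bind the input values.
  const [tensorA, tensorB, tensorSum] = await Promise.all([
    context.createTensor({ ...descriptor, writable: true }),
    context.createTensor({ ...descriptor, writable: true }),
    context.createTensor({ ...descriptor, readable: true }),
  ]);
  context.writeTensor(tensorA, a);
  context.writeTensor(tensorB, b);

  // 4. Execute. dispatch() returns immediately; the work runs elsewhere.
  context.dispatch(graph, { a: tensorA, b: tensorB }, { sum: tensorSum });

  // 5. Read the result back to script as a Float32Array.
  return new Float32Array(await context.readTensor(tensorSum));
}
```

In a user agent that exposes the API, a caller might pass the result of navigator.ml.createContext() and the global MLGraphBuilder, with two Float32Array inputs of equal length. Passing the constructor as a parameter keeps the sketch usable with a test double.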
6.2. Device Selection
An MLContext interface represents a global state of neural network execution. One of the important context states is the underlying execution device that manages the resources and facilitates the compilation and the eventual execution of the neural network graph. In addition to the default method of creation with MLContextOptions, an MLContext could also be created from a specific GPUDevice that is already in use by the application.
In a situation when a GPU context executes a graph with a constant or an input in the system memory as an ArrayBufferView, the input content is automatically uploaded from the system memory to the GPU memory, and downloaded back to the system memory of an ArrayBufferView output buffer at the end of the graph execution. These data upload and download cycles only occur when the execution device requires the data to be copied out of and back into the system memory, such as in the case of the GPU. They do not occur when the device is a CPU device. Additionally, the result of the graph execution is in a known layout format. While the execution may be optimized for a native memory access pattern in an intermediate result within the graph, the output of the last operation of the graph must convert the content back to a known layout format at the end of the graph in order to maintain the expected behavior from the caller’s perspective.
When an MLContext is created with MLContextOptions, the user agent selects and creates the underlying execution device by taking into account these options, currently only the MLPowerPreference option.
Depending on the underlying platform, the user agent may select different combinations of CPU, NPU and GPU devices.
For a history and rationale of this design, please see the device selection explainer.
6.3. Operators
This section is non-normative.
The WebNN API defines a set of operators required by well-known CNN, RNN, transformer, and generative models that address key § 2.1 Application Use Cases. The details of each operator are defined in the normative sections of this specification, in alphabetical order by the operator name. These operators are grouped into categories based on their functionality in the following non-normative table to give a functional overview of the API surface.
Note: Some operators belong to multiple categories. For example, clamp() is both a math function and also used as an activation.
6.4. Task Source
The ML task source is a task source to be used for all tasks related to asynchronous compilation and execution of MLGraphs and creation of MLContexts.
To queue an ML task given a global object global and a series of steps steps, queue a global task on the ML task source with global and steps.
6.5. Permissions Policy Integration
This specification defines a policy-controlled feature identified by the string "webnn". Its default allowlist is 'self'.
7. API
7.1. The navigator.ml interface
An ML object is available in the Window and WorkerGlobalScope contexts through the Navigator and WorkerNavigator interfaces respectively and is exposed via navigator.ml.
interface mixin NavigatorML {
  [SecureContext, SameObject] readonly attribute ML ml;
};
Navigator includes NavigatorML;
WorkerNavigator includes NavigatorML;
7.2. ML interface
enum MLPowerPreference {
  "default",
  "high-performance",
  "low-power"
};

dictionary MLContextOptions {
  MLPowerPreference powerPreference = "default";
};

[SecureContext, Exposed=(Window, Worker)]
interface ML {
  Promise<MLContext> createContext(optional MLContextOptions options = {});
  Promise<MLContext> createContext(GPUDevice gpuDevice);
};
7.2.1. MLContextOptions
MLContextOptions is under active development, and the design is expected to change, informed by further implementation experience and new use cases from the wider web community. The Working Group is considering additional API controls to allow the definition of a fallback device, multiple devices in a preferred order, or an exclusion of a specific device. Other considerations under discussion include error handling, ultimate fallback, and quantized operators. Feedback is welcome on any of these design considerations from web developers, library authors, OS and hardware vendors, and other stakeholders via GitHub: [Issue #623]
The powerPreference option is an MLPowerPreference and indicates the application’s preference as related to power consumption. It is one of the following:
- "default" - Let the user agent select the most suitable behavior.
- "high-performance" - Prioritizes execution speed over power consumption.
- "low-power" - Prioritizes power consumption over other considerations such as execution speed.
7.2.2. createContext()
Arguments:
- options: an MLContextOptions. Provides the application’s preferences for the context.
- gpuDevice: a GPUDevice. A specific device to use with the context.
Returns: an MLContext.
To create a context given realm realm and options (a GPUDevice or MLContextOptions), run these steps:
- Let context be a new MLContext in realm.
- If options is a GPUDevice object, then:
  - Set context.[[contextType]] to "webgpu".
  - Set context.[[powerPreference]] to "default".
- Otherwise:
  - Set context.[[contextType]] to "default".
  - Set context.[[lost]] to a new promise in realm.
  - If options["powerPreference"] exists, then set context.[[powerPreference]] to options["powerPreference"].
  - Otherwise, set context.[[powerPreference]] to "default".
- If the user agent cannot support context.[[contextType]], then return failure.
- Return context.
The createContext(options) steps are:
- Let global be this’s relevant global object.
- Let realm be this’s relevant realm.
- If global’s associated Document is not allowed to use the webnn feature, then return a new promise in realm rejected with a "SecurityError" DOMException.
- Let promise be a new promise in realm.
- Run the following steps in parallel.
  - Let context be the result of creating a context given realm and options. If that returns failure, then queue an ML task with global to reject promise with a "NotSupportedError" DOMException and abort these steps.
  - Queue an ML task with global to resolve promise with context.
- Return promise.
The
createContext(
gpuDevice
)
method
steps
are:
-
Let global be this ’s relevant global object .
-
Let realm be this ’s relevant realm .
-
If global ’s associated Document is not allowed to use the webnn feature, then return a new promise in realm rejected with a "
SecurityError
"DOMException
. -
Let promise be a new promise in realm .
-
Run the following steps in parallel .
-
Let context be the result of creating a context given realm and gpuDevice . If that returns failure, then queue an ML task with global to reject promise with a "
NotSupportedError
"DOMException
and abort these steps. -
Queue an ML task with global to resolve promise with context .
-
-
Return promise .
7.3. MLContext interface
The MLContext interface represents a global state of neural network compute workload and execution processes. Each MLContext object has an associated context type and MLPowerPreference.
typedef record<USVString, MLTensor> MLNamedTensors;

dictionary MLContextLostInfo {
  DOMString message;
};

[SecureContext, Exposed=(Window, Worker)]
interface MLContext {
  undefined dispatch(MLGraph graph, MLNamedTensors inputs, MLNamedTensors outputs);

  Promise<MLTensor> createTensor(MLTensorDescriptor descriptor);
  Promise<MLTensor> createConstantTensor(MLOperandDescriptor descriptor, AllowSharedBufferSource inputData);

  Promise<ArrayBuffer> readTensor(MLTensor tensor);
  Promise<undefined> readTensor(MLTensor tensor, AllowSharedBufferSource outputData);

  undefined writeTensor(MLTensor tensor, AllowSharedBufferSource inputData);

  MLOpSupportLimits opSupportLimits();

  undefined destroy();

  readonly attribute Promise<MLContextLostInfo> lost;
};
MLContext has the following internal slots:
- [[contextType]] of type context type
  The MLContext’s context type.
- [[powerPreference]] of type MLPowerPreference
  The MLContext’s MLPowerPreference.
- [[lost]] of type Promise<MLContextLostInfo>
  A Promise that is resolved when the MLContext’s underlying execution device is no longer available.
- [[timeline]]
  A timeline associated with the execution of operations on the compute units of the MLContext. These operations include inferencing on computational graphs and modifying the [[data]] of MLTensors.
  More rigorously define this timeline. [Issue #529]
The context type is the type of the execution context that manages the resources and facilitates the compilation and execution of the neural network graph:
- "default"
  Context created per user preference options.
- "webgpu"
  Context created from WebGPU device.
To validate buffer with descriptor given AllowSharedBufferSource bufferSource and MLOperandDescriptor descriptor, run the following steps:
- If bufferSource’s byte length is not equal to descriptor’s byte length, then return false.
- Switch on the type of bufferSource:
  - ArrayBuffer
    Return true.
  - SharedArrayBuffer
    Return true.
  - ArrayBufferView
    - If bufferSource is a Uint8Array object, then return true.
    - If bufferSource matches descriptor’s dataType according to this table, then return true.
    - Return false.
Note: Using Uint8Array regardless of the descriptor’s dataType is supported as a generic way of representing a slice of an ArrayBuffer, for example part of a WebAssembly.Memory instance. Developers are encouraged to use more specific view types when authoring WebNN code for readability and maintainability.
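The validate buffer with descriptor steps can be approximated in script as follows. This is a minimal sketch under stated assumptions: the ELEMENT_SIZE and VIEW_NAME tables are an assumed reading of the data type table referenced above (which is not reproduced in this section), and validateBufferWithDescriptor is a hypothetical helper name, not part of the API.

```javascript
// Assumed element sizes (in bytes) and matching view types per data type.
const ELEMENT_SIZE = {float32: 4, float16: 2, int32: 4, uint32: 4,
                      int64: 8, uint64: 8, int8: 1, uint8: 1};
const VIEW_NAME = {float32: 'Float32Array', float16: 'Float16Array',
                   int32: 'Int32Array', uint32: 'Uint32Array',
                   int64: 'BigInt64Array', uint64: 'BigUint64Array',
                   int8: 'Int8Array', uint8: 'Uint8Array'};

function validateBufferWithDescriptor(bufferSource, descriptor) {
  // The byte length is the product of the dimensions times the element size.
  const expected = descriptor.shape.reduce((n, d) => n * d, 1) *
      ELEMENT_SIZE[descriptor.dataType];
  if (bufferSource.byteLength !== expected) return false;
  if (bufferSource instanceof ArrayBuffer) return true;
  if (typeof SharedArrayBuffer !== 'undefined' &&
      bufferSource instanceof SharedArrayBuffer) return true;
  if (ArrayBuffer.isView(bufferSource)) {
    // Uint8Array is accepted for any data type as a generic byte view.
    if (bufferSource instanceof Uint8Array) return true;
    return bufferSource.constructor.name === VIEW_NAME[descriptor.dataType];
  }
  return false;
}
```

The view type is compared by constructor name so that the sketch also runs in engines that do not yet provide Float16Array.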
To validate tensors with descriptors given an MLNamedTensors namedTensors with record<USVString, MLOperandDescriptor> namedDescriptors:
- If namedTensors’s size is not equal to namedDescriptors’s size, then return false.
- For each name → tensor of namedTensors:
  - If tensor.[[isConstant]] is true, then return false.
  - If namedDescriptors[name] does not exist, then return false.
  - If tensor.[[descriptor]] is not equal to namedDescriptors[name], then return false.
- Return true.
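As an illustration, the steps above can be sketched over plain objects. The tensor objects here are hypothetical stand-ins whose isConstant and descriptor properties mirror the [[isConstant]] and [[descriptor]] internal slots; real MLTensor objects do not expose their slots this way.

```javascript
// Two descriptors are equal when their data types and shapes are equal.
function descriptorsEqual(a, b) {
  return a.dataType === b.dataType &&
      a.shape.length === b.shape.length &&
      a.shape.every((dim, i) => dim === b.shape[i]);
}

function validateTensorsWithDescriptors(namedTensors, namedDescriptors) {
  const names = Object.keys(namedTensors);
  if (names.length !== Object.keys(namedDescriptors).length) return false;
  for (const name of names) {
    const tensor = namedTensors[name];
    if (tensor.isConstant) return false;
    if (!Object.prototype.hasOwnProperty.call(namedDescriptors, name)) return false;
    if (!descriptorsEqual(tensor.descriptor, namedDescriptors[name])) return false;
  }
  return true;
}
```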
7.3.1. dispatch()
Schedules the computational workload of a compiled MLGraph on the MLContext’s [[timeline]].
- graph: an MLGraph. The computational graph to be executed.
- inputs: an MLNamedTensors. The inputs to the computational graph.
- outputs: an MLNamedTensors. The outputs of the computational graph.
Returns: undefined.
Note: dispatch() itself provides no signal that graph execution has completed. Rather, callers can await the results of reading back the output tensors. See § 7.3.1.1 Examples below.
The dispatch(graph, inputs, outputs) method steps are:
- If graph.[[context]] is not this, then throw a TypeError.
- If graph.[[isDestroyed]] is true, then throw an "InvalidStateError" DOMException.
- Let allTensors be a list of MLTensors consisting of inputs’s values extended by outputs’s values.
- If allTensors contains any duplicate items, then throw a TypeError.
- For each tensor of allTensors:
  - If tensor.[[context]] is not this, then throw a TypeError.
  - If tensor.[[isDestroyed]] is true, then throw a TypeError.
- If validating tensors with descriptors given inputs and graph.[[inputDescriptors]] returns false, then throw a TypeError.
- If validating tensors with descriptors given outputs and graph.[[outputDescriptors]] returns false, then throw a TypeError.
- Enqueue the following steps to graph.[[context]].[[timeline]]:
  - Run these steps, but abort when this is lost:
    - Issue a compute request to graph.[[implementation]] given inputs and outputs.
      Add a mechanism for reporting errors during graph execution. [Issue #778]
When a constant operand is created using a tensor, it is legal for that tensor to be destroyed after build completes. Implementations are expected to ensure that the compiled graph remains valid and unaffected by such destruction.
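The synchronous tensor checks performed by dispatch() before any work is enqueued can be sketched as below. This is an illustration only: checkDispatchTensors is a hypothetical helper, and the context and isDestroyed properties stand in for the [[context]] and [[isDestroyed]] internal slots.

```javascript
function checkDispatchTensors(context, inputs, outputs) {
  // inputs' values extended by outputs' values.
  const allTensors = [...Object.values(inputs), ...Object.values(outputs)];
  // A Set drops duplicates, so a size mismatch means some tensor was reused.
  if (new Set(allTensors).size !== allTensors.length) {
    throw new TypeError('A tensor appears more than once across inputs and outputs.');
  }
  for (const tensor of allTensors) {
    if (tensor.context !== context) {
      throw new TypeError('Tensor belongs to a different context.');
    }
    if (tensor.isDestroyed) {
      throw new TypeError('Tensor has been destroyed.');
    }
  }
}
```

One consequence of the duplicate check is that the same tensor cannot be passed as both an input and an output of a single dispatch() call.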
7.3.1.1. Examples
The following code showcases executing an MLGraph using MLTensors.
const descriptor = {dataType: 'float32', shape: [2, 2]};
const context = await navigator.ml.createContext();
const builder = new MLGraphBuilder(context);

// 1. Create a computational graph 'C = 0.2 * A + B'.
const constant = builder.constant(descriptor, new Float32Array(4).fill(0.2));
const A = builder.input('A', descriptor);
const B = builder.input('B', descriptor);
const C = builder.add(builder.mul(A, constant), B);

// 2. Compile the graph.
const graph = await builder.build({'C': C});

// 3. Create reusable input and output tensors.
const [inputTensorA, inputTensorB, outputTensorC] = await Promise.all([
  context.createTensor({dataType: A.dataType, shape: A.shape, writable: true}),
  context.createTensor({dataType: B.dataType, shape: B.shape, writable: true}),
  context.createTensor({dataType: C.dataType, shape: C.shape, readable: true})
]);

// 4. Initialize the inputs.
context.writeTensor(inputTensorA, new Float32Array(4).fill(1.0));
context.writeTensor(inputTensorB, new Float32Array(4).fill(0.8));

// 5. Execute the graph.
const inputs = {'A': inputTensorA, 'B': inputTensorB};
const outputs = {'C': outputTensorC};
context.dispatch(graph, inputs, outputs);

// 6. Read back the computed result.
const result = await context.readTensor(outputTensorC);
console.log('Output value:', new Float32Array(result));  // [1, 1, 1, 1]
7.3.2. createTensor()
Creates an MLTensor associated with this MLContext.
- descriptor: an MLTensorDescriptor.
The createTensor(descriptor) method steps are:
- Let global be this’s relevant global object.
- Let realm be this’s relevant realm.
- If this is lost, then return a new promise in realm rejected with an "InvalidStateError" DOMException.
- Let tensor be the result of creating an MLTensor given this, and descriptor.
- Let promise be a new promise in realm.
- Enqueue the following steps to this.[[timeline]]:
  - Run these steps, but abort when this is lost:
    - Create tensor.[[data]] given descriptor and initialize all bytes to zeros.
    - If that fails, then queue an ML task with global to reject promise with an "UnknownError" DOMException, and abort these steps.
    - Otherwise, queue an ML task with global to resolve promise with tensor.
  - If aborted, then queue an ML task with global to reject promise with an "InvalidStateError" DOMException.
- Return promise.
7.3.3. createConstantTensor()
Creates a constant MLTensor associated with this MLContext.
- descriptor: an MLOperandDescriptor.
- inputData: an AllowSharedBufferSource. The buffer whose bytes will be written into the tensor.
The createConstantTensor(descriptor, inputData) method steps are:
- Let global be this’s relevant global object.
- Let realm be this’s relevant realm.
- If this is lost, then return a new promise in realm rejected with an "InvalidStateError" DOMException.
- If checking dimensions given descriptor returns false, then return a new promise in realm rejected with a TypeError.
- If validating buffer with descriptor given inputData and descriptor returns false, then return a new promise in realm rejected with a TypeError.
- Let bytes be the result of getting a copy of the bytes held by the buffer source given inputData.
- Assert: bytes’s length is equal to descriptor’s byte length.
- Let tensor be the result of creating a constant MLTensor given this, and descriptor.
- Let promise be a new promise in realm.
- Enqueue the following steps to this.[[timeline]]:
  - Run these steps, but abort when this is lost:
    - Create tensor.[[data]] given descriptor.
    - If that fails, then queue an ML task with global to reject promise with an "UnknownError" DOMException, and abort these steps.
    - Copy bytes to tensor.[[data]].
    - If that fails, then queue an ML task with global to reject promise with an "UnknownError" DOMException, and abort these steps.
    - Otherwise, queue an ML task with global to resolve promise with tensor.
  - If aborted, then queue an ML task with global to reject promise with an "InvalidStateError" DOMException.
- Return promise.
7.3.4. readTensor(tensor)
Reads back the [[data]] of an MLTensor from the MLContext.[[timeline]] to script.
- tensor: an MLTensor. The tensor to be read.
Returns: Promise<ArrayBuffer>. A buffer containing the result of the read.
The readTensor(tensor) method steps are:
- Let global be this’s relevant global object.
- Let realm be this’s relevant realm.
- If tensor.[[context]] is not this, then return a new promise in realm rejected with a TypeError.
- If tensor.[[isDestroyed]] is true, then return a new promise in realm rejected with a TypeError.
- If tensor.[[descriptor]].readable is false, then return a new promise in realm rejected with a TypeError.
- Let promise be a new promise in realm.
- Append promise to tensor.[[pendingPromises]].
- Enqueue the following steps to tensor.[[context]].[[timeline]]:
  - Run these steps, but abort when this is lost:
    - Let bytes be a byte sequence containing a copy of tensor.[[data]].
    - If that fails, then queue an ML task with global and the following steps:
      - Remove promise from tensor.[[pendingPromises]].
      - Reject promise with an "UnknownError" DOMException, and abort these steps.
    - Otherwise, queue an ML task with global and the following steps:
      - Remove promise from tensor.[[pendingPromises]].
      - Let buffer be the result of creating an ArrayBuffer from bytes in realm.
      - Resolve promise with buffer.
  - If aborted, then queue an ML task with global to reject promise with an "InvalidStateError" DOMException.
- Return promise.
7.3.5. readTensor(tensor, outputData)
Bring-your-own-buffer variant of readTensor(tensor). Reads back the [[data]] of an MLTensor into the provided buffer.
- tensor: an MLTensor. The tensor to be read.
- outputData: an AllowSharedBufferSource. The buffer to read the result into.
The readTensor(tensor, outputData) method steps are:
- Let global be this’s relevant global object.
- Let realm be this’s relevant realm.
- If tensor.[[context]] is not this, then return a new promise in realm rejected with a TypeError.
- If tensor.[[isDestroyed]] is true, then return a new promise in realm rejected with a TypeError.
- If tensor.[[descriptor]].readable is false, then return a new promise in realm rejected with a TypeError.
- If validating buffer with descriptor given outputData and tensor.[[descriptor]] returns false, then return a new promise in realm rejected with a TypeError.
- Let promise be a new promise in realm.
- Append promise to tensor.[[pendingPromises]].
- Enqueue the following steps to tensor.[[context]].[[timeline]]:
  - Run these steps, but abort when this is lost:
    - Let bytes be a byte sequence containing a copy of tensor.[[data]].
    - If that fails, then queue an ML task with global to run these steps:
      - Remove promise from tensor.[[pendingPromises]].
      - Reject promise with an "UnknownError" DOMException, and abort these steps.
    - Otherwise, queue an ML task with global to run these steps:
      - Remove promise from tensor.[[pendingPromises]].
      - If outputData is detached, then reject promise with a TypeError, and abort these steps.
        Note: Validating buffer with descriptor above will fail if outputData is detached, but it is possible that outputData could be detached between that step and this one.
      - Write bytes to outputData.
  - If aborted, then queue an ML task with global to reject promise with an "InvalidStateError" DOMException.
- Return promise.
7.3.6. writeTensor()
Writes data to the [[data]] of an MLTensor on the MLContext’s [[timeline]].
- tensor: an MLTensor. The tensor to be written to.
- inputData: an AllowSharedBufferSource. The buffer whose bytes will be written into the tensor.
Returns: undefined.
The writeTensor(tensor, inputData) method steps are:
- If tensor.[[context]] is not this, then throw a TypeError.
- If tensor.[[isDestroyed]] is true, then throw a TypeError.
- If tensor.[[descriptor]].writable is false, then throw a TypeError.
- If validating buffer with descriptor given inputData and tensor.[[descriptor]] returns false, then throw a TypeError.
- Let bytes be the result of getting a copy of the bytes held by the buffer source given inputData.
- Assert: bytes’s length is equal to tensor.[[descriptor]]’s byte length.
- Enqueue the following steps to tensor.[[context]].[[timeline]]:
  - Run these steps, but abort when this is lost:
    - Copy bytes to tensor.[[data]].
      Add a mechanism for reporting errors while writing to a tensor. [Issue #778]
Note: Similar to dispatch(), writeTensor() itself provides no signal that the write has completed. To inspect the contents of a tensor, callers can await the results of reading back the tensor.
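The split between the synchronous part of writeTensor() (copying the caller's bytes) and the deferred part (copying into [[data]] on the timeline) can be sketched with a toy queue. TimelineSketch and writeTensorSketch are hypothetical names, the tensor is a plain object whose data property stands in for [[data]], and inputData is assumed to be a typed array view.

```javascript
// A toy timeline: steps run later, in the order they were enqueued.
class TimelineSketch {
  constructor() { this.steps = []; }
  enqueue(step) { this.steps.push(step); }
  drain() { while (this.steps.length > 0) this.steps.shift()(); }
}

function writeTensorSketch(timeline, tensor, inputData) {
  // Copy the bytes at call time, so later changes to inputData are not observed.
  const bytes = new Uint8Array(
      inputData.buffer, inputData.byteOffset, inputData.byteLength).slice();
  // The copy into the tensor's data happens later, on the timeline.
  timeline.enqueue(() => tensor.data.set(bytes));
}
```

Because the bytes are captured when the call is made, a caller can reuse or overwrite its source buffer immediately after writeTensor() returns.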
7.3.7. opSupportLimits()
The opSupportLimits() method exposes the level of support that differs across implementations at the operator level. Consumers of the WebNN API are encouraged to probe the feature support level by using opSupportLimits() to determine the optimal model architecture to be deployed for each target platform.
7.3.7.1. MLOpSupportLimits dictionary
The MLOpSupportLimits dictionary has the following top-level members. In addition to these, each operator has a corresponding member defined in its builder method.
dictionary MLOpSupportLimits {
  MLInputOperandLayout preferredInputLayout;
  [EnforceRange] unsigned long long maxTensorByteLength;
  MLDataTypeLimits input;
  MLDataTypeLimits constant;
  MLDataTypeLimits output;
};
- preferredInputLayout, of type MLInputOperandLayout
  Preferred input layout for layout dependent operators like conv2d().
- maxTensorByteLength, of type unsigned long long
  The maximum supported length of tensors, in bytes.
- input, of type MLDataTypeLimits
- constant, of type MLDataTypeLimits
- output, of type MLDataTypeLimits
7.3.7.2. MLDataTypeLimits dictionary
typedef sequence<MLOperandDataType> MLDataTypeList;

dictionary MLDataTypeLimits {
  MLDataTypeList dataTypes;
};
- dataTypes, of type MLDataTypeList
  Supported data types.
7.3.7.3. MLRankRange dictionary
dictionary MLRankRange {
  unsigned long min;
  unsigned long max;
};
- min, of type unsigned long
  Minimum supported rank.
- max, of type unsigned long
  Maximum supported rank.
7.3.7.4. MLTensorLimits dictionary
dictionary MLTensorLimits {
  MLDataTypeList dataTypes;
  MLRankRange rankRange;
};
- dataTypes, of type MLDataTypeList
  Supported data types.
- rankRange, of type MLRankRange
  Minimum and maximum supported ranks.
7.3.7.5. MLBinarySupportLimits dictionary
dictionary MLBinarySupportLimits {
  MLTensorLimits a;
  MLTensorLimits b;
  MLDataTypeLimits output;
};
- a, of type MLTensorLimits
  MLTensorLimits for a operand.
- b, of type MLTensorLimits
  MLTensorLimits for b operand.
- output, of type MLDataTypeLimits
  MLDataTypeLimits for output operand.
7.3.7.6. MLSingleInputSupportLimits dictionary
dictionary MLSingleInputSupportLimits {
  MLTensorLimits input;
  MLDataTypeLimits output;
};
- input, of type MLTensorLimits
  MLTensorLimits for input operand.
- output, of type MLDataTypeLimits
  MLDataTypeLimits for output operand.
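A consumer might check candidate operands against these dictionaries before building a graph. The sketch below works on a plain object shaped like MLBinarySupportLimits; the values in exampleBinaryLimits are made up for illustration, and tensorWithinLimits and binaryOperandsSupported are hypothetical helper names rather than API methods.

```javascript
// Hypothetical limits shaped like MLBinarySupportLimits; the values are invented.
const exampleBinaryLimits = {
  a: {dataTypes: ['float32', 'float16'], rankRange: {min: 0, max: 4}},
  b: {dataTypes: ['float32', 'float16'], rankRange: {min: 0, max: 4}},
  output: {dataTypes: ['float32', 'float16']},
};

// Checks one operand description against an MLTensorLimits-shaped object.
function tensorWithinLimits(tensorLimits, operand) {
  const rank = operand.shape.length;
  return tensorLimits.dataTypes.includes(operand.dataType) &&
      rank >= tensorLimits.rankRange.min &&
      rank <= tensorLimits.rankRange.max;
}

function binaryOperandsSupported(limits, a, b) {
  return tensorWithinLimits(limits.a, a) && tensorWithinLimits(limits.b, b);
}
```

With a real context, the limits object would come from opSupportLimits() rather than a literal, and a caller could fall back to another data type or model variant when the check fails.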
7.3.8. destroy()
The destroy() method can be called to release all resources associated with the context. Any outstanding compute requests and MLTensor creation/read/write requests will fail.
The destroy() method steps are:
- Run the steps to lose this with an implementation-defined message.
  Note: A message indicating that destroy() was called can help developers distinguish the cause of the context loss.
7.3.9. Errors
When a user agent determines that an MLContext is no longer available to fulfill requests, it must run the context lost steps for it.
The context lost steps for MLContext context are:
- Let global be context’s relevant global object.
- Queue an ML task with global to run these steps:
  - Lose context, with an implementation-defined message.
To lose MLContext context with DOMString message:
- Let info be a new MLContextLostInfo.
- Set info.message to message.
- For each MLGraph graph where graph.[[context]] equals this:
- For each MLTensor tensor where tensor.[[context]] equals this:
MLContextLostInfo has the following members:
- message, of type DOMString
  An implementation-defined message providing information about the error that occurred.
An MLContext is lost if its [[lost]] Promise is settled.
7.4. MLGraph interface
The MLGraph interface represents a compiled computational graph. A compiled graph once constructed is immutable and cannot be subsequently changed.
[SecureContext, Exposed=(Window, Worker)]
interface MLGraph {
  undefined destroy();
};
MLGraph has the following internal slots:
- [[context]] of type MLContext
  The context of type MLContext associated with this MLGraph.
- [[inputDescriptors]] of type record<USVString, MLOperandDescriptor>
  Maps the name of an input MLOperand to its MLOperandDescriptor for all input MLOperands of this MLGraph.
- [[outputDescriptors]] of type record<USVString, MLOperandDescriptor>
  Maps the name of an output MLOperand to its MLOperandDescriptor for all output MLOperands of this MLGraph.
- [[implementation]]
  The underlying implementation provided by the User Agent.
- [[isDestroyed]] of type boolean
  Whether the MLGraph.destroy() method steps have been run. Once destroyed, the MLGraph can no longer be used.
7.4.1. destroy()
The destroy() method can be called to release all resources associated with the graph.
The destroy() method steps are:
- If this.[[isDestroyed]] is true, then abort these steps.
- Set this.[[isDestroyed]] to true.
- Queue a task on this.[[context]].[[timeline]] to mark resources owned by this graph as freeable.
Note: Since no further workloads can be enqueued using this graph, implementations can free any additional resource allocations associated with this graph once all previously submitted workloads using it are complete.
7.5. MLOperandDescriptor dictionary
An MLOperandDescriptor describes the shape (dimensions) and data type of an operand. Descriptors are used to describe the inputs and constants for an MLGraph, and every MLOperand has an internal MLOperandDescriptor.
enum MLInputOperandLayout {
  "nchw",
  "nhwc"
};

enum MLOperandDataType {
  "float32",
  "float16",
  "int32",
  "uint32",
  "int64",
  "uint64",
  "int8",
  "uint8"
};

dictionary MLOperandDescriptor {
  required MLOperandDataType dataType;
  required sequence<[EnforceRange] unsigned long> shape;
};
- dataType, of type MLOperandDataType
  The operand data type.
- shape, of type sequence<[EnforceRange] unsigned long>
  The list of dimensions of the operand. It is empty for scalar operands.
MLOperandDescriptor A is equal to an MLOperandDescriptor B if A.dataType equals B.dataType and A.shape equals B.shape.
To create an MLOperandDescriptor given MLOperandDataType dataType and list shape, run the following steps:
- Let descriptor be a new MLOperandDescriptor.
- Set descriptor.dataType to dataType.
- Return descriptor.
The byte length of an MLOperandDescriptor desc is the value returned by the following steps:
- Let elementLength be 1.
- For each dimension of desc.shape:
  - Set elementLength to elementLength * dimension.
- Let elementSize be the element size of one of the ArrayBufferView types that matches desc.dataType according to this table.
- Return elementLength * elementSize.
A valid dimension is an integer greater than zero and in the range of long. Implementations may impose a smaller upper bound.
Should 0-size dimensions be supported? [Issue #391]
To check dimensions given MLOperandDescriptor descriptor, run the following steps:
- If any item of descriptor.shape is not a valid dimension, then return false.
- If descriptor.shape’s size is too large to be supported by the implementation, then return false.
  The maximum number of operand dimensions is not defined, but native ML APIs usually have a maximum supported size. [Issue #456]
- If descriptor’s byte length is not supported by the implementation, then return false.
- Return true.
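The byte length and check dimensions algorithms can be sketched together as follows. The element sizes are an assumed reading of the data type table referenced above, and the limits parameter (maxRank and maxTensorByteLength) holds hypothetical implementation limits chosen only for illustration.

```javascript
// Assumed element sizes in bytes for each MLOperandDataType.
const ELEMENT_SIZE_BYTES = {float32: 4, float16: 2, int32: 4, uint32: 4,
                            int64: 8, uint64: 8, int8: 1, uint8: 1};
const MAX_LONG = 2 ** 31 - 1;  // upper end of the Web IDL long range

function byteLength(desc) {
  let elementLength = 1;
  for (const dimension of desc.shape) {
    elementLength *= dimension;
  }
  return elementLength * ELEMENT_SIZE_BYTES[desc.dataType];
}

function isValidDimension(dimension) {
  return Number.isInteger(dimension) && dimension > 0 && dimension <= MAX_LONG;
}

// `limits` models implementation-defined bounds; the defaults are invented.
function checkDimensions(descriptor, limits = {maxRank: 8, maxTensorByteLength: 2 ** 30}) {
  if (!descriptor.shape.every(isValidDimension)) return false;
  if (descriptor.shape.length > limits.maxRank) return false;
  if (byteLength(descriptor) > limits.maxTensorByteLength) return false;
  return true;
}
```

For a scalar operand the shape is empty, so the element count stays at 1 and the byte length equals the element size.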
7.6. MLOperand interface
An MLOperand represents an intermediary graph being constructed as a result of compositing parts of an operation into a fully composed operation.
For instance, an MLOperand can represent a constant feeding to an operation or the result from combining multiple constants together into an operation. See also § 6 Programming Model.
[SecureContext, Exposed=(Window, Worker)]
interface MLOperand {
  readonly attribute MLOperandDataType dataType;
  readonly attribute FrozenArray<unsigned long> shape;
};

dictionary MLOperatorOptions {
  USVString label = "";
};

typedef (bigint or unrestricted double) MLNumber;
MLOperand has the following internal slots:
- [[builder]] of type MLGraphBuilder
  The MLOperand’s associated builder object.
- [[descriptor]] of type MLOperandDescriptor
  The MLOperand’s descriptor.
- [[name]] of type string
  The MLOperand’s name (only for input operands).
- [[operator]] of type operator
- [[constantTensor]] of type MLTensor
  The MLOperand’s tensor (only for constant operands).
An MLOperand’s dataType is its [[descriptor]].dataType.
An MLOperand’s shape is its [[descriptor]].shape.
An MLOperand’s rank is its shape’s size.
The dataType getter steps are to return this’s dataType.
The shape getter steps are to return this’s shape.
Since the [[builder]] object is bound by the MLGraphBuilder() constructor to an MLContext object, an MLOperand is also always bound to the same MLContext object.
If an operation supports only a subset of MLOperandDataTypes, the allowed data types for each of the operation’s input operands, including both positional arguments and options, are given as either an explicit list of MLOperandDataTypes, or a constraint that the operand’s dataType must be the same as the dataType of another input operand, or any to allow any MLOperandDataType.
Implementations may support fewer data types for operands than specified. This can be queried for each operation using the opSupportLimits() method on MLContext and inspecting the dataTypes value of the corresponding member for the operation.
Should we specify the subset of data types that must be supported for each operator?
If an operation requires input operands with a particular rank, the allowed ranks for each of the operation’s input operands, including both positional arguments and options, are given as an explicit rank (e.g. 1), or N to allow any dimensionality, or the same as another operand. More specific constraints are common, such as when an input operand’s shape must be unidirectionally broadcastable to or bidirectionally broadcastable with another input operand; in these cases, the allowed ranks are listed as a range, with specific validation given as steps in the operation.
Implementations may impose a more restricted lower bound and/or upper bound on the rank of operands than specified. This can be queried for each operation using the opSupportLimits() method on MLContext and inspecting the rankRange.min and rankRange.max values of the corresponding member for the operation.
MLOperatorOptions has the following members:
- label, of type USVString, defaulting to ""
  Optionally provided when an operator is created using MLGraphBuilder methods that create MLOperands. The implementation may use this value to initialize the operator’s label.
Note: The label is not intended to be a natural language string. It is a language-independent identifier, analogous to a variable name or error code, like "mul#1234".
Note: Implementations are encouraged to use the label provided by developers to enhance error messages and improve debuggability, including both synchronous errors during graph construction and for errors that occur during the asynchronous build() method. When displaying labels provided by developers via label in debugging tools, logs, or error messages, implementations should sanitize the output to prevent security risks, such as injection of malicious Unicode sequences (e.g. Bidirectional Text Spoofing [UTR36], Source Code Spoofing [UTS55] and other concerns). For example, implementations should escape or filter control characters (e.g., U+202A to U+202E, U+2066 to U+2069) or use a safe rendering mechanism to neutralize potential spoofing.
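One possible way to escape the bidirectional control characters named in the note is sketched below. The helper name sanitizeLabel and the choice of a \uXXXX escape format are assumptions made for illustration; the note does not mandate a particular escaping scheme, and a production implementation would likely cover more characters.

```javascript
// Replaces U+202A..U+202E and U+2066..U+2069 with a visible \uXXXX escape.
function sanitizeLabel(label) {
  return label.replace(/[\u202A-\u202E\u2066-\u2069]/g, (ch) =>
      '\\u' + ch.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0'));
}
```

Escaping rather than dropping the characters keeps the developer-provided label recognizable in logs while preventing it from reordering the surrounding text.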
7.6.1. Creating an MLOperand
The MLOperand objects are created by the methods of MLGraphBuilder, internally using the following algorithms.
To create an MLOperand given MLGraphBuilder builder and MLOperandDescriptor desc, run the following steps:
- Let realm be builder’s relevant realm.
- Let operand be a new MLOperand in realm.
- Set operand.[[builder]] to builder.
- Set operand.[[descriptor]] to desc.
- Return operand.
To copy an MLOperand given MLOperand operand, run the following steps:
- Let builder be operand.[[builder]].
- Let realm be builder’s relevant realm.
- Let result be a new MLOperand in realm.
- Set result.[[builder]] to builder.
- Set result.[[descriptor]] to operand.[[descriptor]].
- If operand.[[name]] exists, then set result.[[name]] to operand.[[name]].
- Return result.
To validate operand given MLGraphBuilder builder and MLOperand operand, return true if operand.[[builder]] is builder, and false otherwise.
7.6.1.1. MLNumber
MLNumber is used when specifying the type of a numeric option for an MLOperand which can be of any MLOperandDataType, including both 64-bit integer types ("uint64" and "int64") and 32-bit floating point ("float32"). Implementations process the value according to the corresponding MLOperandDataType. For example, if clamp(input, options) is called with an MLOperand with dataType "uint32", the MLNumber parameters are explicitly cast to unsigned long.
double would lose accuracy when passing values over 2^53, and specifying long long would disallow values over 2^63.
Support for unions of bigint and numeric types is new in [WEBIDL], and implementation support is also limited. Prototype implementations are encouraged to provide feedback for this approach. [whatwg/webidl Issue #1388]
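The precision limits that motivate the bigint branch of MLNumber can be observed directly in script. The snippet below uses only standard language features; the variable names are illustrative.

```javascript
// An integer just above 2^53 cannot survive a round trip through a double.
const big = 2n ** 53n + 1n;
const roundTripped = BigInt(Number(big));

// A bigint keeps the full "uint64" range intact.
const maxUint64 = new BigUint64Array([2n ** 64n - 1n])[0];

// 2^63 does not fit a signed 64-bit integer and wraps to the minimum value.
const wrapped = BigInt.asIntN(64, 2n ** 63n);
```

Here roundTripped differs from big, while maxUint64 holds the largest "uint64" value exactly, which is why a plain double or long long alone would not cover every MLOperandDataType.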
7.7. MLTensorDescriptor dictionary
An MLTensorDescriptor describes the characteristics and capabilities of an MLTensor.
dictionary MLTensorDescriptor : MLOperandDescriptor {
  boolean readable = false;
  boolean writable = false;
};
- readable, of type boolean, defaulting to false
  Whether the tensor’s contents can be read via readTensor(tensor) or readTensor(tensor, outputData).
- writable, of type boolean, defaulting to false
  Whether the tensor’s contents can be written to via writeTensor().
7.8. MLTensor interface
The MLTensor interface represents a tensor which may be used as an input or output to an MLGraph. The memory backing an MLTensor should be allocated in an implementation-defined fashion according to the requirements of the MLContext and the MLTensorDescriptor used to create it. Operations involving the [[data]] of an MLTensor occur on the [[timeline]] of its associated MLContext.
The implementation-defined requirements of how an MLTensor is allocated may include constraints such as that the memory is allocated with a particular byte alignment or in a particular memory pool.
[SecureContext, Exposed=(Window, Worker)]
interface MLTensor {
  readonly attribute MLOperandDataType dataType;
  readonly attribute FrozenArray<unsigned long> shape;
  readonly attribute boolean readable;
  readonly attribute boolean writable;
  readonly attribute boolean constant;

  undefined destroy();
};
MLTensor has the following internal slots:
- [[context]] of type MLContext
  The MLTensor’s associated context.
- [[descriptor]] of type MLTensorDescriptor
  The MLTensor’s descriptor.
- [[pendingPromises]] of type set of Promises
  Promises corresponding to MLContext.readTensor(tensor) method calls which are in-progress and have yet to resolve. All pending promises will be rejected when the MLTensor is destroyed.
- [[isDestroyed]] of type boolean
  Whether the MLTensor.destroy() steps have been run. Once destroyed, the MLTensor can no longer be used.
- [[data]] of an implementation-defined type
  The bytes backing the MLTensor. This data may only be accessed or modified from the [[context]].[[timeline]].
- [[isConstant]] of type boolean
  Whether the MLTensor was created by create a constant MLTensor.
An MLTensor’s dataType is its [[descriptor]]’s dataType.
An MLTensor’s shape is its [[descriptor]]’s shape.
The dataType getter steps are to return this’s dataType.
The shape getter steps are to return this’s shape.
The readable getter steps are to return this.[[descriptor]].readable.
The writable getter steps are to return this.[[descriptor]].writable.
The constant getter steps are to return this’s [[isConstant]].
7.8.1. Creating an MLTensor
An MLTensor is created by its associated MLContext.
To create an MLTensor given MLContext context and MLTensorDescriptor descriptor, run the following steps:
- Let realm be context’s relevant realm.
- Let tensor be a new MLTensor in realm.
- Set tensor.[[context]] to context.
- Set tensor.[[descriptor]] to descriptor.
- Set tensor.[[isDestroyed]] to false.
- Set tensor.[[isConstant]] to false.
- Return tensor.
7.8.2. destroy()
Releases the resources associated with the MLTensor. This method is idempotent.
Returns: undefined.
The destroy() method steps are:
- Set this.[[isDestroyed]] to true.
- For each promise in this.[[pendingPromises]]:
  - Remove promise from this.[[pendingPromises]].
  - Reject promise with an "InvalidStateError" DOMException.
- Enqueue the following steps to this.[[context]].[[timeline]]:
Note: Since no further operations can be enqueued using this tensor, implementations can free any additional resource allocations associated with this tensor once all previously submitted operations using it are complete.
7.8.3. Creating a constant MLTensor
A constant MLTensor is created by its associated MLContext.
To create a constant MLTensor given MLContext context and MLOperandDescriptor inputDescriptor, run the following steps:
- Let realm be context’s relevant realm.
- Let tensor be a new MLTensor in realm.
- Set tensor.[[context]] to context.
- Let tensorDescriptor be a new MLTensorDescriptor.
- Set tensorDescriptor.readable to false.
- Set tensorDescriptor.writable to false.
- Set tensorDescriptor.dataType to inputDescriptor.dataType.
- Set tensor.[[descriptor]] to tensorDescriptor.
- Set tensor.[[isDestroyed]] to false.
- Set tensor.[[isConstant]] to true.
- Return tensor.
7.9. MLGraphBuilder interface
The MLGraphBuilder interface defines a set of operations as identified by the § 2 Use cases that can be composed into a computational graph. It also represents the intermediate state of a graph building session.
typedef record<USVString, MLOperand> MLNamedOperands;
[SecureContext, Exposed=(Window, Worker)]
interface MLGraphBuilder {
  // Construct the graph builder from the context.
  constructor(MLContext context);
  // Create an operand for a graph input.
  MLOperand input(USVString name, MLOperandDescriptor descriptor);
  // Create an operand for a graph constant.
  MLOperand constant(MLOperandDescriptor descriptor, AllowSharedBufferSource buffer);
  // Create a scalar operand from the specified number of the specified type.
  MLOperand constant(MLOperandDataType type, MLNumber value);
  // Create an operand from a specified constant tensor.
  MLOperand constant(MLTensor tensor);
  // Compile the graph up to the specified output operands asynchronously.
  Promise<MLGraph> build(MLNamedOperands outputs);
};
The MLGraphBuilder.build() method compiles the graph builder state up to the specified output operands into a compiled graph according to the type of MLContext that creates it. When the [[contextType]] of the MLContext is set to "default", the compiled graph is initialized right before the MLGraph is returned. This graph initialization stage is important for optimal performance of the subsequent graph executions. It typically involves a process known as "weight preprocessing" where all the constant inputs to the graph are preprocessed and cached at the operating system level for subsequent graph execution calls. The initializing inputs are typically the constant weight data specified through the constant() method as constant operands during graph construction time.
MLGraphBuilder has the following internal slots:
- [[context]] of type MLContext
  The context of type MLContext associated with this MLGraphBuilder.
- [[hasBuilt]] of type boolean
  Whether MLGraphBuilder.build() has been called. Once built, the MLGraphBuilder can no longer create operators or compile MLGraphs.
7.9.1. MLGraphBuilder constructor
- context: an MLContext. The context to associate with the MLGraphBuilder.
The new MLGraphBuilder(context) constructor steps are:
- If this’s relevant global object’s associated Document is not allowed to use the webnn feature, then throw a "SecurityError" DOMException.
- If context is lost, then throw an "InvalidStateError" DOMException.
- Set this.[[context]] to context.
- Set this.[[hasBuilt]] to false.
7.9.2. input operands
Create a named MLOperand, based on a descriptor, that can be used as an input.
- name: a string name of the input.
- descriptor: an MLOperandDescriptor object.
Returns: an MLOperand.
The input(name, descriptor) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If any MLOperands in this’s graph’s inputs have a [[name]] equal to name, then throw a TypeError.
- If checking dimensions given descriptor returns false, then throw a TypeError.
- Make graph connections:
- Return operand.
The MLGraphBuilder API allows creating an MLGraph without input operands. If the underlying platform doesn’t support that, implementations can add a stub input, or pass constants as inputs to the graph.
7.9.3. constant operands
Create a constant MLOperand that can be used in MLGraphBuilder methods.
7.9.3.1. constant(descriptor, buffer)
Create a constant MLOperand of the specified data type and shape that contains the initializing data.
- descriptor: an MLOperandDescriptor. The descriptor of the output tensor.
- buffer: an AllowSharedBufferSource. The buffer containing the initializing data.
Returns: an MLOperand. The constant output tensor.
The constant(descriptor, buffer) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If checking dimensions given descriptor returns false, then throw a TypeError.
- If validating buffer with descriptor given buffer and descriptor returns false, then throw a TypeError.
- Make graph connections:
  - Let operand be the result of creating an MLOperand given this and descriptor.
  - Let bytes be the result of getting a copy of the bytes held by the buffer source given buffer.
  - Add operand to this’s graph’s constants with bytes as value.
- Return operand.
7.9.3.2. constant(tensor)
Create a constant MLOperand of the specified data type and shape that contains the initialized data.
- tensor: an MLTensor. The constant tensor containing the initialized data.
Returns: an MLOperand. The constant output tensor.
The constant(tensor) method steps are:
- If tensor.[[context]] is not this.[[context]], then throw a TypeError.
- If tensor.[[isDestroyed]] is true, then throw a TypeError.
- If tensor.[[isConstant]] is false, then throw a TypeError.
- If this can not build, then throw an "InvalidStateError" DOMException.
- Make graph connections:
  - Let operand be the result of creating an MLOperand given this and tensor.[[descriptor]].
  - Set operand.[[constantTensor]] to tensor.
  - Add operand to this’s graph’s constants with tensor as value.
- Return operand.
7.9.3.3. constant(type, value)
Create a scalar constant MLOperand of the specified value and data type.
"int8" data type, etc.
- type: an MLOperandDataType.
- value: an MLNumber. The value of the constant.
Returns: an MLOperand. The constant output.
The constant(type, value) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- Set value to the result of casting value to type.
- Let descriptor be the result of creating an MLOperandDescriptor given type and « ».
- Make graph connections:
  - Let operand be the result of creating an MLOperand given this and descriptor.
  - Add operand to this’s graph’s constants with value as value.
- Return operand.
7.9.4. build method
Asynchronously build the composed graph, up to the given output operands, into a computational graph.
- outputs: an MLNamedOperands. Identifies the MLOperands that will be the outputs of the graph.
Returns: Promise<MLGraph>.
The build(outputs) method steps are:
- Let realm be this’s relevant realm.
- If this can not build, then return a new promise in realm rejected with an "InvalidStateError" DOMException.
- If outputs is empty, then return a new promise in realm rejected with a TypeError.
- For each name → operand of outputs:
  - If name is empty, then return a new promise in realm rejected with a TypeError.
  - If validating operand given this and operand returns false, then return a new promise in realm rejected with a TypeError.
  - If operand is in this’s graph’s inputs or constants, then return a new promise in realm rejected with a TypeError.
  - If operand.[[constantTensor]] exists and operand.[[constantTensor]].[[isDestroyed]] is true, then return a new promise in realm rejected with a TypeError.
- Let operands be a new empty set.
- Let operators be a new empty set.
- Let inputs be a new empty set.
- While queue is not empty:
- Let global be this’s relevant global object.
- Let graph be a new MLGraph in realm.
- Set graph.[[context]] to this.[[context]].
- Set graph.[[isDestroyed]] to false.
- For each operand in inputs:
  - Set graph.[[inputDescriptors]][operand.[[name]]] to operand.[[descriptor]].
- For each name → operand of outputs:
  - Set graph.[[outputDescriptors]][name] to operand.[[descriptor]].
- Set this.[[hasBuilt]] to true.
- Let promise be a new promise in realm.
- Enqueue the following steps to graph.[[context]].[[timeline]]:
  - Run these steps, but abort when graph.[[context]] is lost:
    - Let graphImpl be the result of converting this’s graph with operands, operators, inputs, and outputs’s values into an implementation-defined format which can be interpreted by the underlying platform.
    - If the previous step failed, then queue an ML task with global to reject promise with an "OperationError" DOMException, and abort these steps.
    - Set graph.[[implementation]] to graphImpl.
    - Queue an ML task with global to resolve promise with graph.
  - If aborted, then queue an ML task with global to reject promise with an "InvalidStateError" DOMException.
- Return promise.
NOTE: Specifying an input operand or constant operand as a graph output results in an error, as this is usually an incorrect usage of the API. Callers can work around this by introducing an identity() operator.
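The output checks in the build(outputs) steps can be sketched in plain JavaScript. This is an illustrative model only, not part of the API: validateBuildOutputs and the graph object (with inputs and constants held as Set objects) are hypothetical names, operand validation against the builder is omitted, and the sketch throws synchronously where the real method returns a rejected promise.

```javascript
// Hypothetical model of the output checks performed by build(outputs).
// `graph.inputs` and `graph.constants` are assumed to be Sets of operand objects.
function validateBuildOutputs(outputs, graph) {
  const entries = Object.entries(outputs);
  if (entries.length === 0) {
    throw new TypeError("outputs must not be empty");
  }
  for (const [name, operand] of entries) {
    if (name === "") {
      throw new TypeError("output names must not be empty");
    }
    // An input or constant operand cannot be used directly as a graph output.
    if (graph.inputs.has(operand) || graph.constants.has(operand)) {
      throw new TypeError(`output "${name}" is a graph input or constant`);
    }
  }
}
```

Under this model, passing an input operand as an output is rejected, while an operand produced by some intermediate operator (such as identity()) is accepted.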
7.9.5. argMin/argMax operations
Return the index location of the minimum or maximum values of all the input values along the axis. In case of ties, the identity of the return value is implementation dependent.
dictionary MLArgMinMaxOptions : MLOperatorOptions {
  boolean keepDimensions = false;
  MLOperandDataType outputDataType = "int32";
};
partial interface MLGraphBuilder {
  MLOperand argMin(MLOperand input, [EnforceRange] unsigned long axis, optional MLArgMinMaxOptions options = {});
  MLOperand argMax(MLOperand input, [EnforceRange] unsigned long axis, optional MLArgMinMaxOptions options = {});
};
partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits argMin;
  MLSingleInputSupportLimits argMax;
};
MLArgMinMaxOptions has the following members:
- keepDimensions, of type boolean, defaulting to false
  If true, retains reduced dimensions with size 1.
- outputDataType, of type MLOperandDataType, defaulting to "int32"
  An MLOperandDataType. The output data type.

- input: an MLOperand. The input N-D tensor.
- axis: The dimension to reduce. The value must be in the range [0, N-1] where N is the rank of the input tensor.
- options: an optional MLArgMinMaxOptions. The optional parameters of the operation.
Returns: an MLOperand. The N-D tensor of the reduced shape. The values must be of type outputDataType in the range [0, N-1] where N is the size of the input dimension specified by axis.
operand | allowed data types | allowed ranks |
---|---|---|
input | any | N |
output | outputDataType | input’s rank - 1 to input’s rank |
MLOpSupportLimits has the following members for argMin() and argMax():
- argMin, of type MLSingleInputSupportLimits
  Support limits for operator argMin().
- argMax, of type MLSingleInputSupportLimits
  Support limits for operator argMax().
To create an argMin/argMax operation given string op, MLOperand input, unsigned long axis, and MLArgMinMaxOptions options, run the following steps:
- Assert: op is one of "argMin", "argMax".
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- If input’s shape[axis] is greater than options.outputDataType’s maximum value, then throw a TypeError.
- Let outputShape be the result of calculating reduction output sizes given input’s shape, « axis », and options.keepDimensions. If that returns failure, then throw a TypeError.
- Let desc be the result of creating an MLOperandDescriptor given options.outputDataType and outputShape.
- Make graph connections:
  - Let operator be an operator for the op operation, given options.
  - Let output be the result of creating an MLOperand given this and desc.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
The following argMin/argMax algorithms are supported.
The argMin(input, axis, options) method steps are:
- Let output be the result of creating an argMin/argMax operation given "argMin", input, axis and options.
- Return output.
The argMax(input, axis, options) method steps are:
- Let output be the result of creating an argMin/argMax operation given "argMax", input, axis and options.
- Return output.
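The semantics of argMin() and argMax() can be illustrated with a plain JavaScript reference sketch that operates on a flat, row-major array. The function name argMinMax and its parameters are hypothetical and not part of the API. Ties are resolved here by keeping the first index, which is only one of the outcomes an implementation may choose.

```javascript
// Reference sketch (not the normative algorithm): argMin/argMax over a flat,
// row-major array `data` interpreted with the given `shape`.
function argMinMax(op, data, shape, axis, keepDimensions = false) {
  const rank = shape.length;
  if (!Number.isInteger(axis) || axis < 0 || axis >= rank) {
    throw new TypeError("axis must be in the range [0, rank - 1]");
  }
  // Number of elements spanned by the dimensions after and before `axis`.
  const inner = shape.slice(axis + 1).reduce((a, b) => a * b, 1);
  const outer = shape.slice(0, axis).reduce((a, b) => a * b, 1);
  const axisSize = shape[axis];
  const better = op === "argMin" ? (a, b) => a < b : (a, b) => a > b;

  const indices = [];
  for (let o = 0; o < outer; o++) {
    for (let i = 0; i < inner; i++) {
      let best = 0;
      for (let k = 1; k < axisSize; k++) {
        const candidate = data[(o * axisSize + k) * inner + i];
        const current = data[(o * axisSize + best) * inner + i];
        // Strict comparison keeps the first index on ties (an assumption).
        if (better(candidate, current)) best = k;
      }
      indices.push(best);
    }
  }
  // The reduced dimension is either kept with size 1 or removed.
  const outputShape = keepDimensions
    ? shape.map((d, i) => (i === axis ? 1 : d))
    : shape.filter((_, i) => i !== axis);
  return { indices, shape: outputShape };
}
```

For a 2x3 input, reducing along axis 1 yields one index per row, and reducing along axis 0 yields one index per column.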
7.9.6. batchNormalization
Normalize the values of the input tensor using [Batch-Normalization]. For each input feature, the mean and variance values of that feature are computed across all the samples in the batch dimension while the model is trained. These mean and variance values are then subsequently given to this operation during model inference.
dictionary MLBatchNormalizationOptions : MLOperatorOptions {
  MLOperand scale;
  MLOperand bias;
  [EnforceRange] unsigned long axis = 1;
  double epsilon = 1e-5;
};
partial interface MLGraphBuilder {
  MLOperand batchNormalization(MLOperand input, MLOperand mean, MLOperand variance, optional MLBatchNormalizationOptions options = {});
};
dictionary MLBatchNormalizationSupportLimits {
  MLTensorLimits input;
  MLTensorLimits mean;
  MLTensorLimits variance;
  MLTensorLimits scale;
  MLTensorLimits bias;
  MLDataTypeLimits output;
};
partial dictionary MLOpSupportLimits {
  MLBatchNormalizationSupportLimits batchNormalization;
};
MLBatchNormalizationOptions has the following members:
- scale, of type MLOperand
  The 1-D tensor of the scaling values whose size is equal to the size of the input dimension denoted by axis.
- bias, of type MLOperand
  The 1-D tensor of the bias values whose size is equal to the size of the input dimension denoted by axis.
- axis, of type unsigned long, defaulting to 1
  The index to the feature count dimension of the input shape for which the mean and variance values are given. Its value must be in the range [0, N-1] where N is the rank of the input tensor. The default value is 1, corresponding to the channel ("c") dimension in the "nchw" data layout.
- epsilon, of type double, defaulting to 1e-5
  A small value to prevent computational error due to divide-by-zero.

- input: an MLOperand. The input N-D tensor.
- mean: an MLOperand. Specifies the 1-D tensor of the mean values of the input features across the batch. Its size is equal to the size of the input dimension denoted by axis.
- variance: an MLOperand. The 1-D tensor of the variance values of the input features across the batch whose size is equal to the size of the input dimension denoted by axis.
- options: an optional MLBatchNormalizationOptions. Specifies the optional parameters of the operation.
Returns: an MLOperand. The batch-normalized N-D tensor of the same shape as input.
operand | allowed data types | allowed ranks |
---|---|---|
input | "float32", "float16" | N |
mean | same as input | 1 |
variance | same as input | 1 |
scale | same as input | 1 |
bias | same as input | 1 |
output | same as input | same as input |
MLBatchNormalizationSupportLimits has the following members:
- input, of type MLTensorLimits
  MLTensorLimits for input operand.
- mean, of type MLTensorLimits
  MLTensorLimits for mean operand.
- variance, of type MLTensorLimits
  MLTensorLimits for variance operand.
- scale, of type MLTensorLimits
  MLTensorLimits for scale operand.
- bias, of type MLTensorLimits
  MLTensorLimits for bias operand.
- output, of type MLDataTypeLimits
  MLDataTypeLimits for output operand.
MLOpSupportLimits has the following members for batchNormalization():
- batchNormalization, of type MLBatchNormalizationSupportLimits
  Support limits for operator batchNormalization().
The batchNormalization(input, mean, variance, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and any of input, mean, variance, options.scale (if it exists), and options.bias (if it exists) returns false, then throw a TypeError.
- If input’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- If options.axis is not in the range 0 to input’s rank, exclusive, then throw a TypeError.
- If mean’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- If mean’s shape is not equal to « input’s shape[options.axis] », then throw a TypeError.
- If variance’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- If variance’s shape is not equal to « input’s shape[options.axis] », then throw a TypeError.
- Set options.epsilon to the result of casting options.epsilon to input’s dataType.
- Make graph connections:
  - Let operator be an operator for the "batchNormalization" operation, given input, mean, variance and options.
  - Let output be the result of creating an MLOperand given this and input.[[descriptor]].
  - Set output.[[operator]] to operator.
  - Set operator’s inputs to input, mean, and variance.
  - If options.scale exists, then add it to operator’s inputs.
  - If options.bias exists, then add it to operator’s inputs.
  - Set operator’s output to output.
- Return output.
The behavior of this operation when the input tensor is 4-D of the "nchw" layout can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function batchNormalization(builder, input, mean, variance, options) {
  const shape = [1, input.shape[options.axis], 1, 1];
  return builder.add(
    builder.mul(
      builder.reshape(options.scale, shape),
      builder.div(
        builder.sub(input, builder.reshape(mean, shape)),
        builder.sqrt(builder.add(
          builder.reshape(variance, shape),
          builder.constant(input.dataType, options.epsilon))))),
    builder.reshape(options.bias, shape));
}
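The per-element arithmetic behind this decomposition is y = scale * (x - mean) / sqrt(variance + epsilon) + bias. The following plain JavaScript sketch applies it to nested arrays indexed as [channel][value]. The names batchNormalizeValue and batchNormalizeChannels are hypothetical, and treating an omitted scale as 1 and an omitted bias as 0 is an assumption made for illustration.

```javascript
// Per-element formula used by the decomposition:
//   y = scale * (x - mean) / sqrt(variance + epsilon) + bias
function batchNormalizeValue(x, mean, variance, scale = 1, bias = 0, epsilon = 1e-5) {
  return (scale * (x - mean)) / Math.sqrt(variance + epsilon) + bias;
}

// Apply the formula to a [channels][values] nested array, using one
// mean/variance (and optional scale/bias) entry per channel.
function batchNormalizeChannels(input, mean, variance, options = {}) {
  const { scale, bias, epsilon = 1e-5 } = options;
  return input.map((channel, c) =>
    channel.map((x) =>
      batchNormalizeValue(
        x,
        mean[c],
        variance[c],
        scale ? scale[c] : 1, // assumption: omitted scale behaves as 1
        bias ? bias[c] : 0,   // assumption: omitted bias behaves as 0
        epsilon)));
}
```

Setting epsilon to 0 in a test makes the arithmetic easy to check by hand; in practice the small default guards against division by zero when a variance entry is 0.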
7.9.7. cast
Cast each element in the input tensor to the target data type.
partial interface MLGraphBuilder {
  MLOperand cast(MLOperand input, MLOperandDataType type, optional MLOperatorOptions options = {});
};
partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits cast;
};
- input: an MLOperand. The input N-D tensor.
- type: an MLOperandDataType. The target data type.
- options: an MLOperatorOptions. Specifies the optional parameters of the operation.
Returns: an MLOperand. The N-D tensor of the same shape as input with each element cast to the target data type.
operand | allowed data types | allowed ranks |
---|---|---|
input | any | N |
output | type | same as input |
MLOpSupportLimits has the following members for cast():
- cast, of type MLSingleInputSupportLimits
  Support limits for operator cast().
Casting between MLOperandDataTypes is specified for some cases and implementation-defined in other cases, according to the following table:
Input type \ Target type | "float32", "float16" | "int32", "uint32", "int64", "uint64", "int8", "uint8" |
---|---|---|
"float32", "float16" | If in range, nearest representable value. If out of range, +/-Infinity. | If in range, truncated. If out of range, implementation-defined. |
"int32", "uint32", "int64", "uint64", "int8", "uint8" | If in range, nearest representable value. If out of range, +/-Infinity. | If in range, same value. If out of range, lowest N bits reinterpreted as target type, assuming two’s complement for signed types. |
NOTE: For example, casting -1 from "int8" to "uint8" is specified to yield 255. But casting -1 from "float32" to "uint8" is implementation-defined.
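Two of the specified cases can be illustrated with JavaScript typed arrays. This is a sketch with hypothetical helper names, not how an implementation performs casts. The first helper shows the integer-to-integer out-of-range rule by viewing the same byte as a different type; the second shows in-range float-to-integer truncation and deliberately refuses out-of-range values, since that result is implementation-defined.

```javascript
// Integer-to-integer, out of range: keep the lowest N bits and reinterpret them.
function reinterpretInt8AsUint8(value) {
  const bytes = new Int8Array([value]);   // stored as a two's complement int8
  return new Uint8Array(bytes.buffer)[0]; // the same byte viewed as uint8
}

// Float-to-integer, in range: truncate toward zero.
function castFloatToInt32InRange(value) {
  const truncated = Math.trunc(value);
  if (!(truncated >= -2147483648 && truncated <= 2147483647)) {
    // Out of range (or NaN): the table leaves the result implementation-defined.
    throw new RangeError("out of range: result is implementation-defined");
  }
  return truncated;
}
```

Calling reinterpretInt8AsUint8(-1) yields 255, matching the example in the note.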
The cast(input, type, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- Make graph connections:
  - Let operator be an operator for the "cast" operation, given type and options.
  - Let output be the result of copying an MLOperand given input.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
7.9.8. clamp
Clamp the input tensor element-wise within a range specified by the minimum and maximum values.
dictionary MLClampOptions : MLOperatorOptions {
  MLNumber minValue;
  MLNumber maxValue;
};
partial interface MLGraphBuilder {
  MLOperand clamp(MLOperand input, optional MLClampOptions options = {});
};
partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits clamp;
};
MLClampOptions has the following members:
- minValue, of type MLNumber
  The minimum value of the range. When it is not specified, the clamping is not performed on the lower limit of the range.
- maxValue, of type MLNumber
  The maximum value of the range. When it is not specified, the clamping is not performed on the upper limit of the range.

- input: an MLOperand. The input tensor.
- options: an optional MLClampOptions. The optional parameters of the operation.
operand | allowed data types | allowed ranks |
---|---|---|
input | any | N |
output | same as input | same as input |
MLOpSupportLimits has the following member for clamp():
- clamp, of type MLSingleInputSupportLimits
  Support limits for operator clamp().
The clamp(input, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- Let minValue be options.minValue if given, or -Infinity otherwise.
- Set options.minValue to the result of casting minValue to input’s dataType.
- Let maxValue be options.maxValue if given, or Infinity otherwise.
- Set options.maxValue to the result of casting maxValue to input’s dataType.
- If options.minValue is greater than options.maxValue, then throw a TypeError.
- Make graph connections:
  - Let output be the result of copying an MLOperand given input.
  - Let operator be an operator for the "clamp" operation, given options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
The behavior of this operation can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function clamp(builder, input, options) {
  if (options.minValue === undefined) {
    if (options.maxValue === undefined) {
      return input;
    } else {
      return builder.min(input, builder.constant(input.dataType, options.maxValue));
    }
  } else {
    if (options.maxValue === undefined) {
      return builder.max(input, builder.constant(input.dataType, options.minValue));
    } else {
      return builder.min(
        builder.max(input, builder.constant(input.dataType, options.minValue)),
        builder.constant(input.dataType, options.maxValue));
    }
  }
}
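For a single element, the same behavior can be written as a small scalar reference in plain JavaScript. The name clampValue is hypothetical. An omitted bound is treated as unbounded, consistent with the member descriptions of minValue and maxValue, and casting of the bounds to the input data type is not modeled.

```javascript
// Scalar reference sketch: clamp x to [minValue, maxValue], where an omitted
// bound means no clamping is performed on that side.
function clampValue(x, options = {}) {
  const minValue = options.minValue ?? -Infinity;
  const maxValue = options.maxValue ?? Infinity;
  if (minValue > maxValue) {
    throw new TypeError("minValue must not be greater than maxValue");
  }
  return Math.min(Math.max(x, minValue), maxValue);
}
```

With both bounds omitted, the value passes through unchanged, which corresponds to the branch of the decomposition that returns input directly.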
7.9.9. concat
Concatenates the input tensors along a given axis.
partial interface MLGraphBuilder {
  MLOperand concat(sequence<MLOperand> inputs, [EnforceRange] unsigned long axis, optional MLOperatorOptions options = {});
};
dictionary MLConcatSupportLimits {
  MLTensorLimits inputs;
  MLDataTypeLimits output;
};
partial dictionary MLOpSupportLimits {
  MLConcatSupportLimits concat;
};
- inputs: a sequence<MLOperand>. All input tensors must have the same shape, except for the size of the dimension to concatenate on.
- axis: an unsigned long scalar. The axis that the inputs concatenate along. Its value must be in the range [0, N-1] where N is the rank of the input tensors.
- options: an MLOperatorOptions. Specifies the optional parameters of the operation.
Returns: an MLOperand. The concatenated tensor of all the inputs along the axis. The output tensor has the same shape as the inputs except along the dimension on which the inputs are concatenated. The size of that dimension is computed as the sum of all the input sizes of the same dimension.
operand | allowed data types | allowed ranks |
---|---|---|
inputs’s items | any | N |
output | same as inputs’s items | same as inputs’s items |
MLConcatSupportLimits has the following members:
- inputs, of type MLTensorLimits
  MLTensorLimits for all input operands.
- output, of type MLDataTypeLimits
  MLDataTypeLimits for output operand.
MLOpSupportLimits has the following member for concat():
- concat, of type MLConcatSupportLimits
  Support limits for operator concat().
The concat(inputs, axis, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and any item in inputs returns false, then throw a TypeError.
- Let first be inputs[0].
- If axis is greater than or equal to first’s rank, then throw a TypeError.
- Let desc be the result of creating an MLOperandDescriptor given first’s dataType and first’s shape.
- For each index in the range 1 to inputs’s size, exclusive:
  - Let input be inputs[index].
  - If input’s dataType is not equal to first’s dataType, then throw a TypeError.
  - If input’s rank is not equal to first’s rank, then throw a TypeError.
  - For each dim in the range 0 to input’s rank, exclusive:
    If the shape of each corresponding dimension and type of the operands, except for those of the dimension given by axis, is not the same, fail.
- Make graph connections:
  - Let output be the result of creating an MLOperand given this and desc.
  - Let operator be an operator for the "concat" operation, given inputs, axis, and options.
  - Set output.[[operator]] to operator.
  - Set operator’s inputs to inputs.
  - Set operator’s output to output.
- Return output.
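The shape rule for concat() can be sketched as a standalone JavaScript function that works on shapes only. The name concatOutputShape is hypothetical, data type checks are omitted, and rejecting an empty list of shapes is an assumption of this sketch.

```javascript
// Shape-only sketch: compute the output shape of concatenating tensors with
// the given shapes along `axis`.
function concatOutputShape(shapes, axis) {
  if (shapes.length === 0) {
    throw new TypeError("at least one input is required"); // assumption
  }
  const first = shapes[0];
  if (axis >= first.length) {
    throw new TypeError("axis must be less than the rank of the inputs");
  }
  const output = [...first];
  for (let index = 1; index < shapes.length; index++) {
    const shape = shapes[index];
    if (shape.length !== first.length) {
      throw new TypeError("all inputs must have the same rank");
    }
    for (let dim = 0; dim < shape.length; dim++) {
      if (dim === axis) {
        // Sizes along the concatenation axis are summed.
        output[dim] += shape[dim];
      } else if (shape[dim] !== first[dim]) {
        throw new TypeError("shapes must match except along axis");
      }
    }
  }
  return output;
}
```

For example, concatenating shapes [2, 3] and [2, 5] along axis 1 gives [2, 8].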
7.9.10. conv2d
Compute a 2-D convolution given 4-D input and filter tensors.
enum MLConv2dFilterOperandLayout {
  "oihw",
  "hwio",
  "ohwi",
  "ihwo"
};
dictionary MLConv2dOptions : MLOperatorOptions {
  sequence<[EnforceRange] unsigned long> padding;
  sequence<[EnforceRange] unsigned long> strides;
  sequence<[EnforceRange] unsigned long> dilations;
  [EnforceRange] unsigned long groups = 1;
  MLInputOperandLayout inputLayout = "nchw";
  MLConv2dFilterOperandLayout filterLayout = "oihw";
  MLOperand bias;
};
partial interface MLGraphBuilder {
  MLOperand conv2d(MLOperand input, MLOperand filter, optional MLConv2dOptions options = {});
};
dictionary MLConv2dSupportLimits {
  MLTensorLimits input;
  MLTensorLimits filter;
  MLTensorLimits bias;
  MLDataTypeLimits output;
};
partial dictionary MLOpSupportLimits {
  MLConv2dSupportLimits conv2d;
};
MLConv2dOptions has the following members:
- padding, of type sequence<[EnforceRange] unsigned long>
  A list of length 4: [beginningHeight, endingHeight, beginningWidth, endingWidth]. Specifies the additional rows and columns added to the beginning and ending of each spatial dimension of the convolution input. The default value is [0, 0, 0, 0].
- strides, of type sequence<[EnforceRange] unsigned long>
  A list of length 2: [strideHeight, strideWidth]. Specifies the stride of the sliding window for each spatial dimension of the convolution input. The default value is [1, 1].
- dilations, of type sequence<[EnforceRange] unsigned long>
  A list of length 2: [dilationHeight, dilationWidth]. Specifies the dilation factor for each spatial dimension applied on the convolution filter (kernel). The default value is [1, 1].
- groups, of type unsigned long, defaulting to 1
  The number of groups that input channels and output channels are divided into.
- inputLayout, of type MLInputOperandLayout, defaulting to "nchw"
  Specifies the layout format of the input and output tensor as follows:
- filterLayout, of type MLConv2dFilterOperandLayout, defaulting to "oihw"
  Specifies the layout format of the filter tensor as follows:
- bias, of type MLOperand
  An additional 1-D tensor with the shape of [outputChannels] whose values are to be added to the convolution result.

- input: an MLOperand. The input 4-D tensor. The logical shape is interpreted according to the value of inputLayout.
- filter: an MLOperand. The filter 4-D tensor. The logical shape is interpreted according to the value of filterLayout and groups.
- options: an MLConv2dOptions. The optional parameters of the operation.
Returns: an MLOperand. The output 4-D tensor that contains the convolution result. The output shape is interpreted according to inputLayout. More specifically, the spatial dimensions or the sizes of the last two dimensions of the output tensor for the "nchw" input layout can be calculated as follows:
outputSize = 1 + (inputSize - (filterSize - 1) * dilation - 1 + beginningPadding + endingPadding) / stride
operand | allowed data types | allowed ranks |
---|---|---|
input | "float32", "float16" | 4 |
filter | same as input | 4 |
bias | same as input | 1 |
output | same as input | 4 |
MLConv2dSupportLimits has the following members:
- input, of type MLTensorLimits
  MLTensorLimits for input operand.
- filter, of type MLTensorLimits
  MLTensorLimits for filter operand.
- bias, of type MLTensorLimits
  MLTensorLimits for bias operand.
- output, of type MLDataTypeLimits
  MLDataTypeLimits for output operand.
MLOpSupportLimits has the following member for conv2d():
- conv2d, of type MLConv2dSupportLimits
  Support limits for operator conv2d().
groups = inputChannels = outputChannels and the shape of filter tensor is [options.groups, 1, height, width] for "oihw" layout, [height, width, 1, options.groups] for "hwio" layout, [options.groups, height, width, 1] for "ohwi" layout and [1, height, width, options.groups] for "ihwo" layout.
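The filter shapes listed above for the case groups = inputChannels = outputChannels can be captured in a small lookup. The helper name depthwiseFilterShape is hypothetical and exists only to restate the four layouts in code.

```javascript
// Sketch: filter shape when groups = inputChannels = outputChannels,
// for each MLConv2dFilterOperandLayout value.
function depthwiseFilterShape(filterLayout, groups, height, width) {
  switch (filterLayout) {
    case "oihw":
      return [groups, 1, height, width];
    case "hwio":
      return [height, width, 1, groups];
    case "ohwi":
      return [groups, height, width, 1];
    case "ihwo":
      return [1, height, width, groups];
    default:
      throw new TypeError(`unknown filter layout: ${filterLayout}`);
  }
}
```

In every layout the input-channel position has size 1 and the output-channel position has size groups.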
To calculate conv output size given unsigned integers inputSize, filterSize, beginningPadding, endingPadding, stride and dilation, perform these steps. They return a number.
- Let effectiveFilterSize be ( filterSize - 1 ) * dilation + 1.
- Let outputSize be ( inputSize - effectiveFilterSize + beginningPadding + endingPadding ) / stride + 1.
- Return outputSize.
To calculate conv2d output sizes given unsigned integers inputHeight, inputWidth, filterHeight and filterWidth, list of 4 unsigned integers padding, list of 2 unsigned integers strides, and list of 2 unsigned integers dilations, perform these steps. They return a list of 2 numbers.
- Let outputHeight be the result of calculating conv output size given inputHeight, filterHeight, padding[0], padding[1], strides[0] and dilations[0].
- Let outputWidth be the result of calculating conv output size given inputWidth, filterWidth, padding[2], padding[3], strides[1] and dilations[1].
- Return « outputHeight, outputWidth ».
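The two algorithms above translate directly into JavaScript. The function names below mirror the algorithm names but are otherwise hypothetical. The raw size may be fractional; the conv2d() method steps apply floor() to it afterwards, which floorConv2dOutputSizes models as a separate helper.

```javascript
// Direct transcription of "calculate conv output size". May return a fraction.
function calculateConvOutputSize(
    inputSize, filterSize, beginningPadding, endingPadding, stride, dilation) {
  const effectiveFilterSize = (filterSize - 1) * dilation + 1;
  return (inputSize - effectiveFilterSize + beginningPadding + endingPadding) / stride + 1;
}

// Direct transcription of "calculate conv2d output sizes".
function calculateConv2dOutputSizes(
    inputHeight, inputWidth, filterHeight, filterWidth, padding, strides, dilations) {
  const outputHeight = calculateConvOutputSize(
      inputHeight, filterHeight, padding[0], padding[1], strides[0], dilations[0]);
  const outputWidth = calculateConvOutputSize(
      inputWidth, filterWidth, padding[2], padding[3], strides[1], dilations[1]);
  return [outputHeight, outputWidth];
}

// The conv2d() method steps floor both sizes before building the output shape.
function floorConv2dOutputSizes(sizes) {
  return sizes.map(Math.floor);
}
```

As a worked example, a 224x224 input with a 7x7 filter, padding of 3 on every side, stride 2 and dilation 1 gives (224 - 7 + 6) / 2 + 1 = 112.5 per dimension, which floors to 112.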
The
conv2d(
input
,
filter
,
options
)
method
steps
are:
-
If this can not build , then throw an "
InvalidStateError
"DOMException
. -
If validating operand with this and any of input , filter , and options .
bias
(if it exists ) returns false, then throw aTypeError
. -
If input ’s dataType is not one of its allowed data types (according to this table ), then throw a
TypeError
. -
If input ’s rank is not its allowed rank , then throw a
TypeError
. -
If filter ’s rank is not its allowed rank , then throw a
TypeError
. -
If filter ’s dataType is not one of its allowed data types (according to this table ), then throw a
TypeError
. -
If options .
padding
does not exist , then set it to the list « 0, 0, 0, 0 ». -
Otherwise, if options .
padding
’s size is not 4, then throw aTypeError
. -
If options .
strides
does not exist , then set it to the list « 1, 1 ». -
Otherwise, if options .
strides
’s size is not 2, then throw aTypeError
. -
If any item in options .
strides
is equal to 0, then throw aTypeError
. -
If options .
dilations
does not exist , then set it to the list « 1, 1 ». -
Otherwise, if options .
dilations
’s size is not 2, then throw aTypeError
. -
If any item in options .
dilations
is equal to 0, then throw aTypeError
. -
Calculate the output shape:
-
Let inputShape be input ’s shape .
-
Switch on options .
inputLayout
: -
Let filterShape be filter ’s shape .
-
Switch on options .
filterLayout
:-
"hwio"
-
Let « filterHeight , filterWidth , filterInputChannels , outputChannels » be filterShape .
-
"ohwi"
-
Let « outputChannels , filterHeight , filterWidth , filterInputChannels » be filterShape .
-
"ihwo"
-
Let « filterInputChannels , filterHeight , filterWidth , outputChannels » be filterShape .
-
"oihw"
-
Let « outputChannels , filterInputChannels , filterHeight , filterWidth » be filterShape .
-
-
If inputChannels % options.groups is not 0, then throw a TypeError.
-
Otherwise, if inputChannels / options.groups is not equal to filterInputChannels, then throw a TypeError.
-
If options.bias exists, then:
-
If its shape is not equal to « outputChannels », then throw a TypeError.
-
If its dataType is not one of its allowed data types (according to this table), then throw a TypeError.
-
-
Let « outputHeight, outputWidth » be the result of calculating conv2d output sizes given inputHeight, inputWidth, filterHeight, filterWidth, options.padding, options.strides, and options.dilations.
-
Set outputHeight to floor(outputHeight).
-
Set outputWidth to floor(outputWidth).
-
If either outputHeight or outputWidth is not a valid dimension, then throw a TypeError.
-
Switch on options.inputLayout:
-
Let desc be the result of creating an MLOperandDescriptor given input ’s dataType and outputShape .
-
-
Make graph connections:
-
Let output be the result of creating an MLOperand given this and desc .
-
Let operator be an operator for the "conv2d" operation, given options and filter .
-
Set output.[[operator]] to operator.
-
Set operator’s inputs to input and filter.
-
If options.bias exists, then add it to operator’s inputs.
-
Set operator ’s output to output .
-
-
Return output .
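The filter layout cases and the groups checks in the steps above can be sketched as follows. This is a non-normative illustration; the function names parseConv2dFilterShape and checkConv2dGroups are hypothetical and are not part of the API.

```javascript
// Destructure a conv2d filter shape according to filterLayout,
// following the four cases listed in the steps above.
function parseConv2dFilterShape(filterShape, filterLayout) {
  const [d0, d1, d2, d3] = filterShape;
  switch (filterLayout) {
    case "hwio":
      return { filterHeight: d0, filterWidth: d1, filterInputChannels: d2, outputChannels: d3 };
    case "ohwi":
      return { outputChannels: d0, filterHeight: d1, filterWidth: d2, filterInputChannels: d3 };
    case "ihwo":
      return { filterInputChannels: d0, filterHeight: d1, filterWidth: d2, outputChannels: d3 };
    case "oihw":
      return { outputChannels: d0, filterInputChannels: d1, filterHeight: d2, filterWidth: d3 };
    default:
      throw new TypeError(`Unsupported filter layout: ${filterLayout}`);
  }
}

// Throws a TypeError when the channel counts are inconsistent with groups.
function checkConv2dGroups(inputChannels, filterInputChannels, groups) {
  if (inputChannels % groups !== 0) {
    throw new TypeError("inputChannels must be divisible by groups");
  }
  if (inputChannels / groups !== filterInputChannels) {
    throw new TypeError("filterInputChannels must equal inputChannels / groups");
  }
}

// A depthwise convolution: 32 input channels, 32 groups, "oihw" filter.
const depthwiseFilter = parseConv2dFilterShape([32, 1, 3, 3], "oihw");
checkConv2dGroups(32, depthwiseFilter.filterInputChannels, 32); // does not throw
```

With groups equal to inputChannels, each filter sees exactly one input channel, which is why filterInputChannels is 1 in the example.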
7.9.11. convTranspose2d
Compute a 2-D transposed convolution given 4-D input and filter tensors.
enum MLConvTranspose2dFilterOperandLayout {
  "iohw",
  "hwoi",
  "ohwi"
};

dictionary MLConvTranspose2dOptions : MLOperatorOptions {
  sequence<[EnforceRange] unsigned long> padding;
  sequence<[EnforceRange] unsigned long> strides;
  sequence<[EnforceRange] unsigned long> dilations;
  sequence<[EnforceRange] unsigned long> outputPadding;
  sequence<[EnforceRange] unsigned long> outputSizes;
  [EnforceRange] unsigned long groups = 1;
  MLInputOperandLayout inputLayout = "nchw";
  MLConvTranspose2dFilterOperandLayout filterLayout = "iohw";
  MLOperand bias;
};

partial interface MLGraphBuilder {
  MLOperand convTranspose2d(MLOperand input, MLOperand filter, optional MLConvTranspose2dOptions options = {});
};

partial dictionary MLOpSupportLimits {
  MLConv2dSupportLimits convTranspose2d;
};
MLConvTranspose2dOptions has the following members:
-
padding, of type sequence<[EnforceRange] unsigned long>
-
A list of length 4: [beginningHeight, endingHeight, beginningWidth, endingWidth]. Specifies the additional rows and columns added to the beginning and ending of each spatial dimension of the convolution input. The default value is [0, 0, 0, 0].
-
strides, of type sequence<[EnforceRange] unsigned long>
-
A list of length 2: [strideHeight, strideWidth]. Specifies the stride of the sliding window for each spatial dimension of the convolution input. The default value is [1, 1].
-
dilations, of type sequence<[EnforceRange] unsigned long>
-
A list of length 2: [dilationHeight, dilationWidth]. Specifies the dilation factor for each spatial dimension applied on the convolution filter (kernel). The default value is [1, 1].
-
outputPadding, of type sequence<[EnforceRange] unsigned long>
-
A list of length 2. Specifies the padding values applied to each spatial dimension of the output tensor. The explicit padding values are needed to disambiguate the output tensor shape for transposed convolution when the value of strides is greater than 1.
Note that these values are only used to disambiguate the output shape when needed; they do not necessarily cause any padding value to be written to the output tensor.
The default value is [0, 0].
-
outputSizes, of type sequence<[EnforceRange] unsigned long>
-
A list of length 2. Specifies the sizes of the last two dimensions of the output tensor. When the output sizes are explicitly specified, the output padding values in outputPadding are ignored.
If not specified, the output sizes are automatically computed.
-
groups, of type unsigned long, defaulting to 1
-
The number of groups that input channels and output channels are divided into.
-
inputLayout, of type MLInputOperandLayout, defaulting to "nchw"
-
Specifies the layout format of the input and output tensor as follows:
-
filterLayout, of type MLConvTranspose2dFilterOperandLayout, defaulting to "iohw"
-
Specifies the layout format of the filter tensor as follows:
-
bias, of type MLOperand
-
An additional 1-D tensor with the shape of [outputChannels] whose values are to be added to the convolution result.
-
input: an MLOperand. The input 4-D tensor. The logical shape is interpreted according to the value of inputLayout.
-
filter: an MLOperand. The filter 4-D tensor. The logical shape is interpreted according to the value of filterLayout and groups.
-
options: an optional MLConvTranspose2dOptions.
Returns: an MLOperand. The output 4-D tensor that contains the transposed convolution result. The output shape is interpreted according to inputLayout. More specifically, unless outputSizes is explicitly specified, outputPadding is needed to compute the spatial dimension values of the output tensor as follows:
outputSize = (inputSize - 1) * stride + (filterSize - 1) * dilation + 1 - beginningPadding - endingPadding + outputPadding
operand | allowed data types | allowed ranks |
---|---|---|
input | "float32", "float16" | 4 |
filter | same as input | 4 |
bias | same as input | 1 |
output | same as input | 4 |
MLOpSupportLimits has the following member for convTranspose2d():
-
convTranspose2d, of type MLConv2dSupportLimits
-
Support limits for operator convTranspose2d().
To calculate convtranspose output size given unsigned integers inputSize , filterSize , beginningPadding , endingPadding , stride , and dilation , perform these steps. They return a number.
-
Let effectiveFilterSize be ( filterSize - 1 ) * dilation + 1.
-
Let outputSize be ( inputSize - 1 ) * stride + effectiveFilterSize - beginningPadding - endingPadding .
-
Return outputSize .
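The steps above translate directly into a small function. The following non-normative JavaScript sketch uses an illustrative function name that is not part of the API.

```javascript
// Mirrors "calculate convtranspose output size" for one spatial dimension.
function calculateConvTransposeOutputSize(inputSize, filterSize, beginningPadding, endingPadding, stride, dilation) {
  const effectiveFilterSize = (filterSize - 1) * dilation + 1;
  return (inputSize - 1) * stride + effectiveFilterSize - beginningPadding - endingPadding;
}

// 3-wide input, 3-wide filter, stride 2, no padding, no dilation.
console.log(calculateConvTransposeOutputSize(3, 3, 0, 0, 2, 1)); // 7
// The same configuration with one element of padding on each side.
console.log(calculateConvTransposeOutputSize(3, 3, 1, 1, 2, 1)); // 5
```

Unlike the forward convolution, no division is involved, so the result is always an integer for integer arguments; it can however be negative when the padding exceeds the unpadded size.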
The convTranspose2d(input, filter, options) method steps are:
-
If this can not build, then throw an "InvalidStateError" DOMException.
-
If validating operand with this and any of input, filter, and options.bias (if it exists) returns false, then throw a TypeError.
-
If input’s rank is not its allowed rank, then throw a TypeError.
-
If input’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
-
If filter’s rank is not its allowed rank, then throw a TypeError.
-
If filter’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
-
If options.padding does not exist, then set it to the list « 0, 0, 0, 0 ».
-
Otherwise, if options.padding’s size is not 4, then throw a TypeError.
-
If options.strides does not exist, then set it to the list « 1, 1 ».
-
Otherwise, if options.strides’s size is not 2, then throw a TypeError.
-
If any item in options.strides is equal to 0, then throw a TypeError.
-
If options.dilations does not exist, then set it to the list « 1, 1 ».
-
Otherwise, if options.dilations’s size is not 2, then throw a TypeError.
-
If any item in options.dilations is equal to 0, then throw a TypeError.
-
If options.outputPadding does not exist, then set it to the list « 0, 0 ».
-
Otherwise, if options.outputPadding’s size is not 2, then throw a TypeError.
-
If options.outputSizes exists, then:
-
Otherwise:
-
If options.outputPadding[0] is greater than or equal to options.strides[0], or options.outputPadding[1] is greater than or equal to options.strides[1], then throw a TypeError.
-
-
Calculate the output shape:
-
Let inputShape be input ’s shape .
-
Switch on options.inputLayout:
-
Let filterShape be filter’s shape.
-
Switch on options.filterLayout:
-
"iohw"
-
Let « filterInputChannels , filterOutputChannels , filterHeight , filterWidth » be filterShape .
-
"hwoi"
-
Let « filterHeight , filterWidth , filterOutputChannels , filterInputChannels » be filterShape .
-
"ohwi"
-
Let « filterOutputChannels , filterHeight , filterWidth , filterInputChannels » be filterShape .
-
-
If inputChannels is not equal to filterInputChannels, then throw a TypeError.
-
Let outputChannels be filterOutputChannels * options.groups.
-
If outputChannels is not a valid dimension, then throw a TypeError.
-
If options.bias exists, then:
-
If its shape is not equal to « outputChannels », then throw a TypeError.
-
If its dataType is not one of its allowed data types (according to this table), then throw a TypeError.
-
-
Let calculatedOutputHeight be the result of calculating convtranspose output size given inputHeight , filterHeight , padding [0], padding [1], strides [0] and dilations [0].
-
Let calculatedOutputWidth be the result of calculating convtranspose output size given inputWidth , filterWidth , padding [2], padding [3], strides [1] and dilations [1].
-
If options.outputSizes exists, then:
-
Let « outputHeight, outputWidth » be options.outputSizes.
-
If outputHeight is less than calculatedOutputHeight, or outputHeight is greater than or equal to calculatedOutputHeight + strides[0], then throw a TypeError.
-
If outputWidth is less than calculatedOutputWidth, or outputWidth is greater than or equal to calculatedOutputWidth + strides[1], then throw a TypeError.
-
-
Otherwise:
-
Let outputHeight be calculatedOutputHeight + options.outputPadding[0].
-
Let outputWidth be calculatedOutputWidth + options.outputPadding[1].
-
-
If either outputHeight or outputWidth is not a valid dimension, then throw a TypeError.
-
Switch on options.inputLayout:
-
Let desc be the result of creating an MLOperandDescriptor given input ’s dataType and outputShape .
-
-
Make graph connections:
-
Let output be the result of creating an MLOperand given this and desc .
-
Let operator be an operator for the "convTranspose2d" operation, given options and filter .
-
Set output.[[operator]] to operator.
-
Set operator’s inputs to input and filter.
-
If options.bias exists, then add it to operator’s inputs.
-
Set operator ’s output to output .
-
-
Return output .
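The interaction between outputSizes and outputPadding in the steps above can be sketched per spatial dimension as follows. This is a non-normative illustration: the function name and the scalar outputSize and outputPadding parameters are hypothetical stand-ins for one item of options.outputSizes and options.outputPadding.

```javascript
// Resolve one spatial dimension of the convTranspose2d output.
// calculatedSize is the result of "calculate convtranspose output size".
function resolveConvTransposeOutputSize(calculatedSize, stride, { outputSize, outputPadding = 0 } = {}) {
  if (outputSize !== undefined) {
    // An explicit size must lie in [calculatedSize, calculatedSize + stride);
    // outputPadding is ignored in this case.
    if (outputSize < calculatedSize || outputSize >= calculatedSize + stride) {
      throw new TypeError("outputSize is out of range");
    }
    return outputSize;
  }
  // Otherwise the padding must be smaller than the stride.
  if (outputPadding >= stride) {
    throw new TypeError("outputPadding must be less than stride");
  }
  return calculatedSize + outputPadding;
}

console.log(resolveConvTransposeOutputSize(7, 2));                       // 7
console.log(resolveConvTransposeOutputSize(7, 2, { outputPadding: 1 })); // 8
console.log(resolveConvTransposeOutputSize(7, 2, { outputSize: 8 }));    // 8
```

Both options select a size among the stride candidates that would map back to the same input size under a forward convolution, which is why the allowed range has exactly stride entries.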
7.9.12. cumulativeSum
Compute the accumulated sum of a series of values along the given axis, either including or excluding the current value.
dictionary MLCumulativeSumOptions : MLOperatorOptions {
  boolean exclusive = false;
  boolean reversed = false;
};

partial interface MLGraphBuilder {
  MLOperand cumulativeSum(MLOperand input, unsigned long axis, optional MLCumulativeSumOptions options = {});
};

partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits cumulativeSum;
};
operand | allowed data types | allowed ranks |
---|---|---|
input | any | N |
output | same as input | same as input |
MLCumulativeSumOptions has the following members:
-
exclusive, of type boolean, defaulting to false
-
Whether to include or exclude the current value in the output, meaning inclusive prefix sum or exclusive prefix sum [Prefix-sum]. Given input [1,2,3,4], inclusive summation would yield an output of [1,3,6,10] whereas exclusive would yield [0,1,3,6]. The default is inclusive.
-
reversed, of type boolean, defaulting to false
-
Whether to reverse the summation direction along the active axis to instead start from the high coordinate to low coordinate. Given input [1,2,3,4], inclusive forward summation would yield an output of [1,3,6,10] whereas inclusive backward summation would yield [10,9,7,4]. The default is forward.
-
input: an MLOperand. The input tensor.
-
axis: an unsigned long scalar. The axis the summation will be performed on. Its value must be in the range [0, N-1] where N is input’s rank.
-
options: an MLCumulativeSumOptions. Specifies the optional parameters of the operation.
Returns:
MLOpSupportLimits has the following member for cumulativeSum():
-
cumulativeSum, of type MLSingleInputSupportLimits
-
Support limits for operator cumulativeSum().
The cumulativeSum(input, axis, options) method steps are:
-
If this can not build, then throw an "InvalidStateError" DOMException.
-
If validating operand with this and input returns false, then throw a TypeError.
-
If input’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
-
If axis is greater than or equal to input’s rank, then throw a TypeError.
-
Make graph connections:
-
Let output be the result of copying an MLOperand given input .
-
Let operator be an operator for the "cumulativeSum" operation and options .
-
Set output.[[operator]] to operator.
-
Set operator ’s input to input .
-
Set operator ’s output to output .
-
-
Return output .
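The effect of the exclusive and reversed options can be illustrated with a non-normative reference computation over a 1-D list. The function name cumulativeSum1d is hypothetical; an actual implementation operates along the given axis of an N-D tensor.

```javascript
// Reference computation of cumulativeSum for a 1-D list of numbers.
function cumulativeSum1d(values, { exclusive = false, reversed = false } = {}) {
  const n = values.length;
  const output = new Array(n);
  let sum = 0;
  for (let step = 0; step < n; ++step) {
    // Walk from the high coordinate to the low coordinate when reversed.
    const i = reversed ? n - 1 - step : step;
    if (exclusive) {
      output[i] = sum;   // Sum of the elements visited before this one.
      sum += values[i];
    } else {
      sum += values[i];  // Sum including the current element.
      output[i] = sum;
    }
  }
  return output;
}

console.log(cumulativeSum1d([1, 2, 3, 4]));                      // [ 1, 3, 6, 10 ]
console.log(cumulativeSum1d([1, 2, 3, 4], { exclusive: true })); // [ 0, 1, 3, 6 ]
console.log(cumulativeSum1d([1, 2, 3, 4], { reversed: true }));  // [ 10, 9, 7, 4 ]
```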
7.9.13. Element-wise binary operations
Compute the element-wise binary addition, subtraction, multiplication, division, power, maximum and minimum of the two input tensors.
The operation will be broadcast according to [numpy-broadcasting-rule]. The input tensors must be bidirectionally broadcastable. The rank of the output tensor is the maximum rank of the input tensors. For each dimension of the output tensor, its size is the maximum size along that dimension of the input tensors.
partial interface MLGraphBuilder {
  MLOperand add(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});
  MLOperand sub(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});
  MLOperand mul(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});
  MLOperand div(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});
  MLOperand max(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});
  MLOperand min(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});
  MLOperand pow(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});
};

partial dictionary MLOpSupportLimits {
  MLBinarySupportLimits add;
  MLBinarySupportLimits sub;
  MLBinarySupportLimits mul;
  MLBinarySupportLimits div;
  MLBinarySupportLimits max;
  MLBinarySupportLimits min;
  MLBinarySupportLimits pow;
};
-
a: an MLOperand. The first input tensor.
-
b: an MLOperand. The second input tensor.
-
options: an MLOperatorOptions. Specifies the optional parameters of the operation.
Returns: an MLOperand. The output tensor that contains the result of element-wise binary operation of the two input tensors.
-
add : Add the values of the two input tensors, element-wise.
-
sub : Subtract the values of the second input tensor from the values of the first input tensor, element-wise.
-
mul : Multiply the values of the two input tensors, element-wise.
-
div : Divide the values of the first input tensor by the values of the second input tensor, element-wise.
-
max : Select the greater values of the two input tensors, element-wise.
-
min : Select the lesser values of the two input tensors, element-wise.
-
pow : Compute the values of the first input tensor to the power of the values of the second input tensor, element-wise.
operand | allowed data types | allowed ranks |
---|---|---|
a | any | N |
b | same as a | N |
output | same as a | maximum of a’s rank and b’s rank |
MLOpSupportLimits has the following members for element-wise binary operations:
-
add, of type MLBinarySupportLimits
-
Support limits for operator add().
-
sub, of type MLBinarySupportLimits
-
Support limits for operator sub().
-
mul, of type MLBinarySupportLimits
-
Support limits for operator mul().
-
div, of type MLBinarySupportLimits
-
Support limits for operator div().
-
max, of type MLBinarySupportLimits
-
Support limits for operator max().
-
min, of type MLBinarySupportLimits
-
Support limits for operator min().
-
pow, of type MLBinarySupportLimits
-
Support limits for operator pow().
To create an element-wise binary operation given string op, MLOperand a, MLOperand b, and MLOperatorOptions options, run the following steps:
-
Assert : op is one of "add", "sub", "mul", "div", "max", "min", "pow".
-
If this can not build, then throw an "InvalidStateError" DOMException.
-
If validating operand with this and any of a and b returns false, then throw a TypeError.
-
If a’s dataType is not equal to b’s dataType, then throw a TypeError.
-
Let outputShape be the result of bidirectionally broadcasting a ’s shape and b ’s shape .
-
Let descriptor be the result of creating an MLOperandDescriptor given a ’s dataType and outputShape .
-
Make graph connections:
-
Let output be the result of creating an MLOperand given this and descriptor .
-
Let operator be an operator for the op operation, given a , b , and options .
-
Set output.[[operator]] to operator.
-
Set operator ’s inputs to a and b .
-
Set operator ’s output to output .
-
-
Return output .
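The output shape computed in the steps above follows the broadcasting rule described at the start of this section. The following non-normative sketch assumes the bidirectional broadcasting algorithm (defined elsewhere in this specification) matches the [numpy-broadcasting-rule] behavior; the function name broadcastShapes is hypothetical.

```javascript
// Bidirectionally broadcast two shapes, aligning them from the last dimension.
function broadcastShapes(shapeA, shapeB) {
  const rank = Math.max(shapeA.length, shapeB.length);
  const output = new Array(rank);
  for (let i = 0; i < rank; ++i) {
    // Missing leading dimensions are treated as size 1.
    const dimA = shapeA[shapeA.length - 1 - i] ?? 1;
    const dimB = shapeB[shapeB.length - 1 - i] ?? 1;
    if (dimA !== dimB && dimA !== 1 && dimB !== 1) {
      throw new TypeError(`Shapes are not broadcastable: ${dimA} vs ${dimB}`);
    }
    output[rank - 1 - i] = dimA === 1 ? dimB : dimA;
  }
  return output;
}

console.log(broadcastShapes([8, 1, 6, 1], [7, 1, 5])); // [ 8, 7, 6, 5 ]
console.log(broadcastShapes([2, 3], [3]));             // [ 2, 3 ]
```

The output rank is the larger of the two input ranks, matching the "allowed ranks" column for output in the table above.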
The element-wise binary operation algorithms invoke the creating an element-wise binary operation steps as follows.
The add(a, b, options) method steps are:
-
Let output be the result of creating an element-wise binary operation given "add", a, b, and options.
-
Return output.
The sub(a, b, options) method steps are:
-
Let output be the result of creating an element-wise binary operation given "sub", a, b, and options.
-
Return output.
The mul(a, b, options) method steps are:
-
Let output be the result of creating an element-wise binary operation given "mul", a, b, and options.
-
Return output.
The div(a, b, options) method steps are:
-
Let output be the result of creating an element-wise binary operation given "div", a, b, and options.
-
Return output.
The max(a, b, options) method steps are:
-
Let output be the result of creating an element-wise binary operation given "max", a, b, and options.
-
Return output.
The min(a, b, options) method steps are:
-
Let output be the result of creating an element-wise binary operation given "min", a, b, and options.
-
Return output.
The pow(a, b, options) method steps are:
-
Let output be the result of creating an element-wise binary operation given "pow", a, b, and options.
-
Return output.
7.9.14. Element-wise logical operations
Compare input tensors element-wise and return a "uint8" tensor of values 0 (false) or 1 (true) for the comparisons. For single-operand operations, return the logical results of the operation.
For multiple-operand operations, the operation will be broadcast according to [numpy-broadcasting-rule] . The input tensors must be bidirectionally broadcastable . The rank of the output tensor is the maximum rank of the input tensors. For each dimension of the output tensor, its size is the maximum size along that dimension of the input tensors.
partial interface MLGraphBuilder {
  MLOperand equal(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});
  MLOperand notEqual(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});
  MLOperand greater(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});
  MLOperand greaterOrEqual(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});
  MLOperand lesser(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});
  MLOperand lesserOrEqual(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});
  MLOperand logicalNot(MLOperand a, optional MLOperatorOptions options = {});
  MLOperand logicalAnd(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});
  MLOperand logicalOr(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});
  MLOperand logicalXor(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});
};

dictionary MLLogicalNotSupportLimits {
  MLTensorLimits a;
  MLDataTypeLimits output;
};

partial dictionary MLOpSupportLimits {
  MLBinarySupportLimits equal;
  MLBinarySupportLimits notEqual;
  MLBinarySupportLimits greater;
  MLBinarySupportLimits greaterOrEqual;
  MLBinarySupportLimits lesser;
  MLBinarySupportLimits lesserOrEqual;
  MLLogicalNotSupportLimits logicalNot;
  MLBinarySupportLimits logicalAnd;
  MLBinarySupportLimits logicalOr;
  MLBinarySupportLimits logicalXor;
};
-
a: an MLOperand. The first input tensor.
-
b: an MLOperand. The second input tensor when specified.
-
options: an MLOperatorOptions. Specifies the optional parameters of the operation.
Returns: an MLOperand. The output tensor that contains the result of element-wise comparison of the two input tensors.
operand | allowed data types | allowed ranks |
---|---|---|
a | specified as part of operation steps | N |
b | same as a | N |
output | "uint8" | maximum of a’s rank and b’s rank |
MLLogicalNotSupportLimits has the following members:
-
a, of type MLTensorLimits
-
MLTensorLimits for a operand.
-
output, of type MLDataTypeLimits
-
MLDataTypeLimits for output operand.
MLOpSupportLimits has the following members for element-wise logical operations:
-
equal, of type MLBinarySupportLimits
-
Support limits for operator equal().
-
notEqual, of type MLBinarySupportLimits
-
Support limits for operator notEqual().
-
greater, of type MLBinarySupportLimits
-
Support limits for operator greater().
-
greaterOrEqual, of type MLBinarySupportLimits
-
Support limits for operator greaterOrEqual().
-
lesser, of type MLBinarySupportLimits
-
Support limits for operator lesser().
-
lesserOrEqual, of type MLBinarySupportLimits
-
Support limits for operator lesserOrEqual().
-
logicalNot, of type MLLogicalNotSupportLimits
-
Support limits for operator logicalNot().
-
logicalAnd, of type MLBinarySupportLimits
-
Support limits for operator logicalAnd().
-
logicalOr, of type MLBinarySupportLimits
-
Support limits for operator logicalOr().
-
logicalXor, of type MLBinarySupportLimits
-
Support limits for operator logicalXor().
-
equal : Compare if the values of the two input tensors are equal, element-wise.
-
notEqual : Compare if the values of the two input tensors are not equal, element-wise.
-
greater : Compare if the values of the first input tensor are greater than the values of the second input tensor, element-wise.
-
greaterOrEqual : Compare if the values of the first input tensor are greater than or equal to the values of the second input tensor, element-wise.
-
lesser : Compare if the values of the first input tensor are lesser than the values of the second input tensor, element-wise.
-
lesserOrEqual : Compare if the values of the first input tensor are lesser than or equal to the values of the second input tensor, element-wise.
-
logicalNot : Invert the values of the input tensor to values 0 or 1, element-wise. Specifically, when the input value is non-zero, invert it to 0. Conversely, for a zero input value, invert it to 1.
-
logicalAnd : Compute the logical and of the two input tensors, element-wise, treating any non-zero value as true and returning elements of 0 or 1.
-
logicalOr : Compute the logical or of the two input tensors, element-wise, treating any non-zero value as true and returning elements of 0 or 1.
-
logicalXor : Compute the logical xor of the two input tensors, element-wise, treating any non-zero value as true and returning elements of 0 or 1.
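The truth-value convention used by the logical operations above (any non-zero value is true, results are 0 or 1) can be illustrated with a non-normative sketch over plain lists. The names logicalOps and applyLogical are hypothetical, and the sketch assumes both inputs have the same shape, so it does not perform broadcasting.

```javascript
// Scalar reference semantics: non-zero is true, results are 0 or 1.
const logicalOps = {
  logicalNot: (a) => (a !== 0 ? 0 : 1),
  logicalAnd: (a, b) => (a !== 0 && b !== 0 ? 1 : 0),
  logicalOr: (a, b) => (a !== 0 || b !== 0 ? 1 : 0),
  logicalXor: (a, b) => ((a !== 0) !== (b !== 0) ? 1 : 0),
};

// Apply an operation element-wise and return the result as "uint8" data.
function applyLogical(op, a, b) {
  return Uint8Array.from(a, (x, i) => (b === undefined ? logicalOps[op](x) : logicalOps[op](x, b[i])));
}

const a = [0, 2, 0, 5];
const b = [0, 0, 3, 7];
console.log(Array.from(applyLogical("logicalNot", a)));    // [ 1, 0, 1, 0 ]
console.log(Array.from(applyLogical("logicalAnd", a, b))); // [ 0, 0, 0, 1 ]
console.log(Array.from(applyLogical("logicalOr", a, b)));  // [ 0, 1, 1, 1 ]
console.log(Array.from(applyLogical("logicalXor", a, b))); // [ 0, 1, 1, 0 ]
```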
Although greaterOrEqual() and lesserOrEqual() can each be implemented in terms of operations logicalNot(), lesser(), and greater() (in other words builder.greaterOrEqual(a, b) is builder.logicalNot(builder.lesser(a, b))), they are specifically defined to handle NaN cases and, for performance reasons, to avoid double comparisons.
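The NaN case mentioned above can be seen with ordinary JavaScript numbers. This non-normative sketch assumes IEEE 754 comparison semantics, under which every ordered comparison involving NaN is false; the two helper names are hypothetical scalar stand-ins for the corresponding operations.

```javascript
// Direct comparison, as a dedicated greaterOrEqual() would compute it.
const greaterOrEqualScalar = (a, b) => (a >= b ? 1 : 0);
// Emulation as logicalNot(lesser(a, b)).
const notLesserScalar = (a, b) => (a < b ? 0 : 1);

// For ordinary values the two formulations agree.
console.log(greaterOrEqualScalar(2, 1), notLesserScalar(2, 1)); // 1 1
console.log(greaterOrEqualScalar(1, 2), notLesserScalar(1, 2)); // 0 0

// With NaN they differ: NaN >= 1 is false, but NaN < 1 is also false,
// so negating the latter yields 1.
console.log(greaterOrEqualScalar(NaN, 1), notLesserScalar(NaN, 1)); // 0 1
```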
To create an element-wise logical operation given string op, MLOperand a, an optional MLOperand b, and MLOperatorOptions options, run the following steps:
-
Assert : op is one of "equal", "notEqual", "greater", "greaterOrEqual", "lesser", "lesserOrEqual", "logicalNot", "logicalAnd", "logicalOr", "logicalXor".
-
If this can not build, then throw an "InvalidStateError" DOMException.
-
If validating operand with this and a returns false, then throw a TypeError.
-
If op is one of "logicalNot", "logicalAnd", "logicalOr", "logicalXor", then:
-
If b is passed, then:
-
Otherwise:
-
Let descriptor be the result of creating an MLOperandDescriptor given "uint8" and outputShape.
-
Make graph connections:
-
Let output be the result of creating an MLOperand given this and descriptor .
-
Let operator be an operator for the op operation, given a and (if b is passed) b , and options .
-
Set output.[[operator]] to operator.
-
Set operator ’s inputs to a and (if b is passed) b .
-
Set operator ’s output to output .
-
-
Return output .
The element-wise logical operation algorithms invoke the creating an element-wise logical operation steps as follows.
The equal(a, b, options) method steps are:
-
Let output be the result of creating an element-wise logical operation given "equal", a, b, and options.
-
Return output.
The notEqual(a, b, options) method steps are:
-
Let output be the result of creating an element-wise logical operation given "notEqual", a, b, and options.
-
Return output.
The greater(a, b, options) method steps are:
-
Let output be the result of creating an element-wise logical operation given "greater", a, b, and options.
-
Return output.
The greaterOrEqual(a, b, options) method steps are:
-
Let output be the result of creating an element-wise logical operation given "greaterOrEqual", a, b, and options.
-
Return output.
The lesser(a, b, options) method steps are:
-
Let output be the result of creating an element-wise logical operation given "lesser", a, b, and options.
-
Return output.
The lesserOrEqual(a, b, options) method steps are:
-
Let output be the result of creating an element-wise logical operation given "lesserOrEqual", a, b, and options.
-
Return output.
The logicalNot(a, options) method steps are:
-
Let output be the result of creating an element-wise logical operation given "logicalNot", a, and options.
-
Return output.
The logicalAnd(a, b, options) method steps are:
-
Let output be the result of creating an element-wise logical operation given "logicalAnd", a, b, and options.
-
Return output.
The logicalOr(a, b, options) method steps are:
-
Let output be the result of creating an element-wise logical operation given "logicalOr", a, b, and options.
-
Return output.
The logicalXor(a, b, options) method steps are:
-
Let output be the result of creating an element-wise logical operation given "logicalXor", a, b, and options.
-
Return output.
7.9.15. Element-wise unary operations
Compute the element-wise unary operation for input tensor.
partial interface MLGraphBuilder {
  MLOperand abs(MLOperand input, optional MLOperatorOptions options = {});
  MLOperand ceil(MLOperand input, optional MLOperatorOptions options = {});
  MLOperand cos(MLOperand input, optional MLOperatorOptions options = {});
  MLOperand erf(MLOperand input, optional MLOperatorOptions options = {});
  MLOperand exp(MLOperand input, optional MLOperatorOptions options = {});
  MLOperand floor(MLOperand input, optional MLOperatorOptions options = {});
  MLOperand identity(MLOperand input, optional MLOperatorOptions options = {});
  MLOperand log(MLOperand input, optional MLOperatorOptions options = {});
  MLOperand neg(MLOperand input, optional MLOperatorOptions options = {});
  MLOperand reciprocal(MLOperand input, optional MLOperatorOptions options = {});
  MLOperand sin(MLOperand input, optional MLOperatorOptions options = {});
  MLOperand sign(MLOperand input, optional MLOperatorOptions options = {});
  MLOperand sqrt(MLOperand input, optional MLOperatorOptions options = {});
  MLOperand tan(MLOperand input, optional MLOperatorOptions options = {});
};

partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits abs;
  MLSingleInputSupportLimits ceil;
  MLSingleInputSupportLimits cos;
  MLSingleInputSupportLimits erf;
  MLSingleInputSupportLimits exp;
  MLSingleInputSupportLimits floor;
  MLSingleInputSupportLimits identity;
  MLSingleInputSupportLimits log;
  MLSingleInputSupportLimits neg;
  MLSingleInputSupportLimits reciprocal;
  MLSingleInputSupportLimits sin;
  MLSingleInputSupportLimits sign;
  MLSingleInputSupportLimits sqrt;
  MLSingleInputSupportLimits tan;
};
-
input: an MLOperand. The input tensor.
-
options: an MLOperatorOptions. Specifies the optional parameters of the operation.
Returns: an MLOperand. The output tensor that contains the result of element-wise unary operation of the input tensor. The shape of the output tensor is the same as the shape of input tensor.
operand | allowed data types | allowed ranks |
---|---|---|
input | specified as part of operation steps | N |
output | same as input | same as input |
MLOpSupportLimits has the following members for element-wise unary operations:
-
abs, of type MLSingleInputSupportLimits
-
Support limits for operator abs().
-
ceil, of type MLSingleInputSupportLimits
-
Support limits for operator ceil().
-
cos, of type MLSingleInputSupportLimits
-
Support limits for operator cos().
-
erf, of type MLSingleInputSupportLimits
-
Support limits for operator erf().
-
exp, of type MLSingleInputSupportLimits
-
Support limits for operator exp().
-
floor, of type MLSingleInputSupportLimits
-
Support limits for operator floor().
-
identity, of type MLSingleInputSupportLimits
-
Support limits for operator identity().
-
log, of type MLSingleInputSupportLimits
-
Support limits for operator log().
-
neg, of type MLSingleInputSupportLimits
-
Support limits for operator neg().
-
reciprocal, of type MLSingleInputSupportLimits
-
Support limits for operator reciprocal().
-
sin, of type MLSingleInputSupportLimits
-
Support limits for operator sin().
-
sign, of type MLSingleInputSupportLimits
-
Support limits for operator sign().
-
sqrt, of type MLSingleInputSupportLimits
-
Support limits for operator sqrt().
-
tan, of type MLSingleInputSupportLimits
-
Support limits for operator tan().
-
abs : Compute the absolute value of the input tensor, element-wise.
-
ceil : Compute the ceiling of the input tensor, element-wise.
-
cos : Compute the cosine of the input tensor, element-wise.
-
erf : Compute the error function [Error-Function] of the input tensor, element-wise.
-
exp : Compute the exponential of the input tensor, element-wise.
-
floor : Compute the floor of the input tensor, element-wise.
-
identity : Copy the value of the input tensor to the output tensor, element-wise.
-
log : Compute the natural logarithm of the input tensor, element-wise.
-
neg : Compute the numerical negative value of the input tensor, element-wise.
-
reciprocal : Compute the reciprocal of the input tensor, element-wise.
-
sin : Compute the sine of the input tensor, element-wise.
-
sign : Compute the sign (-1, 0, 1) of the input tensor, element-wise, returning 1 if > 0, -1 if < 0, and 0 otherwise.
-
sqrt : Compute the square root of the input tensor, element-wise.
-
tan : Compute the tangent of the input tensor, element-wise.
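Most of the operations listed above have a direct scalar counterpart in JavaScript, which gives a convenient non-normative reference for their per-element behavior. The table name unaryOps and the helper applyUnary are hypothetical. erf is omitted because JavaScript has no built-in error function, and sign is written out to follow the wording above literally (which, in this sketch, means NaN maps to 0).

```javascript
// Scalar reference semantics for the element-wise unary operations.
const unaryOps = {
  abs: Math.abs,
  ceil: Math.ceil,
  cos: Math.cos,
  exp: Math.exp,
  floor: Math.floor,
  identity: (x) => x,
  log: Math.log, // natural logarithm
  neg: (x) => -x,
  reciprocal: (x) => 1 / x,
  sin: Math.sin,
  // 1 if > 0, -1 if < 0, and 0 otherwise.
  sign: (x) => (x > 0 ? 1 : x < 0 ? -1 : 0),
  sqrt: Math.sqrt,
  tan: Math.tan,
};

// Apply an operation to every element; the output has the input's shape.
function applyUnary(op, values) {
  return values.map((x) => unaryOps[op](x));
}

console.log(applyUnary("sign", [-2.5, 0, 3]));    // [ -1, 0, 1 ]
console.log(applyUnary("reciprocal", [2, 4, 8])); // [ 0.5, 0.25, 0.125 ]
console.log(applyUnary("neg", [1, -2]));          // [ -1, 2 ]
```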
To create an element-wise unary operation given string op, MLOperand input, optional list allowedDataTypes, and options, run the following steps:
-
Assert : op is one of "abs", "ceil", "cos", "erf", "exp", "floor", "identity", "log", "neg", "reciprocal", "sin", "sign", "sqrt", "tan".
-
If this can not build, then throw an "InvalidStateError" DOMException.
-
If validating operand with this and input returns false, then throw a TypeError.
-
If allowedDataTypes is given and it does not contain input’s dataType, then throw a TypeError.
-
Make graph connections:
-
Let output be the result of copying an MLOperand given input .
-
Let operator be an operator for the op operation given options .
-
Set output.[[operator]] to operator.
-
Set operator ’s input to input .
-
Set operator ’s output to output .
-
-
Return output .
The element-wise unary operation algorithms invoke the creating an element-wise unary operation steps as follows.
The abs(input, options) method steps are:
The ceil(input, options) method steps are:
-
Let output be the result of creating an element-wise unary operation given "ceil", input, « "float32", "float16" », and options.
-
Return output.
The cos(input, options) method steps are:
-
Let output be the result of creating an element-wise unary operation given "cos", input, « "float32", "float16" », and options.
-
Return output.
The erf(input, options) method steps are:
-
Let output be the result of creating an element-wise unary operation given "erf", input, « "float32", "float16" », and options.
-
Return output.
The exp(input, options) method steps are:
-
Let output be the result of creating an element-wise unary operation given "exp", input, « "float32", "float16" », and options.
-
Return output.
The floor(input, options) method steps are:
-
Let output be the result of creating an element-wise unary operation given "floor", input, « "float32", "float16" », and options.
-
Return output.
The identity(input, options) method steps are:
-
Let output be the result of creating an element-wise unary operation given "identity", input, and options.
-
Return output.
The log(input, options) method steps are:
-
Let output be the result of creating an element-wise unary operation given "log", input, « "float32", "float16" », and options.
-
Return output.
The neg(input, options) method steps are:
The reciprocal(input, options) method steps are:
-
Let output be the result of creating an element-wise unary operation given "reciprocal", input, « "float32", "float16" », and options.
-
Return output.
The sin(input, options) method steps are:
-
Let output be the result of creating an element-wise unary operation given "sin", input, « "float32", "float16" », and options.
-
Return output.
The sign(input, options) method steps are:
The sqrt(input, options) method steps are:
-
Let output be the result of creating an element-wise unary operation given "sqrt", input, « "float32", "float16" », and options.
-
Return output.
The tan(input, options) method steps are:
-
Let output be the result of creating an element-wise unary operation given "tan", input, « "float32", "float16" », and options.
-
Return output.
The behavior of the sign() operation can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function sign(builder, input, options) {
  const zero = builder.constant(input.dataType, 0);
  const positiveOne = builder.constant(input.dataType, 1);
  const negativeOne = builder.constant(input.dataType, -1);

  return builder.where(
      builder.greater(input, zero), positiveOne,
      builder.where(builder.lesser(input, zero), negativeOne, zero));
}
7.9.16. dequantizeLinear
Dequantizes an integer tensor to floating point tensor using the scale and zero-point bias, where output = (input - zeroPoint) * scale.
The scale and zeroPoint tensors can be smaller than the input tensor as they are blockwise broadcastable.
partial interface MLGraphBuilder {
  MLOperand dequantizeLinear(MLOperand input,
                             MLOperand scale,
                             MLOperand zeroPoint,
                             optional MLOperatorOptions options = {});
};

dictionary MLQuantizeDequantizeLinearSupportLimits {
  MLTensorLimits input;
  MLTensorLimits scale;
  MLTensorLimits zeroPoint;
  MLDataTypeLimits output;
};

partial dictionary MLOpSupportLimits {
  MLQuantizeDequantizeLinearSupportLimits dequantizeLinear;
};
- input: an MLOperand. The input tensor.
- scale: an MLOperand. The scale tensor to multiply each input value by after adjusting by the zero point. It must be blockwise broadcastable with the input.
- zeroPoint: an MLOperand. The zero point tensor to subtract from each input value. It has the same shape as the scale.
- options: an MLOperatorOptions. Specifies the optional parameters of the operation.
Returns: an MLOperand. The output tensor that contains the dequantized values.
operand | allowed data types | allowed ranks |
---|---|---|
input | "uint8", "int8", "uint32", "int32" | N |
scale | "float32", "float16" | same as input |
zeroPoint | same as input | same as input |
output | same as scale | same as input |
MLQuantizeDequantizeLinearSupportLimits has the following members:
- input, of type MLTensorLimits: MLTensorLimits for input operand.
- scale, of type MLTensorLimits: MLTensorLimits for scale operand.
- zeroPoint, of type MLTensorLimits: MLTensorLimits for zeroPoint operand.
- output, of type MLDataTypeLimits: MLDataTypeLimits for output operand.
MLOpSupportLimits has the following member for dequantizeLinear():
- dequantizeLinear, of type MLQuantizeDequantizeLinearSupportLimits: Support limits for operator dequantizeLinear().
The dequantizeLinear(input, scale, zeroPoint, options) method steps are:
- If this.[[hasBuilt]] is true, then throw an "InvalidStateError" DOMException.
- If validating operand with this and any of input, scale, and zeroPoint returns false, then throw a TypeError.
- If input’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- If scale’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- If zeroPoint’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- If scale’s rank or zeroPoint’s rank is not equal to input’s rank, then throw a TypeError.
- If scale’s shape is not equal to zeroPoint’s shape, then throw a TypeError.
- If blockwise broadcasting scale’s shape and input’s shape returns false, then throw a TypeError.
- If blockwise broadcasting zeroPoint’s shape and input’s shape returns false, then throw a TypeError.
- Let outputDescriptor be the result of creating an MLOperandDescriptor given scale’s dataType and input’s shape.
- Make graph connections:
  - Let output be the result of creating an MLOperand given this and outputDescriptor.
  - Let operator be an operator for the "dequantizeLinear" operation, given input, scale, zeroPoint, and options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
The behavior of this operation can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function dequantizeLinear(builder, input, scale, zeroPoint, options) {
  // output = (input - zeroPoint) * scale
  const floatInput = builder.cast(input, scale.dataType);
  const floatZeroPoint = builder.cast(zeroPoint, scale.dataType);
  const upsampledScale = blockwiseExpand(builder, scale, input.shape);
  const upsampledZeroPoint =
      blockwiseExpand(builder, floatZeroPoint, input.shape);
  return builder.mul(
      builder.sub(floatInput, upsampledZeroPoint), upsampledScale);
}

function blockwiseExpand(builder, input, outputShape) {
  // Given the original input and a desired output shape, this expands each axis
  // by repeating the block the number of times per that axis. Though, backend
  // implementations might have much more efficient upsampling operators that
  // can accept multiple dimensions to upsample all dimensions at once by
  // integer multiples (like tile) using nearest neighbor resampling:
  // output = resample(scale, {sizes: input.shape})

  let output = input;

  for (let axis = 0; axis < input.shape.length; ++axis) {
    const oldShape = output.shape;
    const oldDimensionLength = oldShape[axis];
    const newDimensionLength = outputShape[axis];

    if (newDimensionLength != oldDimensionLength) {
      // Since tile/expand can only accept repetitions of entire dimension
      // slices (not repeating individual elements along an axis), temporarily
      // reshape the tensor to enable them to broadcast the elements up to the
      // full block size, utilizing an inserted dimension of size 1.
      const elementRepeatCount = newDimensionLength / oldDimensionLength;
      const flattenedShape = getFlattenedShapeAroundAxis(oldShape, axis);
      const unexpandedShape =
          [flattenedShape[0], flattenedShape[1], 1, flattenedShape[2]];
      const expandedShape = [
        flattenedShape[0], flattenedShape[1], elementRepeatCount,
        flattenedShape[2]
      ];
      const reshapedInput = builder.reshape(output, unexpandedShape);
      output = builder.expand(reshapedInput, expandedShape);

      let newShape = [...oldShape];
      newShape[axis] = newDimensionLength;
      output = builder.reshape(output, newShape);
    }
  }
  return output;
}

// Compute the flattened shape before and after the given axis, yielding a
// 3-element list: e.g.
// - inputShape = [2,3,4,5,6] with axis = 2 yields shape [6,4,30].
// - inputShape = [4] with axis = 0 yields shape [1,4,1].
function getFlattenedShapeAroundAxis(inputShape, axis) {
  axis = Math.max(Math.min(axis, inputShape.length - 1), 0);
  const shapeBefore = inputShape.slice(0, axis);
  const shapeAfter = inputShape.slice(axis + 1, inputShape.length);
  const countBefore = shapeBefore.reduce((a, b) => a * b, 1);
  const countAfter = shapeAfter.reduce((a, b) => a * b, 1);
  return [countBefore, inputShape[axis], countAfter];
}
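To sanity-check the expression output = (input - zeroPoint) * scale outside a WebNN graph, the following is a minimal plain-JavaScript sketch over flat, row-major arrays. The name dequantizeLinearReference and its explicit shape parameters are hypothetical and not part of the API. The sketch assumes that blockwise broadcasting means each input dimension is an integer multiple of the corresponding scale dimension, as the division in blockwiseExpand above suggests.

```javascript
// Hypothetical reference helper (not part of the WebNN API): dequantizes a
// flat, row-major integer tensor with blockwise-broadcast scale/zeroPoint.
function dequantizeLinearReference(
    input, inputShape, scale, zeroPoint, scaleShape) {
  const rank = inputShape.length;
  // Block size per axis. Assumption: each input dimension must be an integer
  // multiple of the matching scale dimension.
  const blockSizes = inputShape.map((size, axis) => {
    if (size % scaleShape[axis] !== 0) {
      throw new TypeError('scale is not blockwise broadcastable to input');
    }
    return size / scaleShape[axis];
  });
  const output = new Float32Array(input.length);
  const coords = new Array(rank).fill(0);
  for (let i = 0; i < input.length; ++i) {
    // Find the scale/zeroPoint element whose block covers this coordinate.
    let scaleIndex = 0;
    for (let axis = 0; axis < rank; ++axis) {
      scaleIndex = scaleIndex * scaleShape[axis] +
          Math.floor(coords[axis] / blockSizes[axis]);
    }
    output[i] = (input[i] - zeroPoint[scaleIndex]) * scale[scaleIndex];
    // Advance the row-major coordinate by one element.
    for (let axis = rank - 1; axis >= 0; --axis) {
      if (++coords[axis] < inputShape[axis]) break;
      coords[axis] = 0;
    }
  }
  return output;
}

// input of shape [2,4]; scale and zeroPoint of shape [2,2], so each scale
// value covers a 1x2 block of the input.
const dequantized = dequantizeLinearReference(
    new Uint8Array([10, 12, 20, 24, 0, 4, 100, 104]), [2, 4],
    new Float32Array([0.5, 2, 1, 0.25]),
    new Uint8Array([10, 20, 0, 100]), [2, 2]);
console.log(Array.from(dequantized));  // → [0, 1, 0, 8, 0, 4, 0, 1]
```

Unlike the emulation, which reshapes and expands tensors, this sketch maps each input coordinate directly to its block, which is easier to verify by hand.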
7.9.17. quantizeLinear
Quantizes a floating point tensor to integer tensor using the scale and zero-point bias (e.g. output = clamp(roundEven(input / scale) + zeroPoint, 0, 255) for "uint8").
The scale and zeroPoint tensors can be smaller than the input tensor as they are blockwise broadcast.
partial interface MLGraphBuilder {
  MLOperand quantizeLinear(MLOperand input,
                           MLOperand scale,
                           MLOperand zeroPoint,
                           optional MLOperatorOptions options = {});
};

partial dictionary MLOpSupportLimits {
  MLQuantizeDequantizeLinearSupportLimits quantizeLinear;
};
- input: an MLOperand. The input tensor.
- scale: an MLOperand. The scale tensor to divide each input value by before adjusting by the zero point. It must be blockwise broadcastable with the input.
- zeroPoint: an MLOperand. The zero point tensor to add to each rescaled input value. It has the same shape as the scale.
- options: an MLOperatorOptions. Specifies the optional parameters of the operation.
Returns: an MLOperand. The output tensor that contains the quantized values.
operand | allowed data types | allowed ranks |
---|---|---|
input | "float32", "float16" | N |
scale | same as input | same as input |
zeroPoint | "uint8", "int8", "uint32", "int32" | same as input |
output | same as zeroPoint | same as input |
MLOpSupportLimits has the following member for quantizeLinear():
- quantizeLinear, of type MLQuantizeDequantizeLinearSupportLimits: Support limits for operator quantizeLinear().
The quantizeLinear(input, scale, zeroPoint, options) method steps are:
- If this.[[hasBuilt]] is true, then throw an "InvalidStateError" DOMException.
- If validating operand with this and any of input, scale, and zeroPoint returns false, then throw a TypeError.
- If input’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- If scale’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- If zeroPoint’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- If scale’s rank or zeroPoint’s rank is not equal to input’s rank, then throw a TypeError.
- If scale’s shape is not equal to zeroPoint’s shape, then throw a TypeError.
- If blockwise broadcasting scale’s shape and input’s shape returns false, then throw a TypeError.
- If blockwise broadcasting zeroPoint’s shape and input’s shape returns false, then throw a TypeError.
- Let outputDescriptor be the result of creating an MLOperandDescriptor given zeroPoint’s dataType and input’s shape.
- Make graph connections:
  - Let output be the result of creating an MLOperand given this and outputDescriptor.
  - Let operator be an operator for the "quantizeLinear" operation, given input, scale, zeroPoint, and options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
The behavior of this operation can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
This emulation relies on a pending roundEven operator in issue #817.
function quantizeLinear(builder, input, scale, zeroPoint, options) {
  // output = clamp(roundEven(input / scale) + zeroPoint, 0, 255)
  // Note blockwiseExpand is defined in dequantizeLinear.
  const floatZeroPoint = builder.cast(zeroPoint, scale.dataType);
  const upsampledScale = blockwiseExpand(builder, scale, input.shape);
  const upsampledZeroPoint =
      blockwiseExpand(builder, floatZeroPoint, input.shape);
  const quantizedInput =
      builder.roundEven(builder.div(input, upsampledScale));
  const zeroPointAdjustedInput =
      builder.add(quantizedInput, upsampledZeroPoint);
  const clampedInput = builder.clamp(
      zeroPointAdjustedInput, {'minValue': 0, 'maxValue': 255});
  return builder.cast(clampedInput, zeroPoint.dataType);
}
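Because the emulation depends on round-half-to-even behavior, a small plain-JavaScript sketch may help when checking expected values by hand. The helpers roundEven and quantizeToUint8 below are hypothetical, cover only the per-tensor "uint8" case (a single scale and zero point), and are not part of the API.

```javascript
// Hypothetical helper (not a WebNN API): round to the nearest integer,
// resolving exact ties toward the nearest even integer.
function roundEven(x) {
  const lower = Math.floor(x);
  const fraction = x - lower;
  if (fraction < 0.5) return lower;
  if (fraction > 0.5) return lower + 1;
  return lower % 2 === 0 ? lower : lower + 1;
}

// Hypothetical per-tensor reference for the "uint8" case:
// output = clamp(roundEven(input / scale) + zeroPoint, 0, 255)
function quantizeToUint8(values, scale, zeroPoint) {
  return Uint8Array.from(values, (value) => {
    const quantized = roundEven(value / scale) + zeroPoint;
    return Math.min(255, Math.max(0, quantized));
  });
}

console.log(
    Array.from(quantizeToUint8([-1, 0.25, 0.75, 1.25, 200], 0.5, 128)));
// → [126, 128, 130, 130, 255]
```

Note that 0.25 / 0.5 = 0.5 rounds to 0 and 1.25 / 0.5 = 2.5 rounds to 2, whereas Math.round would give 1 and 3; the last value saturates at 255.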
7.9.18. elu
Calculate the exponential linear unit function (ELU) on the input tensor element-wise. The calculation follows the expression max(0, x) + alpha * (exp(min(0, x)) - 1).
dictionary MLEluOptions : MLOperatorOptions {
  double alpha = 1;
};

partial interface MLGraphBuilder {
  MLOperand elu(MLOperand input, optional MLEluOptions options = {});
};

partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits elu;
};
MLEluOptions has the following members:
- alpha, of type double, defaulting to 1: A scalar multiplier.
- input: an MLOperand. The input tensor.
- options: an optional MLEluOptions. The optional parameters of the operation.
Returns: an MLOperand. The output tensor of the same shape as input.
operand | allowed data types | allowed ranks |
---|---|---|
input | "float32", "float16" | N |
output | same as input | same as input |
MLOpSupportLimits has the following members for elu():
- elu, of type MLSingleInputSupportLimits: Support limits for operator elu().
The elu(input, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- If input’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- Set options.alpha to the result of casting options.alpha to input’s dataType.
- Make graph connections:
  - Let output be the result of copying an MLOperand given input.
  - Let operator be an operator for the "elu" operation, given options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
The behavior of this operation can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function elu(builder, input, options) {
  return builder.add(
      builder.max(builder.constant(input.dataType, 0), input),
      builder.mul(
          builder.constant(input.dataType, options.alpha),
          builder.sub(
              builder.exp(
                  builder.min(builder.constant(input.dataType, 0), input)),
              builder.constant(input.dataType, 1))));
}
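For checking individual values, the same expression can be evaluated on a single number. The helper eluScalar is a hypothetical name used only for illustration; it evaluates in double precision and does not model the cast of alpha to the input’s dataType.

```javascript
// Hypothetical scalar reference (not part of the WebNN API):
// max(0, x) + alpha * (exp(min(0, x)) - 1)
function eluScalar(x, alpha = 1) {
  return Math.max(0, x) + alpha * (Math.exp(Math.min(0, x)) - 1);
}

console.log(eluScalar(2));      // positive inputs pass through unchanged: 2
console.log(eluScalar(-1, 2));  // 2 * (exp(-1) - 1), about -1.264
```

For x ≥ 0 the second term vanishes because exp(0) - 1 is 0, and for x < 0 the first term vanishes, so the two branches never overlap.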
7.9.19. expand
Expand any dimension of size 1 of the input tensor to a larger size according to the new shape. The expansion is consistent with [numpy-broadcasting-rule]. The input tensor must be unidirectionally broadcastable to the new shape; each dimension must be of size 1 or match the sizes of the corresponding output dimensions according to the new shape.
partial interface MLGraphBuilder {
  MLOperand expand(MLOperand input,
                   sequence<[EnforceRange] unsigned long> newShape,
                   optional MLOperatorOptions options = {});
};

partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits expand;
};
- input: an MLOperand. An input tensor.
- newShape: sequence<unsigned long>. The new shape the input tensor is expanded to.
- options: an MLOperatorOptions. Specifies the optional parameters of the operation.
Returns: an MLOperand. The tensor with the expanded shape.
operand | allowed data types | allowed ranks |
---|---|---|
input | any | N |
output | same as input | newShape’s size |
MLOpSupportLimits has the following members for expand():
- expand, of type MLSingleInputSupportLimits: Support limits for operator expand().
The expand(input, newShape, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- Let outputShape be the result of unidirectionally broadcasting input’s shape and newShape.
- Let outputDescriptor be the result of creating an MLOperandDescriptor given input’s dataType and outputShape.
- Make graph connections:
  - Let output be the result of creating an MLOperand given this and outputDescriptor.
  - Let operator be an operator for the "expand" operation, given input, newShape, and options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
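The shape rule stated at the top of this section (each input dimension is 1 or equals the corresponding dimension of the new shape) can be sketched as a plain-JavaScript check. The helper unidirectionalBroadcastShape is a hypothetical name, and the right-alignment of dimensions is an assumption taken from the NumPy broadcasting rule the section cites; the specification’s own unidirectional broadcasting algorithm remains the authority.

```javascript
// Hypothetical helper (not part of the WebNN API). Returns the broadcast
// shape, or throws if inputShape cannot be broadcast to newShape.
function unidirectionalBroadcastShape(inputShape, newShape) {
  if (inputShape.length > newShape.length) {
    throw new TypeError('input rank exceeds the rank of the new shape');
  }
  // Assumption: dimensions are aligned from the right, as in NumPy.
  const offset = newShape.length - inputShape.length;
  for (let i = 0; i < inputShape.length; ++i) {
    const from = inputShape[i];
    const to = newShape[offset + i];
    if (from !== 1 && from !== to) {
      throw new TypeError(`dimension ${i} (${from}) cannot expand to ${to}`);
    }
  }
  return [...newShape];
}

console.log(unidirectionalBroadcastShape([2, 1], [2, 4]));  // → [2, 4]
console.log(unidirectionalBroadcastShape([3], [2, 3]));     // → [2, 3]
```

A shape such as [3] cannot be expanded to [2, 4], because 3 is neither 1 nor 4; the sketch throws a TypeError in that case.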
7.9.20. gather
Gather values of the input tensor along an axis according to the indices.
dictionary MLGatherOptions : MLOperatorOptions {
  [EnforceRange] unsigned long axis = 0;
};

partial interface MLGraphBuilder {
  MLOperand gather(MLOperand input,
                   MLOperand indices,
                   optional MLGatherOptions options = {});
};

dictionary MLGatherSupportLimits {
  MLTensorLimits input;
  MLTensorLimits indices;
  MLDataTypeLimits output;
};

partial dictionary MLOpSupportLimits {
  MLGatherSupportLimits gather;
};
MLGatherOptions has the following members:
- axis, of type unsigned long, defaulting to 0: The axis along which the gathered values are obtained. Its value must be in the range [0, N-1] where N is the rank of the input tensor.
- input: an MLOperand. The input N-D tensor from which the values are gathered.
- indices: an MLOperand. The indices N-D tensor of the input values to gather. The values must be of type "int32", "uint32", or "int64", and must be in the range -N (inclusive) to N (exclusive) where N is the size of the input dimension indexed by axis, and a negative index means indexing from the end of the dimension.
- options: an optional MLGatherOptions. The optional parameters of the operation.
Returns: an MLOperand. The output N-D tensor of rank equal to the rank of input + the rank of indices - 1.
The indices parameter to gather() can not be clamped to the allowed range when the graph is built because the inputs are not known until execution. Implementations can introduce clamp() in the compiled graph if the specified clamping behavior is not provided by the underlying platform. Similarly, if the underlying platform does not support negative indices, the implementation can introduce operations in the compiled graph to transform a negative index from the end of the dimension into a positive index.
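As an illustration of the two adjustments described in this note, the following plain-JavaScript sketch clamps an index to the range -N (inclusive) to N (exclusive) and then converts a negative index into its positive equivalent. The helper normalizeIndex is a hypothetical name; an actual implementation would express the same arithmetic with graph operations at execution time.

```javascript
// Hypothetical helper (not part of the WebNN API): clamp an index to
// [-dimensionSize, dimensionSize - 1], then map negatives to positives.
function normalizeIndex(index, dimensionSize) {
  const clamped =
      Math.min(Math.max(index, -dimensionSize), dimensionSize - 1);
  return clamped < 0 ? clamped + dimensionSize : clamped;
}

console.log(normalizeIndex(-1, 4));  // → 3 (last element)
console.log(normalizeIndex(5, 4));   // → 3 (clamped to the upper bound)
console.log(normalizeIndex(-9, 4));  // → 0 (clamped to -4, then shifted)
```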
operand | allowed data types | allowed ranks |
---|---|---|
input | any | N |
indices | "int32", "uint32", "int64" | N |
output | same as input | input’s rank + indices’s rank - 1 |
MLGatherSupportLimits has the following members:
- input, of type MLTensorLimits: MLTensorLimits for input operand.
- indices, of type MLTensorLimits: MLTensorLimits for indices operand.
- output, of type MLDataTypeLimits: MLDataTypeLimits for output operand.
MLOpSupportLimits has the following members for gather():
- gather, of type MLGatherSupportLimits: Support limits for operator gather().
The gather(input, indices, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and any of input and indices returns false, then throw a TypeError.
- If indices’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- Let inputShape be input’s shape and inputRank be input’s rank.
- Let indicesShape be indices’s shape.
- Let axis be options.axis.
- If axis is greater than or equal to inputRank, then throw a TypeError.
- Let dimCount be zero.
- Let outputRank be zero.
- Let outputShape be an empty list.
- For each size of inputShape:
  - If dimCount is equal to axis, then break.
  - Set outputShape[dimCount] to size.
  - Increment dimCount by one.
- Set outputRank to dimCount.
- Set dimCount to zero.
- For each size of indicesShape:
  - Set outputShape[outputRank + dimCount] to size.
  - Increment dimCount by one.
- Set outputRank to outputRank + dimCount.
- Set dimCount to zero.
- For each size of inputShape:
  - If dimCount is greater than axis, then set outputShape[outputRank + dimCount - axis - 1] to size.
  - Increment dimCount by one.
- Let desc be the result of creating an MLOperandDescriptor given input’s dataType and outputShape.
- Make graph connections:
  - Let output be the result of creating an MLOperand given this and desc.
  - Let operator be an operator for the "gather" operation, given input, indices, and options.
  - Set output.[[operator]] to operator.
  - Set operator’s inputs to input and indices.
  - Set operator’s output to output.
- Return output.
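The output shape built by the three loops above amounts to the input dimensions before axis, followed by all of the indices dimensions, followed by the input dimensions after axis. A compact plain-JavaScript sketch, with gatherOutputShape as a hypothetical helper name:

```javascript
// Hypothetical helper (not part of the WebNN API): output shape of gather.
function gatherOutputShape(inputShape, indicesShape, axis = 0) {
  if (axis >= inputShape.length) {
    throw new TypeError('axis must be less than the input rank');
  }
  return [
    ...inputShape.slice(0, axis),   // input dimensions before axis
    ...indicesShape,                // every dimension of indices
    ...inputShape.slice(axis + 1),  // input dimensions after axis
  ];
}

console.log(gatherOutputShape([4, 3], [2], 0));     // → [2, 3]
console.log(gatherOutputShape([4, 3], [2, 2], 1));  // → [4, 2, 2]
```

The resulting rank is the input’s rank + the indices’s rank - 1, matching the table above.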
Examples of how gather works in different slicing schemes.
// input of shape [4,3]:
//   [[ 0,  1,  2],
//    [10, 11, 12],
//    [20, 21, 22],
//    [30, 31, 32]]
const input = builder.constant(
    {dataType: 'float32', shape: [4, 3]},
    new Float32Array([0, 1, 2, 10, 11, 12, 20, 21, 22, 30, 31, 32]));

// axis = 0 (default)
// indices of shape [2]:
//   [3,1]
// output of shape [2,3]:
//   [[30, 31, 32],
//    [10, 11, 12]]
const indices1 = builder.constant(
    {dataType: 'uint32', shape: [2]}, new Uint32Array([3, 1]));
const output1 = builder.gather(input, indices1);

// axis = 1
// indices of shape [3]:
//   [2,1,1]
// output of shape [4,3]:
//   [[ 2,  1,  1],
//    [12, 11, 11],
//    [22, 21, 21],
//    [32, 31, 31]]
const indices2 = builder.constant(
    {dataType: 'uint32', shape: [3]}, new Uint32Array([2, 1, 1]));
const output2 = builder.gather(input, indices2, {axis: 1});

// axis = 1
// indices of shape [2,2]:
//   [[0, 1],
//    [1, 2]]
// output of shape [4,2,2]:
//   [[[ 0,  1], [ 1,  2]],
//    [[10, 11], [11, 12]],
//    [[20, 21], [21, 22]],
//    [[30, 31], [31, 32]]]
const indices3 = builder.constant(
    {dataType: 'uint32', shape: [2, 2]}, new Uint32Array([0, 1, 1, 2]));
const output3 = builder.gather(input, indices3, {axis: 1});
7.9.21. gatherElements
Gather values of the input tensor along an axis according to the indices.
partial interface MLGraphBuilder {
  MLOperand gatherElements(MLOperand input,
                           MLOperand indices,
                           optional MLGatherOptions options = {});
};

partial dictionary MLOpSupportLimits {
  MLGatherSupportLimits gatherElements;
};
- input: an MLOperand. The input N-D tensor from which the values are gathered.
- indices: an MLOperand. The indices N-D tensor of the input values to gather. The values must be of type "int32", "uint32", or "int64", and must be in the range -N (inclusive) to N (exclusive) where N is the size of the input dimension indexed by options.axis, and a negative index means indexing from the end of the dimension.
- options: an optional MLGatherOptions. The optional parameters of the operation.
Returns: an MLOperand. The output N-D tensor of rank equal to input’s rank.
operand | allowed data types | allowed ranks |
---|---|---|
input | any | 1 to N |
indices | "int32", "uint32", "int64" | same as input |
output | same as input | same as input |
MLOpSupportLimits has the following members for gatherElements():
- gatherElements, of type MLGatherSupportLimits: Support limits for operator gatherElements().
The indices parameter to gatherElements() can not be clamped to the allowed range when the graph is built because the inputs are not known until execution. Implementations can introduce clamp() in the compiled graph if the specified clamping behavior is not provided by the underlying platform. Similarly, if the underlying platform does not support negative indices, the implementation can introduce operations in the compiled graph to transform a negative index from the end of the dimension into a positive index.
The gatherElements(input, indices, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and any of input and indices returns false, then throw a TypeError.
- If indices’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- If the rank of any of input or indices is not its allowed rank, then throw a TypeError.
- Let axis be options.axis.
- If axis is greater than or equal to input’s rank, then throw a TypeError.
- Let indicesShapeExpected be a copy of input’s shape.
- Set indicesShapeExpected[axis] to indices’s shape[axis].
- If indices’s shape is not equal to indicesShapeExpected, then throw a TypeError.
- Let outputDescriptor be the result of creating an MLOperandDescriptor given input’s dataType and indices’s shape.
- Make graph connections:
  - Let output be the result of creating an MLOperand given this and outputDescriptor.
  - Let operator be an operator for the "gatherElements" operation, given input, indices, and options.
  - Set output.[[operator]] to operator.
  - Set operator’s inputs to input and indices.
  - Set operator’s output to output.
- Return output.
Examples of how gatherElements works in different slicing schemes.
// input of shape [4,3]:
//   [[ 0,  1,  2],
//    [10, 11, 12],
//    [20, 21, 22],
//    [30, 31, 32]]
// indices of shape [2,3]:
//   [[3, 1, 1],
//    [2, 0, 3]]
// axis = 0 (default)
// output of shape [2,3]:
//   [[30, 11, 12],
//    [20,  1, 32]]
const input1 = builder.constant(
    {dataType: 'float32', shape: [4, 3]},
    new Float32Array([0, 1, 2, 10, 11, 12, 20, 21, 22, 30, 31, 32]));
const indices1 = builder.constant(
    {dataType: 'uint32', shape: [2, 3]},
    new Uint32Array([3, 1, 1, 2, 0, 3]));
const output1 = builder.gatherElements(input1, indices1);

// input of shape [4,3]:
//   [[ 0,  1,  2],
//    [10, 11, 12],
//    [20, 21, 22],
//    [30, 31, 32]]
// indices of shape [4,1]:
//   [[2],
//    [1],
//    [0],
//    [2]],
// axis = 1
// output of shape [4,1]:
//   [[ 2],
//    [11],
//    [20],
//    [32]]
const indices2 = builder.constant(
    {dataType: 'uint32', shape: [4, 1]}, new Uint32Array([2, 1, 0, 2]));
const output2 = builder.gatherElements(input1, indices2, {axis: 1});

// input of shape [4,2,2]:
//   [[[  0,   1],
//     [ 10,  11]],
//    [[100, 101],
//     [110, 111]],
//    [[200, 201],
//     [210, 211]],
//    [[300, 301],
//     [310, 311]]]
// indices of shape [1,2,2]:
//   [[[0, 2],
//     [1, 3]]],
// axis = 0
// output of shape [1,2,2]:
//   [[[  0, 201],
//     [110, 311]]]
const inputData3 = new Float32Array([
  0, 1, 10, 11, 100, 101, 110, 111, 200, 201, 210, 211, 300, 301, 310, 311
]);
const input3 =
    builder.constant({dataType: 'float32', shape: [4, 2, 2]}, inputData3);
const indices3 = builder.constant(
    {dataType: 'uint32', shape: [1, 2, 2]}, new Uint32Array([0, 2, 1, 3]));
const output3 = builder.gatherElements(input3, indices3, {axis: 0});
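The examples above can be reproduced without a WebNN graph using a plain-JavaScript sketch over flat, row-major arrays. The helper gatherElementsReference and its explicit shape parameters are hypothetical and not part of the API. For every coordinate of indices, the sketch reads the input at the same coordinate except along axis, where the index value is used instead.

```javascript
// Hypothetical reference helper (not part of the WebNN API).
function gatherElementsReference(
    input, inputShape, indices, indicesShape, axis = 0) {
  const rank = inputShape.length;
  // Row-major strides of the input tensor.
  const strides = new Array(rank).fill(1);
  for (let d = rank - 2; d >= 0; --d) {
    strides[d] = strides[d + 1] * inputShape[d + 1];
  }
  const output = new Array(indices.length);
  const coords = new Array(rank).fill(0);
  for (let i = 0; i < indices.length; ++i) {
    let index = indices[i];
    // A negative index counts from the end of the dimension.
    if (index < 0) index += inputShape[axis];
    let offset = 0;
    for (let d = 0; d < rank; ++d) {
      offset += (d === axis ? index : coords[d]) * strides[d];
    }
    output[i] = input[offset];
    // Advance the row-major coordinate within the indices tensor.
    for (let d = rank - 1; d >= 0; --d) {
      if (++coords[d] < indicesShape[d]) break;
      coords[d] = 0;
    }
  }
  return output;  // flat data; its shape is indicesShape
}

const data = [0, 1, 2, 10, 11, 12, 20, 21, 22, 30, 31, 32];
console.log(gatherElementsReference(data, [4, 3], [3, 1, 1, 2, 0, 3], [2, 3]));
// → [30, 11, 12, 20, 1, 32]
console.log(gatherElementsReference(data, [4, 3], [2, 1, 0, 2], [4, 1], 1));
// → [2, 11, 20, 32]
```

The output always has the shape of indices, which is why the first example yields a [2,3] result from a [4,3] input.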
7.9.22. gatherND
Gather slices of the input tensor according to the indices.
partial interface MLGraphBuilder {
  MLOperand gatherND(MLOperand input,
                     MLOperand indices,
                     optional MLOperatorOptions options = {});
};

partial dictionary MLOpSupportLimits {
  MLGatherSupportLimits gatherND;
};
- input: an MLOperand. The input N-D tensor from which the values are gathered.
- indices: an MLOperand. The indices array contains entire coordinates into the input tensor, with the rightmost dimension holding the number of dimensions per coordinate. So an indices tensor of shape [10,1] holds 10 single-axis indices, and a shape of [4,3] holds 4 indices of 3D coordinates. The values must be of type "int32", "uint32", or "int64", and each must be in the range -N (inclusive) to N (exclusive) where N is the size of the corresponding input dimension, and a negative index means indexing from the end of the corresponding dimension.
- options: an optional MLOperatorOptions. The optional parameters of the operation.
Returns: an MLOperand. The output N-D tensor of rank equal to the input’s rank + indices’s rank - indices’s shape[-1] - 1.
operand | allowed data types | allowed ranks |
---|---|---|
input | any | 1 to N |
indices | "int32", "uint32", "int64" | 1 to N |
output | same as input | input’s rank + indices’s rank - indices’s shape[-1] - 1 |
MLOpSupportLimits has the following members for gatherND():
- gatherND, of type MLGatherSupportLimits: Support limits for operator gatherND().
The indices parameter to gatherND() can not be clamped to the allowed range when the graph is built because the inputs are not known until execution. Implementations can introduce clamp() in the compiled graph if the specified clamping behavior is not provided by the underlying platform. Similarly, if the underlying platform does not support negative indices, the implementation can introduce operations in the compiled graph to transform a negative index from the end of the dimension into a positive index.
The gatherND(input, indices, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and any of input and indices returns false, then throw a TypeError.
- If indices’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- If the rank of any of input or indices is not its allowed rank, then throw a TypeError.
- Let inputShape be input’s shape and inputRank be input’s rank.
- Let indicesShape be indices’s shape and indicesRank be indices’s rank.
- Let indexableSize be indicesRank - 1.
- Let coordinateSize be indicesShape[indexableSize].
- If coordinateSize is greater than inputRank, then throw a TypeError.
- Let outputShape be an empty list.
- For each index in the range 0 to indexableSize, exclusive:
  - Append indicesShape[index] to outputShape.
- For each index in the range coordinateSize to inputRank, exclusive:
  - Append inputShape[index] to outputShape.
- Let outputDesc be the result of creating an MLOperandDescriptor given input’s dataType and outputShape.
- Make graph connections:
  - Let output be the result of creating an MLOperand given this and outputDesc.
  - Let operator be an operator for the "gatherND" operation, given input, indices, and options.
  - Set output.[[operator]] to operator.
  - Set operator’s inputs to input and indices.
  - Set operator’s output to output.
- Return output.
Examples of how gatherND works in different slicing schemes.
// input of shape [2,2]:
//   [[0, 1],
//    [2, 3]]
// indices of shape [3,2]:
//   [[0, 0],
//    [1, 1],
//    [1, 0]]
// output of shape [3]:
//   [0, 3, 2]
const input1 = builder.constant(
    {dataType: 'float32', shape: [2, 2]}, new Float32Array([0, 1, 2, 3]));

const indices1 = builder.constant(
    {dataType: 'uint32', shape: [3, 2]}, new Uint32Array([0, 0, 1, 1, 1, 0]));

const output1 = builder.gatherND(input1, indices1);

// input of shape [2,2]:
//   [[0, 1],
//    [2, 3]]
// indices of shape [2,1]:
//   [[1],
//    [0]]
// output of shape [2,2]:
//   [[2, 3],  <= row [2, 3] from input coordinates [1, *]
//    [0, 1]]  <= row [0, 1] from input coordinates [0, *]
const indices2 = builder.constant(
    {dataType: 'uint32', shape: [2, 1]}, new Uint32Array([1, 0]));

const output2 = builder.gatherND(input1, indices2);

// input of shape [2,2,2]:
//   [[[0, 1],
//     [2, 3]],
//    [[4, 5],
//     [6, 7]]]
// indices of shape [2,2]:
//   [[0, 1],
//    [1, 0]]
// output of shape [2,2]:
//   [[2, 3],  <= row [2, 3] from input coordinates [0, 1, *]
//    [4, 5]]  <= row [4, 5] from input coordinates [1, 0, *]
const input2 = builder.constant(
    {dataType: 'float32', shape: [2, 2, 2]},
    new Float32Array([0, 1, 2, 3, 4, 5, 6, 7]));

const indices3 = builder.constant(
    {dataType: 'uint32', shape: [2, 2]}, new Uint32Array([0, 1, 1, 0]));

const output3 = builder.gatherND(input2, indices3);

// input of shape [2,2,2]:
//   [[[0, 1],
//     [2, 3]],
//    [[4, 5],
//     [6, 7]]]
// indices of shape [3,1]:
//   [[1],
//    [0],
//    [1]]
// output of shape [3,2,2]:
//   [[[4, 5],   <= block [[4, 5], [6, 7]] from input coordinates [1, *, *]
//     [6, 7]],
//    [[0, 1],   <= block [[0, 1], [2, 3]] from input coordinates [0, *, *]
//     [2, 3]],
//    [[4, 5],   <= block [[4, 5], [6, 7]] from input coordinates [1, *, *]
//     [6, 7]]]
const indices4 = builder.constant(
    {dataType: 'uint32', shape: [3, 1]}, new Uint32Array([1, 0, 1]));

const output4 = builder.gatherND(input2, indices4);

// input of shape [2,2,2]:
//   [[[0, 1],
//     [2, 3]],
//    [[4, 5],
//     [6, 7]]]
// indices of shape [5,3]:
//   [[0,0,1],
//    [0,1,0],
//    [1,0,0],
//    [1,1,0],
//    [1,1,1]]
// output of shape [5]:
//   [1,2,4,6,7]
const indices5 = builder.constant(
    {dataType: 'uint32', shape: [5, 3]},
    new Uint32Array([0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1]));

const output5 = builder.gatherND(input2, indices5);
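The slicing schemes in the examples above can also be checked without a graph builder. The following is a minimal sketch in plain JavaScript, assuming tensors are written as nested arrays; `gatherNDReference` is a hypothetical helper name, not part of this API, and it performs none of the validation that gatherND() does.

```javascript
// Plain-JavaScript reference for the gatherND slicing scheme.
// Assumption: `input` and `indices` are nested arrays, and the innermost rows
// of `indices` are coordinate tuples into the leading dimensions of `input`.
function gatherNDReference(input, indices) {
  // A row of numbers is a single coordinate tuple.
  if (typeof indices[0] === 'number') {
    // Walk into `input` one coordinate at a time; whatever remains
    // (a scalar, a row, or a block) is the gathered slice.
    return indices.reduce((slice, coordinate) => slice[coordinate], input);
  }
  // Otherwise `indices` is a list of rows (or of lists of rows): recurse.
  return indices.map((row) => gatherNDReference(input, row));
}

const input2D = [[0, 1], [2, 3]];
const input3D = [[[0, 1], [2, 3]], [[4, 5], [6, 7]]];

console.log(gatherNDReference(input2D, [[0, 0], [1, 1], [1, 0]]));
console.log(gatherNDReference(input3D, [[0, 1], [1, 0]]));
```

When the coordinate tuple is shorter than the input rank, the remaining trailing dimensions are returned whole, which is why the `[2,1]` indices in the second example above select entire rows.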
7.9.23. gelu
Compute the Gaussian error linear unit (GELU) function of the input tensor. The calculation follows the expression 0.5 * x * (1 + erf(x / sqrt(2))).
partial interface MLGraphBuilder {
  MLOperand gelu(MLOperand input, optional MLOperatorOptions options = {});
};

partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits gelu;
};
Arguments:
- input: an MLOperand. The input tensor.
- options: an MLOperatorOptions. Specifies the optional parameters of the operation.
Returns:
operand | allowed data types | allowed ranks |
---|---|---|
input | "float32", "float16" | N |
output | same as input | same as input |

MLOpSupportLimits has the following member for gelu():
- gelu, of type MLSingleInputSupportLimits
  Support limits for operator gelu().
The gelu(input, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- If input’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- Make graph connections:
  - Let output be the result of copying an MLOperand given input.
  - Let operator be an operator for the "gelu" operation given options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
The behavior of this operation can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function gelu(builder, input) {
  return builder.mul(
      builder.mul(input, builder.constant(input.dataType, 0.5)),
      builder.add(
          builder.constant(input.dataType, 1),
          builder.erf(builder.div(
              input, builder.sqrt(builder.constant(input.dataType, 2))))));
}
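To get a feel for the values the GELU expression produces, it can be evaluated on plain numbers. The sketch below is illustrative only: JavaScript has no built-in `erf`, so it assumes the Abramowitz and Stegun polynomial approximation (formula 7.1.26, absolute error about 1.5e-7), and the names `erfApprox` and `geluScalar` are hypothetical.

```javascript
// Approximate erf(x) with the Abramowitz-Stegun 7.1.26 polynomial.
// Assumption: an absolute error of roughly 1.5e-7 is acceptable here.
function erfApprox(x) {
  const sign = x < 0 ? -1 : 1;
  const ax = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * ax);
  // Horner evaluation of a1*t + a2*t^2 + a3*t^3 + a4*t^4 + a5*t^5.
  const poly =
      ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t -
        0.284496736) * t + 0.254829592) * t;
  return sign * (1 - poly * Math.exp(-ax * ax));
}

// Scalar form of the expression 0.5 * x * (1 + erf(x / sqrt(2))).
function geluScalar(x) {
  return 0.5 * x * (1 + erfApprox(x / Math.SQRT2));
}

for (const x of [-3, -1, 0, 1, 3]) {
  console.log(x, geluScalar(x));
}
```

For large positive inputs the result approaches x, and for large negative inputs it approaches 0, which is the behavior that distinguishes GELU from a hard threshold such as relu.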
7.9.24. gemm
Calculate the general matrix multiplication of the Basic Linear Algebra Subprograms. The calculation follows the expression alpha * A * B + beta * C, where A is a 2-D tensor with shape [M, K] or [K, M], B is a 2-D tensor with shape [K, N] or [N, K], and C is unidirectionally broadcastable to the shape [M, N]. A and B can optionally be transposed prior to the calculation.
dictionary MLGemmOptions : MLOperatorOptions {
  MLOperand c;
  double alpha = 1.0;
  double beta = 1.0;
  boolean aTranspose = false;
  boolean bTranspose = false;
};

partial interface MLGraphBuilder {
  MLOperand gemm(MLOperand a, MLOperand b, optional MLGemmOptions options = {});
};

dictionary MLGemmSupportLimits {
  MLTensorLimits a;
  MLTensorLimits b;
  MLTensorLimits c;
  MLDataTypeLimits output;
};

partial dictionary MLOpSupportLimits {
  MLGemmSupportLimits gemm;
};
MLGemmOptions has the following members:
- c, of type MLOperand
  The third input tensor. It is either a scalar, or of a shape that is unidirectionally broadcastable to the shape [M, N]. When it is not specified, the computation is done as if c is a scalar 0.0.
- alpha, of type double, defaulting to 1.0
  A multiplier for the first input.
- beta, of type double, defaulting to 1.0
  A multiplier for the third input c.
- aTranspose, of type boolean, defaulting to false
  Indicates if the first input is transposed prior to calculating the output.
- bTranspose, of type boolean, defaulting to false
  Indicates if the second input is transposed prior to calculating the output.

Arguments:
- a: an MLOperand. The first input 2-D tensor with shape [M, K] if aTranspose is false, or [K, M] if aTranspose is true.
- b: an MLOperand. The second input 2-D tensor with shape [K, N] if bTranspose is false, or [N, K] if bTranspose is true.
- options: an optional MLGemmOptions. The optional parameters of the operation.
Returns: an MLOperand. The output 2-D tensor of shape [M, N] that contains the calculated product of all the inputs.
operand | allowed data types | allowed ranks |
---|---|---|
a | "float32", "float16" | 2 |
b | same as a | 2 |
c | same as a | 0 to 2 |
output | same as a | 2 |

MLGemmSupportLimits has the following members:
- a, of type MLTensorLimits
  MLTensorLimits for a operand.
- b, of type MLTensorLimits
  MLTensorLimits for b operand.
- c, of type MLTensorLimits
  MLTensorLimits for c operand.
- output, of type MLDataTypeLimits
  MLDataTypeLimits for output operand.

MLOpSupportLimits has the following member for gemm():
- gemm, of type MLGemmSupportLimits
  Support limits for operator gemm().
The gemm(a, b, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and any of a and b returns false, then throw a TypeError.
- If the dataType of any of a or b is not one of its allowed data types (according to this table), then throw a TypeError.
- If the rank of any of a or b is not its allowed rank, then throw a TypeError.
- Set options.alpha to the result of casting options.alpha to a’s dataType.
- Set options.beta to the result of casting options.beta to a’s dataType.
- Let shapeA be a clone of a’s shape.
- Let shapeB be a clone of b’s shape.
- If options.aTranspose is true, then reverse the order of the items in shapeA.
- If options.bTranspose is true, then reverse the order of the items in shapeB.
- If shapeA[1] is not equal to shapeB[0], then throw a TypeError.
- If options.c exists, then:
  - If it is not unidirectionally broadcastable to the shape « shapeA[0], shapeB[1] », then throw a TypeError.
  - If its dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- Let desc be the result of creating an MLOperandDescriptor given a’s dataType and « shapeA[0], shapeB[1] ».
- Make graph connections:
  - Let output be the result of creating an MLOperand given this and desc.
  - Let operator be an operator for the "gemm" operation, given options.
  - Set output.[[operator]] to operator.
  - Set operator’s inputs to a and b.
  - Set operator’s output to output.
- Return output.
The behavior of this operation can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function gemm(builder, a, b, options) {
  if (options.aTranspose)
    a = builder.transpose(a);

  if (options.bTranspose)
    b = builder.transpose(b);

  let ab = builder.matmul(
      builder.mul(builder.constant(a.dataType, options.alpha), a), b);
  return (
      options.c ?
          builder.add(
              ab,
              builder.mul(
                  builder.constant(a.dataType, options.beta), options.c)) :
          ab);
}
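The expression alpha * A * B + beta * C, including the optional transposes and the broadcasting of c, can be worked through on small nested arrays. This is a rough sketch in plain JavaScript rather than the API itself: `gemmReference`, `transpose2D` and `broadcastAt` are hypothetical helpers, and c is assumed to be a number, a 1-D array of length 1 or N, or a 2-D array whose dimensions are each 1 or the matching output dimension.

```javascript
// Transpose a 2-D nested array.
function transpose2D(m) {
  return m[0].map((_, col) => m.map((row) => row[col]));
}

// Read element [i][j] of `c` as if it were broadcast to shape [M, N].
function broadcastAt(c, i, j) {
  if (typeof c === 'number') return c;  // scalar
  if (typeof c[0] === 'number') return c[c.length === 1 ? 0 : j];  // 1-D
  const row = c[c.length === 1 ? 0 : i];  // 2-D, first dimension 1 or M
  return row[row.length === 1 ? 0 : j];   // second dimension 1 or N
}

function gemmReference(a, b, options = {}) {
  const {c = 0, alpha = 1, beta = 1, aTranspose = false, bTranspose = false} =
      options;
  const A = aTranspose ? transpose2D(a) : a;
  const B = bTranspose ? transpose2D(b) : b;
  // Mirrors the shapeA[1] / shapeB[0] check in the method steps.
  if (A[0].length !== B.length) {
    throw new TypeError('inner dimensions of a and b do not match');
  }
  return A.map((rowA, i) => B[0].map((_, j) => {
    let sum = 0;
    for (let k = 0; k < B.length; ++k) sum += rowA[k] * B[k][j];
    return alpha * sum + beta * broadcastAt(c, i, j);
  }));
}

console.log(gemmReference([[1, 2], [3, 4]], [[5, 6], [7, 8]], {c: [1, 2]}));
```

Omitting c behaves like adding a scalar 0, matching the description of the c member above.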
7.9.25. gru
Gated Recurrent Unit [GRU] recurrent network uses an update, reset, and new gate to compute the output state that rolls into the output across the temporal sequence of the network.

enum MLGruWeightLayout {
  "zrn",  // update-reset-new gate ordering
  "rzn"   // reset-update-new gate ordering
};

enum MLRecurrentNetworkActivation {
  "relu",
  "sigmoid",
  "tanh"
};

enum MLRecurrentNetworkDirection {
  "forward",
  "backward",
  "both"
};

dictionary MLGruOptions : MLOperatorOptions {
  MLOperand bias;
  MLOperand recurrentBias;
  MLOperand initialHiddenState;
  boolean resetAfter = true;
  boolean returnSequence = false;
  MLRecurrentNetworkDirection direction = "forward";
  MLGruWeightLayout layout = "zrn";
  sequence<MLRecurrentNetworkActivation> activations;
};

partial interface MLGraphBuilder {
  sequence<MLOperand> gru(MLOperand input,
                          MLOperand weight,
                          MLOperand recurrentWeight,
                          [EnforceRange] unsigned long steps,
                          [EnforceRange] unsigned long hiddenSize,
                          optional MLGruOptions options = {});
};

dictionary MLGruSupportLimits {
  MLTensorLimits input;
  MLTensorLimits weight;
  MLTensorLimits recurrentWeight;
  MLTensorLimits bias;
  MLTensorLimits recurrentBias;
  MLTensorLimits initialHiddenState;
  MLDataTypeLimits outputs;
};

partial dictionary MLOpSupportLimits {
  MLGruSupportLimits gru;
};
MLGruOptions has the following members:
- bias, of type MLOperand
  The 2-D input bias tensor of shape [numDirections, 3 * hiddenSize]. The ordering of the bias vectors in the second dimension of the tensor shape is specified according to layout.
- recurrentBias, of type MLOperand
  The 2-D recurrent bias tensor of shape [numDirections, 3 * hiddenSize]. The ordering of the bias vectors in the second dimension of the tensor shape is specified according to layout.
- initialHiddenState, of type MLOperand
  The 3-D initial hidden state tensor of shape [numDirections, batchSize, hiddenSize]. When not specified, implementations must use a tensor filled with zero.
- resetAfter, of type boolean, defaulting to true
  Indicates whether to apply the reset gate after or before matrix multiplication.
- returnSequence, of type boolean, defaulting to false
  Indicates whether to also return the entire sequence with every output from each time step in it in addition to the output of the last time step.
- direction, of type MLRecurrentNetworkDirection, defaulting to "forward"
  The processing direction of the input sequence. When set to "both", the size of the first dimension of the weight and the bias tensor shapes must be 2, and the input is processed in both directions.
- layout, of type MLGruWeightLayout, defaulting to "zrn"
  The ordering of the weight and bias vectors for the internal gates of GRU, specifically the update (z), reset (r), and new (n) gate, as indicated in the second dimension of the weight and bias tensor shape.
- activations, of type sequence<MLRecurrentNetworkActivation>
  Specifies a pair of activation functions with the first function used for the update and reset gate, and the second used for the new gate. When not specified, defaults to the "sigmoid" and "tanh" functions, respectively.

Arguments:
- input: an MLOperand. The input 3-D tensor of shape [steps, batchSize, inputSize].
- weight: an MLOperand. The 3-D input weight tensor of shape [numDirections, 3 * hiddenSize, inputSize]. The ordering of the weight vectors in the second dimension of the tensor shape is specified according to layout.
- recurrentWeight: an MLOperand. The 3-D recurrent weight tensor of shape [numDirections, 3 * hiddenSize, hiddenSize]. The ordering of the weight vectors in the second dimension of the tensor shape is specified according to layout.
- steps: an unsigned long scalar. The number of time steps in the recurrent network. The value must be greater than 0.
- hiddenSize: an unsigned long scalar. The value of the third dimension of the cell output tensor shape. It indicates the number of features in the hidden state.
- options: an optional MLGruOptions. The optional parameters of the operation.
Returns: sequence<MLOperand>. The first element is a 3-D tensor of shape [numDirections, batchSize, hiddenSize], the cell output from the last time step of the network. Additionally, if returnSequence is set to true, the second element is the 4-D output tensor of shape [steps, numDirections, batchSize, hiddenSize] containing the cell output from every time step in the temporal sequence.
operand | allowed data types | allowed ranks |
---|---|---|
input | "float32", "float16" | 3 |
weight | same as input | 3 |
recurrentWeight | same as input | 3 |
bias | same as input | 2 |
recurrentBias | same as input | 2 |
initialHiddenState | same as input | 3 |
outputs[0] | same as input | 3 |
outputs[1] if returnSequence is true | same as input | 4 |
MLGruSupportLimits has the following members:
- input, of type MLTensorLimits
  MLTensorLimits for input operand.
- weight, of type MLTensorLimits
  MLTensorLimits for weight operand.
- recurrentWeight, of type MLTensorLimits
  MLTensorLimits for recurrentWeight operand.
- bias, of type MLTensorLimits
  MLTensorLimits for bias operand.
- recurrentBias, of type MLTensorLimits
  MLTensorLimits for recurrentBias operand.
- initialHiddenState, of type MLTensorLimits
  MLTensorLimits for initialHiddenState operand.
- outputs, of type MLDataTypeLimits
  MLDataTypeLimits for all the output operands.

MLOpSupportLimits has the following member for gru():
- gru, of type MLGruSupportLimits
  Support limits for operator gru().
The gru(input, weight, recurrentWeight, steps, hiddenSize, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and any of input, weight, recurrentWeight, options.bias (if it exists), options.recurrentBias (if it exists), and options.initialHiddenState (if it exists) returns false, then throw a TypeError.
- If the dataType of any of input, weight or recurrentWeight is not one of its allowed data types (according to this table), then throw a TypeError.
- If the rank of any of input, weight or recurrentWeight is not its allowed rank, then throw a TypeError.
- If input’s shape[0] is not equal to steps, then throw a TypeError.
- Let batchSize be input’s shape[1].
- Let inputSize be input’s shape[2].
- Let numDirections be 2 if options.direction is "both", or 1 otherwise.
- If weight’s shape is not equal to « numDirections, 3 * hiddenSize, inputSize », then throw a TypeError.
- If recurrentWeight’s shape is not equal to « numDirections, 3 * hiddenSize, hiddenSize », then throw a TypeError.
- If hiddenSize * 6 is not a valid dimension, then throw a TypeError.
  Why hiddenSize * 6? Some underlying platforms operate on a single bias tensor which is a concatenation of bias and recurrentBias. Therefore, 3 * hiddenSize + 3 * hiddenSize also needs to be a valid dimension.
- If options.bias exists, then:
  - If its dataType is not one of its allowed data types (according to this table), then throw a TypeError.
  - If its shape is not equal to « numDirections, 3 * hiddenSize », then throw a TypeError.
- If options.recurrentBias exists, then:
  - If its dataType is not one of its allowed data types (according to this table), then throw a TypeError.
  - If its shape is not equal to « numDirections, 3 * hiddenSize », then throw a TypeError.
- If options.initialHiddenState exists, then:
  - If its dataType is not one of its allowed data types (according to this table), then throw a TypeError.
  - If its shape is not equal to « numDirections, batchSize, hiddenSize », then throw a TypeError.
- If options.activations exists, then:
  - Let activations be a clone of options.activations.
- Otherwise:
  - Let activations be « "sigmoid", "tanh" ».
- Calculate the output shape:
  - Let desc0 be the result of creating an MLOperandDescriptor given input’s dataType and « numDirections, batchSize, hiddenSize ».
  - If options.returnSequence is true, then:
    - Let desc1 be the result of creating an MLOperandDescriptor given input’s dataType and « steps, numDirections, batchSize, hiddenSize ».
- Make graph connections:
  - Let operator be an operator for the "gru" operation, given weight, recurrentWeight, steps, hiddenSize and options.
  - Let output0 be the result of creating an MLOperand given this and desc0.
  - If options.returnSequence is true, then:
    - Let output1 be the result of creating an MLOperand given this and desc1.
    - Let output be the list « output0, output1 ».
    - Set output0.[[operator]] and output1.[[operator]] to operator.
  - Otherwise:
    - Let output be the list « output0 ».
    - Set output0.[[operator]] to operator.
  - Set operator’s inputs to input, weight, and recurrentWeight.
  - If options.bias exists, then add it to operator’s inputs.
  - If options.recurrentBias exists, then add it to operator’s inputs.
  - Set operator’s activation functions to a clone of activations.
  - Set operator’s output to output.
- Return output.
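The shape checks in the steps above all follow from four sizes and the direction. As an illustrative aid, the sketch below collects the expected shapes in one place; `gruExpectedShapes` is a hypothetical helper, not part of this API, and it only restates the shapes named in the argument descriptions and method steps.

```javascript
// Hypothetical helper: lists the tensor shapes that the gru() method steps
// expect, derived from the scalar sizes and the direction alone.
function gruExpectedShapes(
    steps, batchSize, inputSize, hiddenSize,
    {direction = 'forward', returnSequence = false} = {}) {
  const numDirections = direction === 'both' ? 2 : 1;
  const shapes = {
    input: [steps, batchSize, inputSize],
    weight: [numDirections, 3 * hiddenSize, inputSize],
    recurrentWeight: [numDirections, 3 * hiddenSize, hiddenSize],
    bias: [numDirections, 3 * hiddenSize],
    recurrentBias: [numDirections, 3 * hiddenSize],
    initialHiddenState: [numDirections, batchSize, hiddenSize],
    // outputs[0] is always present: the last time step's cell output.
    outputs: [[numDirections, batchSize, hiddenSize]],
  };
  if (returnSequence) {
    // outputs[1] holds the cell output of every time step.
    shapes.outputs.push([steps, numDirections, batchSize, hiddenSize]);
  }
  return shapes;
}

console.log(gruExpectedShapes(2, 3, 4, 5, {direction: 'both', returnSequence: true}));
```

Comparing a tensor's actual shape against these entries before calling gru() gives the same outcome as the TypeError checks, without building any graph.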
Using a squeeze() helper, the behavior of this operation can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function gru(
    builder, input, weight, recurrentWeight, steps, hiddenSize, options) {
  const batchSize = input.shape[1];
  const inputSize = input.shape[2];
  const direction = options.direction || 'forward';
  const numDirections = (direction == 'both' ? 2 : 1);
  let hiddenState = options.initialHiddenState;

  if (!hiddenState) {
    const desc = {
      dataType: 'float32',
      shape: [numDirections, batchSize, hiddenSize]
    };
    const totalSize = numDirections * batchSize * hiddenSize;
    hiddenState = builder.constant(desc, new Float32Array(totalSize).fill(0));
  }

  let currentWeight = [];
  let currentRecurrentWeight = [];
  let currentBias = [];
  let currentRecurrentBias = [];
  let forwardSequence = null;
  let backwardSequence = null;
  let outputHidden = null;

  for (let dir = 0; dir < numDirections; ++dir) {
    currentWeight.push(squeeze(
        builder,
        builder.slice(weight, [dir, 0, 0], [1, 3 * hiddenSize, inputSize])));
    currentRecurrentWeight.push(squeeze(
        builder,
        builder.slice(
            recurrentWeight, [dir, 0, 0], [1, 3 * hiddenSize, hiddenSize])));
    currentBias.push(
        options.bias ?
            (squeeze(
                builder,
                builder.slice(options.bias, [dir, 0], [1, 3 * hiddenSize]))) :
            null);
    currentRecurrentBias.push(
        options.recurrentBias ?
            (squeeze(
                builder,
                builder.slice(
                    options.recurrentBias, [dir, 0], [1, 3 * hiddenSize]))) :
            null);

    let currentHidden = squeeze(
        builder,
        builder.slice(hiddenState, [dir, 0, 0], [1, batchSize, hiddenSize]));

    for (let step = 0; step < steps; ++step) {
      const slice =
          (dir == 1 || direction == 'backward' ? steps - step - 1 : step);
      const currentInput = squeeze(
          builder,
          builder.slice(input, [slice, 0, 0], [1, batchSize, inputSize]));

      currentHidden = builder.gruCell(
          currentInput,
          currentWeight[dir],
          currentRecurrentWeight[dir],
          currentHidden,
          hiddenSize,
          {
            bias: currentBias[dir],
            recurrentBias: currentRecurrentBias[dir],
            resetAfter: options.resetAfter,
            layout: options.layout,
            activations: options.activations
          });

      if (options.returnSequence) {
        // Expand currentHidden of 2D([batchSize, hiddenSize])
        // to 4D([steps, numDirections, batchSize, hiddenSize])
        const expandedHiddenAs4D =
            builder.reshape(currentHidden, [1, 1, batchSize, hiddenSize]);

        if (direction == 'forward' || (dir == 0 && direction == 'both')) {
          forwardSequence = forwardSequence ?
              builder.concat([forwardSequence, expandedHiddenAs4D], 0) :
              expandedHiddenAs4D;
        } else if (
            direction == 'backward' || (dir == 1 && direction == 'both')) {
          backwardSequence = backwardSequence ?
              builder.concat([expandedHiddenAs4D, backwardSequence], 0) :
              expandedHiddenAs4D;
        }
      }
    }

    // Expand currentHidden of 2D([batchSize, hiddenSize])
    // to 3D([numDirections, batchSize, hiddenSize])
    const expandedHiddenAs3D =
        builder.reshape(currentHidden, [1, batchSize, hiddenSize]);
    outputHidden = outputHidden ?
        builder.concat([outputHidden, expandedHiddenAs3D], 0) :
        expandedHiddenAs3D;
  }

  if (options.returnSequence) {
    let outputSequence = null;
    if (direction == 'forward') {
      outputSequence = forwardSequence;
    } else if (direction == 'backward') {
      outputSequence = backwardSequence;
    } else if (direction == 'both') {
      // Concat along axis 1 (numDirections dimension)
      outputSequence = builder.concat([forwardSequence, backwardSequence], 1);
    }
    return [outputHidden, outputSequence];
  } else {
    return [outputHidden];
  }
}
7.9.26. gruCell
A single time step of the Gated Recurrent Unit [GRU] recurrent network using an update gate and a reset gate to compute the hidden state that rolls into the output across the temporal sequence of a recurrent network.

dictionary MLGruCellOptions : MLOperatorOptions {
  MLOperand bias;
  MLOperand recurrentBias;
  boolean resetAfter = true;
  MLGruWeightLayout layout = "zrn";
  sequence<MLRecurrentNetworkActivation> activations;
};

partial interface MLGraphBuilder {
  MLOperand gruCell(MLOperand input,
                    MLOperand weight,
                    MLOperand recurrentWeight,
                    MLOperand hiddenState,
                    [EnforceRange] unsigned long hiddenSize,
                    optional MLGruCellOptions options = {});
};

dictionary MLGruCellSupportLimits {
  MLTensorLimits input;
  MLTensorLimits weight;
  MLTensorLimits recurrentWeight;
  MLTensorLimits hiddenState;
  MLTensorLimits bias;
  MLTensorLimits recurrentBias;
  MLDataTypeLimits output;
};

partial dictionary MLOpSupportLimits {
  MLGruCellSupportLimits gruCell;
};
MLGruCellOptions has the following members:
- bias, of type MLOperand
  The 1-D input bias tensor of shape [3 * hiddenSize]. The ordering of the bias vectors in the first dimension of the tensor shape is specified according to layout.
- recurrentBias, of type MLOperand
  The 1-D recurrent bias tensor of shape [3 * hiddenSize]. The ordering of the bias vectors in the first dimension of the tensor shape is specified according to layout.
- resetAfter, of type boolean, defaulting to true
  Indicates whether to apply the reset gate after or before matrix multiplication.
- layout, of type MLGruWeightLayout, defaulting to "zrn"
  The ordering of the weight and bias vectors for the internal gates of GRU, specifically the update (z), reset (r), and new (n) gate, as indicated in the first dimension of the weight and bias tensor shape.
- activations, of type sequence<MLRecurrentNetworkActivation>
  Specifies a pair of activation functions with the first function used for the update and reset gate, and the second used for the new gate. When not specified, defaults to the "sigmoid" and "tanh" functions, respectively.

Arguments:
- input: an MLOperand. The input 2-D tensor of shape [batchSize, inputSize].
- weight: an MLOperand. The 2-D input weight tensor of shape [3 * hiddenSize, inputSize]. The ordering of the weight vectors in the first dimension of the tensor shape is specified according to layout.
- recurrentWeight: an MLOperand. The 2-D recurrent weight tensor of shape [3 * hiddenSize, hiddenSize]. The ordering of the weight vectors in the first dimension of the tensor shape is specified according to layout.
- hiddenState: an MLOperand. The 2-D input hidden state tensor of shape [batchSize, hiddenSize].
- hiddenSize: an unsigned long scalar. The value of the second dimension of the output tensor shape. It indicates the number of features in the hidden state.
- options: an optional MLGruCellOptions. The optional parameters of the operation.
Returns: an MLOperand. The 2-D tensor of shape [batchSize, hiddenSize], the cell output hidden state of a single time step of the recurrent network.
operand | allowed data types | allowed ranks |
---|---|---|
input | "float32", "float16" | 2 |
weight | same as input | 2 |
recurrentWeight | same as input | 2 |
hiddenState | same as input | 2 |
bias | same as input | 1 |
recurrentBias | same as input | 1 |
output | same as input | 2 |
MLGruCellSupportLimits has the following members:
- input, of type MLTensorLimits
  MLTensorLimits for input operand.
- weight, of type MLTensorLimits
  MLTensorLimits for weight operand.
- recurrentWeight, of type MLTensorLimits
  MLTensorLimits for recurrentWeight operand.
- hiddenState, of type MLTensorLimits
  MLTensorLimits for hiddenState operand.
- bias, of type MLTensorLimits
  MLTensorLimits for bias operand.
- recurrentBias, of type MLTensorLimits
  MLTensorLimits for recurrentBias operand.
- output, of type MLDataTypeLimits
  MLDataTypeLimits for output operand.

MLOpSupportLimits has the following member for gruCell():
- gruCell, of type MLGruCellSupportLimits
  Support limits for operator gruCell().
The gruCell(input, weight, recurrentWeight, hiddenState, hiddenSize, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and any of input, weight, recurrentWeight, hiddenState, options.bias (if it exists), and options.recurrentBias (if it exists) returns false, then throw a TypeError.
- If the dataType of any of input, weight, recurrentWeight, or hiddenState is not one of its allowed data types (according to this table), then throw a TypeError.
- If the rank of any of input, weight, recurrentWeight or hiddenState is not its allowed rank (according to this table), then throw a TypeError.
- Let batchSize be input’s shape[0].
- Let inputSize be input’s shape[1].
- If weight’s shape is not equal to « 3 * hiddenSize, inputSize », then throw a TypeError.
- If recurrentWeight’s shape is not equal to « 3 * hiddenSize, hiddenSize », then throw a TypeError.
- If hiddenState’s shape is not equal to « batchSize, hiddenSize », then throw a TypeError.
- If hiddenSize * 6 is not a valid dimension, then throw a TypeError.
  Why hiddenSize * 6? Some underlying platforms operate on a single bias tensor which is a concatenation of bias and recurrentBias. Therefore, 3 * hiddenSize + 3 * hiddenSize also needs to be a valid dimension.
- If options.bias exists, then:
  - If its dataType is not one of its allowed data types (according to this table), then throw a TypeError.
  - If its shape is not equal to « 3 * hiddenSize », then throw a TypeError.
- If options.recurrentBias exists, then:
  - If its dataType is not one of its allowed data types (according to this table), then throw a TypeError.
  - If its shape is not equal to « 3 * hiddenSize », then throw a TypeError.
- If options.activations exists, then:
  - Let activations be a clone of options.activations.
- Otherwise:
  - Let activations be « "sigmoid", "tanh" ».
- Let desc be the result of creating an MLOperandDescriptor given input’s dataType and « batchSize, hiddenSize ».
- Make graph connections:
  - Let output be the result of creating an MLOperand given this and desc.
  - Let operator be an operator for the "gruCell" operation, given weight, recurrentWeight, hiddenState, hiddenSize and options.
  - Set output.[[operator]] to operator.
  - Set operator’s inputs to input, weight, recurrentWeight, and hiddenState.
  - If options.bias exists, then add it to operator’s inputs.
  - If options.recurrentBias exists, then add it to operator’s inputs.
  - Set operator’s activation functions to a clone of activations.
  - Set operator’s output to output.
- Return output.
The behavior of this operation when the weight layout is the default "zrn" layout, and the activation functions of the update/reset gate and new gate are sigmoid() and tanh() respectively, can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function gruCell(
    builder, input, weight, recurrentWeight, hiddenState, hiddenSize, options) {
  const one = builder.constant(input.dataType, 1);
  const zero = builder.constant(input.dataType, 0);
  const inputSize = input.shape[1];

  // update gate (z)
  let z = builder.sigmoid(builder.add(
      builder.add(
          (options.bias ? builder.slice(options.bias, [0], [hiddenSize]) :
                          zero),
          (options.recurrentBias ?
               builder.slice(options.recurrentBias, [0], [hiddenSize]) :
               zero)),
      builder.add(
          builder.matmul(
              input,
              builder.transpose(
                  builder.slice(weight, [0, 0], [hiddenSize, inputSize]))),
          builder.matmul(
              hiddenState,
              builder.transpose(builder.slice(
                  recurrentWeight, [0, 0], [hiddenSize, hiddenSize]))))));

  // reset gate (r)
  let r = builder.sigmoid(builder.add(
      builder.add(
          (options.bias ?
               builder.slice(options.bias, [hiddenSize], [hiddenSize]) :
               zero),
          (options.recurrentBias ?
               builder.slice(
                   options.recurrentBias, [hiddenSize], [hiddenSize]) :
               zero)),
      builder.add(
          builder.matmul(
              input,
              builder.transpose(builder.slice(
                  weight, [hiddenSize, 0], [hiddenSize, inputSize]))),
          builder.matmul(
              hiddenState,
              builder.transpose(builder.slice(
                  recurrentWeight, [hiddenSize, 0],
                  [hiddenSize, hiddenSize]))))));

  // new gate (n)
  let n;
  if (options.resetAfter) {
    n = builder.tanh(builder.add(
        (options.bias ?
             builder.slice(options.bias, [2 * hiddenSize], [hiddenSize]) :
             zero),
        builder.add(
            builder.matmul(
                input,
                builder.transpose(builder.slice(
                    weight, [2 * hiddenSize, 0], [hiddenSize, inputSize]))),
            builder.mul(
                r,
                builder.add(
                    (options.recurrentBias ?
                         builder.slice(
                             options.recurrentBias, [2 * hiddenSize],
                             [hiddenSize]) :
                         zero),
                    builder.matmul(
                        hiddenState,
                        builder.transpose(builder.slice(
                            recurrentWeight, [2 * hiddenSize, 0],
                            [hiddenSize, hiddenSize]))))))));
  } else {
    n = builder.tanh(builder.add(
        builder.add(
            (options.bias ?
                 builder.slice(options.bias, [2 * hiddenSize], [hiddenSize]) :
                 zero),
            (options.recurrentBias ?
                 builder.slice(
                     options.recurrentBias, [2 * hiddenSize], [hiddenSize]) :
                 zero)),
        builder.add(
            builder.matmul(
                input,
                builder.transpose(builder.slice(
                    weight, [2 * hiddenSize, 0], [hiddenSize, inputSize]))),
            builder.matmul(
                builder.mul(r, hiddenState),
                builder.transpose(builder.slice(
                    recurrentWeight, [2 * hiddenSize, 0],
                    [hiddenSize, hiddenSize]))))));
  }

  // compute the new hidden state
  return builder.add(
      builder.mul(z, hiddenState), builder.mul(n, builder.sub(one, z)));
}
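The same arithmetic can be traced on plain numbers for a single sample (batchSize of 1). The sketch below is a simplified reading of the decomposition above, assuming the "zrn" layout with sigmoid and tanh activations; `gruCellReference` and `matVecRows` are hypothetical names, vectors are plain arrays, and the weights are nested arrays of shape [3 * hiddenSize, inputSize] and [3 * hiddenSize, hiddenSize].

```javascript
const sigmoid = (x) => 1 / (1 + Math.exp(-x));

// Multiply rows [start, start + count) of matrix `m` by vector `v`.
function matVecRows(m, v, start, count) {
  const out = [];
  for (let i = 0; i < count; ++i) {
    out.push(m[start + i].reduce((sum, w, k) => sum + w * v[k], 0));
  }
  return out;
}

// One GRU time step for a single sample, "zrn" layout, sigmoid/tanh gates.
function gruCellReference(
    x, weight, recurrentWeight, h, hiddenSize,
    {bias, recurrentBias, resetAfter = true} = {}) {
  const H = hiddenSize;
  const b = bias || new Array(3 * H).fill(0);
  const rb = recurrentBias || new Array(3 * H).fill(0);
  // Gate blocks are stacked along the first dimension: z, then r, then n.
  const fromInput = (gate) => matVecRows(weight, x, gate * H, H);
  const fromHidden = (gate, v) => matVecRows(recurrentWeight, v, gate * H, H);

  const zi = fromInput(0), zh = fromHidden(0, h);
  const z = zi.map((v, i) => sigmoid(b[i] + rb[i] + v + zh[i]));
  const ri = fromInput(1), rh = fromHidden(1, h);
  const r = ri.map((v, i) => sigmoid(b[H + i] + rb[H + i] + v + rh[i]));

  const ni = fromInput(2);
  let n;
  if (resetAfter) {
    // Reset gate applied after the recurrent matrix multiplication.
    const nh = fromHidden(2, h);
    n = ni.map(
        (v, i) => Math.tanh(b[2 * H + i] + v + r[i] * (rb[2 * H + i] + nh[i])));
  } else {
    // Reset gate applied to the hidden state before the multiplication.
    const nh = fromHidden(2, h.map((v, i) => r[i] * v));
    n = ni.map((v, i) => Math.tanh(b[2 * H + i] + rb[2 * H + i] + v + nh[i]));
  }
  // New hidden state: z * h + (1 - z) * n.
  return h.map((v, i) => z[i] * v + (1 - z[i]) * n[i]);
}

console.log(gruCellReference([1], [[0], [0], [1]], [[0], [0], [0]], [0], 1));
```

With all-zero weights the update gate is 0.5 and the new gate is 0, so each step halves the hidden state, which is a convenient sanity check.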
7.9.27. hardSigmoid
Calculate the non-smooth hard sigmoid function on the input tensor, used instead of the sigmoid function for faster computation.

dictionary MLHardSigmoidOptions : MLOperatorOptions {
  double alpha = 0.2;
  double beta = 0.5;
};

partial interface MLGraphBuilder {
  MLOperand hardSigmoid(MLOperand input, optional MLHardSigmoidOptions options = {});
};

partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits hardSigmoid;
};
MLHardSigmoidOptions has the following members:
- alpha, of type double, defaulting to 0.2
  A scalar multiplier.
- beta, of type double, defaulting to 0.5
  A scalar addition.

Arguments:
- input: an MLOperand. The input tensor.
- options: an optional MLHardSigmoidOptions. The optional parameters of the operation.
Returns:
operand | allowed data types | allowed ranks |
---|---|---|
input | "float32", "float16" | N |
output | same as input | same as input |

MLOpSupportLimits has the following member for hardSigmoid():
- hardSigmoid, of type MLSingleInputSupportLimits
  Support limits for operator hardSigmoid().
The hardSigmoid(input, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- If input’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- Set options.alpha to the result of casting options.alpha to input’s dataType.
- Set options.beta to the result of casting options.beta to input’s dataType.
- Make graph connections:
  - Let output be the result of copying an MLOperand given input.
  - Let operator be an operator for the "hardSigmoid" operation, given options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
The behavior of this operation can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function hardSigmoid(builder, input, options) {
  return builder.max(
      builder.min(
          builder.add(
              builder.mul(
                  builder.constant(input.dataType, options.alpha), input),
              builder.constant(input.dataType, options.beta)),
          builder.constant(input.dataType, 1)),
      builder.constant(input.dataType, 0));
}
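Read element-wise, the decomposition above computes max(0, min(1, alpha * x + beta)). A scalar sketch in plain JavaScript makes the clamping visible; `hardSigmoidScalar` is a hypothetical name, and the defaults mirror those of MLHardSigmoidOptions.

```javascript
// Scalar form of the hardSigmoid decomposition:
// max(0, min(1, alpha * x + beta)).
function hardSigmoidScalar(x, {alpha = 0.2, beta = 0.5} = {}) {
  return Math.max(0, Math.min(1, alpha * x + beta));
}

for (const x of [-5, -2.5, 0, 2.5, 5]) {
  console.log(x, hardSigmoidScalar(x));
}
```

With the default alpha and beta, the output is linear for x between -2.5 and 2.5 and saturates at 0 and 1 outside that interval.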
7.9.28. hardSwish
Computes the nonlinear function y = x * max(0, min(6, (x + 3))) / 6 that is introduced by [MobileNetV3] on the input tensor element-wise.
partial interface MLGraphBuilder {
  MLOperand hardSwish(MLOperand input, optional MLOperatorOptions options = {});
};

partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits hardSwish;
};
Arguments:
- input: an MLOperand. The input tensor.
- options: an MLOperatorOptions. Specifies the optional parameters of the operation.
Returns: an MLOperand. The output tensor of the same shape as input.
operand | allowed data types | allowed ranks |
---|---|---|
input | "float32", "float16" | N |
output | same as input | same as input |
MLOpSupportLimits has the following member for hardSwish():
- hardSwish, of type MLSingleInputSupportLimits
  Support limits for operator hardSwish().
The hardSwish(input, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- If input’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- Make graph connections:
  - Let output be the result of copying an MLOperand given input.
  - Let operator be an operator for the "hardSwish" operation, given options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
The behavior of this operation can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function hardSwish(builder, input, options) {
  return builder.div(
    builder.mul(
      input,
      builder.max(
        builder.constant(input.dataType, 0),
        builder.min(
          builder.constant(input.dataType, 6),
          builder.add(input, builder.constant(input.dataType, 3))))),
    builder.constant(input.dataType, 6));
}
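As a quick numerical check of the expression y = x * max(0, min(6, (x + 3))) / 6, the sketch below evaluates it on an ordinary JavaScript array. It is only an illustration and does not call the WebNN API. The helper name `hardSwishRef` is hypothetical.

```javascript
// Plain-array reference for the hardSwish expression; not a WebNN call.
// `hardSwishRef` is a hypothetical helper name used only for illustration.
function hardSwishRef(values) {
  return values.map((x) => (x * Math.max(0, Math.min(6, x + 3))) / 6);
}

console.log(hardSwishRef([-4, 0, 1, 3, 10]));
```

Inputs at or below -3 map to zero, inputs at or above 3 pass through unchanged, and values in between are scaled smoothly by (x + 3) / 6.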
7.9.29. instanceNormalization
Normalize the input using [Instance-Normalization]. Unlike batchNormalization(), where the mean and variance values used in the normalization are computed across all the samples in the batch dimension while the model is trained, the mean and variance values used in the instance normalization are computed on the fly for each input feature of each individual sample in the batch.
dictionary MLInstanceNormalizationOptions : MLOperatorOptions {
  MLOperand scale;
  MLOperand bias;
  double epsilon = 1e-5;
  MLInputOperandLayout layout = "nchw";
};

partial interface MLGraphBuilder {
  MLOperand instanceNormalization(MLOperand input,
                                  optional MLInstanceNormalizationOptions options = {});
};

dictionary MLNormalizationSupportLimits {
  MLTensorLimits input;
  MLTensorLimits scale;
  MLTensorLimits bias;
  MLDataTypeLimits output;
};

partial dictionary MLOpSupportLimits {
  MLNormalizationSupportLimits instanceNormalization;
};
MLInstanceNormalizationOptions has the following members:
- scale, of type MLOperand
  The 1-D tensor of the scaling values whose size is equal to the number of channels, i.e. the size of the feature dimension of the input. For example, for an input tensor with "nchw" layout, the size is equal to input’s shape[1].
- bias, of type MLOperand
  The 1-D tensor of the bias values whose size is equal to the size of the feature dimension of the input. For example, for an input tensor with "nchw" layout, the size is equal to input’s shape[1].
- epsilon, of type double, defaulting to 1e-5
  A small value to prevent computational error due to divide-by-zero.
- layout, of type MLInputOperandLayout, defaulting to "nchw"
  The layout format of the input.
Arguments:
- input: an MLOperand. The input 4-D tensor.
- options: an optional MLInstanceNormalizationOptions. The optional parameters of the operation.
Returns: an MLOperand. The instance-normalized 4-D tensor of the same shape as input.
operand | allowed data types | allowed ranks |
---|---|---|
input | "float32", "float16" | 4 |
scale | same as input | 1 |
bias | same as input | 1 |
output | same as input | 4 |
MLNormalizationSupportLimits has the following members:
- input, of type MLTensorLimits
  MLTensorLimits for input operand.
- scale, of type MLTensorLimits
  MLTensorLimits for scale operand.
- bias, of type MLTensorLimits
  MLTensorLimits for bias operand.
- output, of type MLDataTypeLimits
  MLDataTypeLimits for output operand.
MLOpSupportLimits has the following member for instanceNormalization():
- instanceNormalization, of type MLNormalizationSupportLimits
  Support limits for operator instanceNormalization().
The instanceNormalization(input, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and any of input, options.scale (if it exists), and options.bias (if it exists) returns false, then throw a TypeError.
- If input’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- If input’s rank is not its allowed rank, then throw a TypeError.
- Set options.epsilon to the result of casting options.epsilon to input’s dataType.
- Let axis be 1 if options.layout is "nchw", and 3 otherwise.
- If options.scale exists, then:
  - If its dataType is not one of its allowed data types (according to this table), then throw a TypeError.
  - If its shape is not equal to « input’s shape[axis] », then throw a TypeError.
- If options.bias exists, then:
  - If its dataType is not one of its allowed data types (according to this table), then throw a TypeError.
  - If its shape is not equal to « input’s shape[axis] », then throw a TypeError.
- Make graph connections:
  - Let output be the result of copying an MLOperand given input.
  - Let operator be an operator for the "instanceNormalization" operation, given options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - If options.scale exists, then add it to operator’s inputs.
  - If options.bias exists, then add it to operator’s inputs.
  - Set operator’s output to output.
- Return output.
The behavior of this operation when the input tensor is 4-D of the "nchw" layout can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function instanceNormalization(builder, input, options) {
  // The reduction of the mean and variance values happens over the spatial
  // dimensions of the input e.g. axis 2 and 3 of the input tensor.
  const reduceOptions = {axes: [2, 3], keepDimensions: true};
  const mean = builder.reduceMean(input, reduceOptions);
  const variance = builder.reduceMean(
    builder.pow(
      builder.sub(input, mean),
      builder.constant(input.dataType, 2)),
    reduceOptions);

  // The scale and bias values are applied per input feature
  // e.g. axis 1 of the input tensor.
  const shape = [1, input.shape[1], 1, 1];
  // options.epsilon is a number, so it is wrapped in a constant operand
  // before being added to the variance operand.
  return builder.add(
    builder.mul(
      builder.reshape(options.scale, shape),
      builder.div(
        builder.sub(input, mean),
        builder.sqrt(
          builder.add(
            variance,
            builder.constant(input.dataType, options.epsilon))))),
    builder.reshape(options.bias, shape));
}
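The same computation can be sketched on a flat typed array to show which elements share a mean and variance. This is an illustration under stated assumptions, not part of the API: the helper name `instanceNormRef` and its calling convention (a flat Float32Array in "nchw" order plus an explicit shape) are hypothetical, and scale and bias are required here for brevity.

```javascript
// Plain-array reference for instance normalization over an "nchw" tensor.
// `instanceNormRef` and its flat-array calling convention are hypothetical.
function instanceNormRef(data, [n, c, h, w], scale, bias, epsilon = 1e-5) {
  const out = new Float32Array(data.length);
  const plane = h * w; // number of spatial elements per (sample, channel) pair
  for (let i = 0; i < n * c; ++i) {
    const channel = i % c;
    const start = i * plane;
    // Mean over the spatial dimensions of this sample and channel.
    let mean = 0;
    for (let j = 0; j < plane; ++j) mean += data[start + j];
    mean /= plane;
    // Variance over the same elements.
    let variance = 0;
    for (let j = 0; j < plane; ++j) variance += (data[start + j] - mean) ** 2;
    variance /= plane;
    const invStd = 1 / Math.sqrt(variance + epsilon);
    for (let j = 0; j < plane; ++j) {
      out[start + j] =
        scale[channel] * (data[start + j] - mean) * invStd + bias[channel];
    }
  }
  return out;
}

const normalized = instanceNormRef(
  new Float32Array([1, 2, 3, 4, 10, 10, 10, 10]), [1, 2, 2, 2], [1, 2], [0, 5]);
console.log(Array.from(normalized));
```

Each (sample, channel) plane is normalized independently, which is why the second channel, whose values are all equal, collapses to its bias value.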
7.9.30. layerNormalization
Normalize the input using [Layer-Normalization]. Unlike batchNormalization(), where the mean and variance values are computed across all the samples in the batch dimension while the model is trained, and instanceNormalization(), where the mean and variance values are computed on the fly for each input feature of each individual sample in the batch, the mean and variance values of the layer normalization are computed on the fly across all the input features of each individual sample in the batch.
dictionary MLLayerNormalizationOptions : MLOperatorOptions {
  MLOperand scale;
  MLOperand bias;
  sequence<[EnforceRange] unsigned long> axes;
  double epsilon = 1e-5;
};

partial interface MLGraphBuilder {
  MLOperand layerNormalization(MLOperand input,
                               optional MLLayerNormalizationOptions options = {});
};

partial dictionary MLOpSupportLimits {
  MLNormalizationSupportLimits layerNormalization;
};
MLLayerNormalizationOptions has the following members:
- scale, of type MLOperand
  The N-D tensor of the scaling values whose shape is determined by the axes member in that each value in axes indicates the dimension of the input tensor with scaling values. For example, for an axes value of [1,2,3], the shape of this tensor is the list of the corresponding sizes of the input dimensions 1, 2 and 3. When this member is not present, the scaling value is assumed to be 1.
- bias, of type MLOperand
  The N-D tensor of the bias values whose shape is determined by the axes member in that each value in axes indicates the dimension of the input tensor with bias values. For example, for an axes value of [1,2,3], the shape of this tensor is the list of the corresponding sizes of the input dimensions 1, 2 and 3. When this member is not present, the bias value is assumed to be 0.
- axes, of type sequence<[EnforceRange] unsigned long>
  The indices to the input dimensions to reduce. When this member is not present, it is treated as if all dimensions except the first were given (e.g. for a 4-D input tensor, axes = [1,2,3]). That is, the reduction for the mean and variance values is calculated across all the input features for each independent batch. If empty, no dimensions are reduced.
- epsilon, of type double, defaulting to 1e-5
  A small value to prevent computational error due to divide-by-zero.
Arguments:
- input: an MLOperand. The input N-D tensor.
- options: an optional MLLayerNormalizationOptions. The optional parameters of the operation.
Returns: an MLOperand. The layer-normalized N-D tensor of the same shape as input.
operand | allowed data types | allowed ranks |
---|---|---|
input | "float32", "float16" | N |
scale | same as input | 0 to input’s rank |
bias | same as input | 0 to input’s rank |
output | same as input | same as input |
MLOpSupportLimits has the following member for layerNormalization():
- layerNormalization, of type MLNormalizationSupportLimits
  Support limits for operator layerNormalization().
The layerNormalization(input, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and any of input, options.scale (if it exists), and options.bias (if it exists) returns false, then throw a TypeError.
- If input’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- If options.axes does not exist, then set options.axes to a new list, either the range from 1 to input’s rank, exclusive, if input’s rank is greater than 1, or an empty list otherwise.
- Otherwise, if options.axes contains duplicate values, or if any of its items is not in the range 0 to input’s rank, exclusive, then throw a TypeError.
- Set options.epsilon to the result of casting options.epsilon to input’s dataType.
- If options.scale exists, then:
  - If its dataType is not one of its allowed data types (according to this table), then throw a TypeError.
  - If its rank is not equal to options.axes’s size, then throw a TypeError.
- If options.bias exists, then:
  - If its dataType is not one of its allowed data types (according to this table), then throw a TypeError.
  - If its rank is not equal to options.axes’s size, then throw a TypeError.
- For each index in the range 0 to options.axes’s size, exclusive:
- Make graph connections:
  - Let output be the result of copying an MLOperand given input.
  - Let operator be an operator for the "layerNormalization" operation, given options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - If options.scale exists, then add it to operator’s inputs.
  - If options.bias exists, then add it to operator’s inputs.
  - Set operator’s output to output.
- Return output.
The behavior of this operation when the axes parameter is set to [1,2,3] can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function layerNormalization(builder, input, options) {
  // The reduction of the mean and variance values happens over the spatial
  // dimensions across all the input features (i.e. all channels) of the input
  // tensor.
  const reduceOptions = {axes: [1, 2, 3], keepDimensions: true};
  const mean = builder.reduceMean(input, reduceOptions);
  const variance = builder.reduceMean(
    builder.pow(
      builder.sub(input, mean),
      builder.constant(input.dataType, 2)),
    reduceOptions);

  // The scale and bias tensors are of the shape of the input
  // specified by the values in the axes parameter (i.e. [1,2,3]).
  // options.epsilon is a number, so it is wrapped in a constant operand
  // before being added to the variance operand.
  return builder.add(
    builder.mul(
      options.scale,
      builder.div(
        builder.sub(input, mean),
        builder.sqrt(
          builder.add(
            variance,
            builder.constant(input.dataType, options.epsilon))))),
    options.bias);
}
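For comparison with instance normalization, the sketch below applies layer normalization to a 2-D input of shape [batchSize, features], which corresponds to the default axes of [1] for a 2-D tensor. It is a plain JavaScript illustration, not a WebNN call. The helper name `layerNormRef` and the nested-array representation are hypothetical. When scale or bias is omitted, the sketch uses 1 and 0 respectively, following the member descriptions above.

```javascript
// Plain-array reference for layer normalization of a [batchSize, features]
// input, reducing over axis 1. `layerNormRef` is a hypothetical helper name.
function layerNormRef(rows, scale, bias, epsilon = 1e-5) {
  return rows.map((row) => {
    // Mean and variance are computed across all features of one sample.
    const mean = row.reduce((sum, x) => sum + x, 0) / row.length;
    const variance =
      row.reduce((sum, x) => sum + (x - mean) ** 2, 0) / row.length;
    const invStd = 1 / Math.sqrt(variance + epsilon);
    return row.map((x, i) => {
      const s = scale ? scale[i] : 1; // scaling value assumed to be 1 if absent
      const b = bias ? bias[i] : 0;   // bias value assumed to be 0 if absent
      return s * (x - mean) * invStd + b;
    });
  });
}

console.log(layerNormRef([[1, 3], [2, 2]]));
```

Each row is normalized on its own, so the statistics never mix values from different samples in the batch.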
7.9.31. leakyRelu
Calculate the leaky version of rectified linear function on the input tensor element-wise. The calculation follows the expression max(0, x) + alpha * min(0, x).
dictionary MLLeakyReluOptions : MLOperatorOptions {
  double alpha = 0.01;
};

partial interface MLGraphBuilder {
  MLOperand leakyRelu(MLOperand input, optional MLLeakyReluOptions options = {});
};

partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits leakyRelu;
};
MLLeakyReluOptions has the following members:
- alpha, of type double, defaulting to 0.01
  A scalar multiplier.
Arguments:
- input: an MLOperand. The input tensor.
- options: an optional MLLeakyReluOptions. The optional parameters of the operation.
Returns: an MLOperand. The output tensor of the same shape as input.
operand | allowed data types | allowed ranks |
---|---|---|
input | "float32", "float16" | N |
output | same as input | same as input |
MLOpSupportLimits has the following member for leakyRelu():
- leakyRelu, of type MLSingleInputSupportLimits
  Support limits for operator leakyRelu().
The leakyRelu(input, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- If input’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- Set options.alpha to the result of casting options.alpha to input’s dataType.
- Make graph connections:
  - Let output be the result of copying an MLOperand given input.
  - Let operator be an operator for the "leakyRelu" operation, given options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
The behavior of this operation can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function leakyRelu(builder, input, options) {
  return builder.add(
    builder.max(builder.constant(input.dataType, 0), input),
    builder.mul(
      builder.constant(input.dataType, options.alpha),
      builder.min(builder.constant(input.dataType, 0), input)));
}
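The expression max(0, x) + alpha * min(0, x) can be checked on ordinary numbers with the sketch below. It does not call the WebNN API, and the helper name `leakyReluRef` is hypothetical. The default of 0.01 mirrors the default of the alpha member described above.

```javascript
// Plain-array reference for the leakyRelu expression; not a WebNN call.
// `leakyReluRef` is a hypothetical helper name used only for illustration.
function leakyReluRef(values, alpha = 0.01) {
  return values.map((x) => Math.max(0, x) + alpha * Math.min(0, x));
}

console.log(leakyReluRef([-2, 0, 3]));
```

Positive inputs pass through unchanged, while negative inputs are scaled by alpha instead of being clamped to zero.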
7.9.32. linear
Calculate a linear function y = alpha * x + beta on the input tensor.
dictionary MLLinearOptions : MLOperatorOptions {
  double alpha = 1;
  double beta = 0;
};

partial interface MLGraphBuilder {
  MLOperand linear(MLOperand input, optional MLLinearOptions options = {});
};

partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits linear;
};
MLLinearOptions has the following members:
- alpha, of type double, defaulting to 1
  A scalar multiplier.
- beta, of type double, defaulting to 0
  A scalar addition.
Arguments:
- input: an MLOperand. The input tensor.
- options: an optional MLLinearOptions. The optional parameters of the operation.
Returns: an MLOperand. The output tensor of the same shape as input.
operand | allowed data types | allowed ranks |
---|---|---|
input | "float32", "float16" | N |
output | same as input | same as input |
MLOpSupportLimits has the following member for linear():
- linear, of type MLSingleInputSupportLimits
  Support limits for operator linear().
The linear(input, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- If input’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- Set options.alpha to the result of casting options.alpha to input’s dataType.
- Set options.beta to the result of casting options.beta to input’s dataType.
- Make graph connections:
  - Let output be the result of copying an MLOperand given input.
  - Let operator be an operator for the "linear" operation, given options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
The behavior of this operation can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function linear(builder, input, options) {
  return builder.add(
    builder.mul(input, builder.constant(input.dataType, options.alpha)),
    builder.constant(input.dataType, options.beta));
}
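A minimal element-wise sketch of y = alpha * x + beta on an ordinary array follows. It is only an illustration and does not call the WebNN API. The helper name `linearRef` is hypothetical, and its defaults mirror the defaults of the alpha and beta members described above.

```javascript
// Plain-array reference for the linear expression; not a WebNN call.
// `linearRef` is a hypothetical helper name used only for illustration.
function linearRef(values, { alpha = 1, beta = 0 } = {}) {
  return values.map((x) => alpha * x + beta);
}

console.log(linearRef([1, 2, 3], { alpha: 2, beta: 1 }));
```

With the default options the function is the identity, since alpha is 1 and beta is 0.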
7.9.33. lstm
Long Short-Term Memory [LSTM] recurrent network uses an input, output, forget, and cell gate to compute the output state that rolls into the output across the temporal sequence of the network.

enum MLLstmWeightLayout {
  "iofg", // input-output-forget-cell gate ordering
  "ifgo"  // input-forget-cell-output gate ordering
};

dictionary MLLstmOptions : MLOperatorOptions {
  MLOperand bias;
  MLOperand recurrentBias;
  MLOperand peepholeWeight;
  MLOperand initialHiddenState;
  MLOperand initialCellState;
  boolean returnSequence = false;
  MLRecurrentNetworkDirection direction = "forward";
  MLLstmWeightLayout layout = "iofg";
  sequence<MLRecurrentNetworkActivation> activations;
};

partial interface MLGraphBuilder {
  sequence<MLOperand> lstm(MLOperand input,
                           MLOperand weight,
                           MLOperand recurrentWeight,
                           [EnforceRange] unsigned long steps,
                           [EnforceRange] unsigned long hiddenSize,
                           optional MLLstmOptions options = {});
};

dictionary MLLstmSupportLimits {
  MLTensorLimits input;
  MLTensorLimits weight;
  MLTensorLimits recurrentWeight;
  MLTensorLimits bias;
  MLTensorLimits recurrentBias;
  MLTensorLimits peepholeWeight;
  MLTensorLimits initialHiddenState;
  MLTensorLimits initialCellState;
  MLDataTypeLimits outputs;
};

partial dictionary MLOpSupportLimits {
  MLLstmSupportLimits lstm;
};
MLLstmOptions has the following members:
- bias, of type MLOperand
  The 2-D input bias tensor of shape [numDirections, 4 * hiddenSize]. The ordering of the bias vectors in the second dimension of the tensor shape is specified according to layout.
- recurrentBias, of type MLOperand
  The 2-D recurrent bias tensor of shape [numDirections, 4 * hiddenSize]. The ordering of the bias vectors in the second dimension of the tensor shape is specified according to layout.
- peepholeWeight, of type MLOperand
  The 2-D weight tensor for peepholes of shape [numDirections, 3 * hiddenSize]. The pack ordering of the weight vectors is for the input (i), output (o), and forget (f) gate, respectively.
- initialHiddenState, of type MLOperand
  The 3-D initial hidden state tensor of shape [numDirections, batchSize, hiddenSize]. When not specified, implementations must use a tensor filled with zero.
- initialCellState, of type MLOperand
  The 3-D initial cell state tensor of shape [numDirections, batchSize, hiddenSize]. When not specified, implementations must use a tensor filled with zero.
- returnSequence, of type boolean, defaulting to false
  Indicates whether to also return the entire sequence with every output from each time step in it in addition to the output of the last time step.
- direction, of type MLRecurrentNetworkDirection, defaulting to "forward"
  The processing direction of the input sequence. When set to "both", the size of the first dimension of the weight and the bias tensor shapes must be 2, and the input is processed in both directions.
- layout, of type MLLstmWeightLayout, defaulting to "iofg"
  The ordering of the weight and bias vectors for the internal gates of LSTM, specifically the input (i), output (o), forget (f), and cell (g) gate, as indicated in the second dimension of the weight and bias tensor shapes.
- activations, of type sequence<MLRecurrentNetworkActivation>
  A list of three activation functions: the first is used for the input (i), forget (f), and output (o) gate, the second is used for the cell (g) gate, and the last is used for filtering the output cell state before combining it with the result of the output gate to form the output hidden state. When not specified, defaults to a sequence of the "sigmoid", "tanh", and "tanh" functions, respectively.
Arguments:
- input: an MLOperand. The input 3-D tensor of shape [steps, batchSize, inputSize].
- weight: an MLOperand. The 3-D input weight tensor of shape [numDirections, 4 * hiddenSize, inputSize]. The ordering of the weight vectors in the second dimension of the tensor shape is specified according to layout.
- recurrentWeight: an MLOperand. The 3-D recurrent weight tensor of shape [numDirections, 4 * hiddenSize, hiddenSize]. The ordering of the weight vectors in the second dimension of the tensor shape is specified according to layout.
- steps: an unsigned long scalar. The number of time steps in the recurrent network. The value must be greater than 0.
- hiddenSize: an unsigned long scalar. The value of the third dimension of the cell output tensor shape. It indicates the number of features in the hidden state.
- options: an optional MLLstmOptions. The optional parameters of the operation.
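To illustrate how the layout value relates to the 4 * hiddenSize dimension of the weight and bias tensors, the sketch below computes the index range each gate would occupy. It assumes that the four gates are packed as contiguous blocks of hiddenSize entries in the order named by the layout string, which is an interpretation of the enum comments rather than something stated step by step in the text. The helper name `lstmGateRanges` is hypothetical.

```javascript
// Hypothetical helper: returns the [start, end) index range of each gate
// inside the 4 * hiddenSize dimension for a given MLLstmWeightLayout value.
// Assumes gates are packed as contiguous blocks in the order the layout names.
function lstmGateRanges(layout, hiddenSize) {
  const orders = {
    iofg: ['i', 'o', 'f', 'g'], // input-output-forget-cell
    ifgo: ['i', 'f', 'g', 'o'], // input-forget-cell-output
  };
  const order = orders[layout];
  if (!order) {
    throw new TypeError(`unknown layout: ${layout}`);
  }
  const ranges = {};
  order.forEach((gate, index) => {
    ranges[gate] = [index * hiddenSize, (index + 1) * hiddenSize];
  });
  return ranges;
}

console.log(lstmGateRanges('iofg', 4));
console.log(lstmGateRanges('ifgo', 4));
```

Under this assumption, converting weights between the two layouts amounts to reordering these blocks along the second dimension.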
Returns: sequence<MLOperand>. The first element is a 3-D tensor of shape [numDirections, batchSize, hiddenSize], the output hidden state from the last time step of the network. The second element is a 3-D tensor of shape [numDirections, batchSize, hiddenSize], the output cell state from the last time step of the network. Additionally, if returnSequence is set to true, the third element is the 4-D output tensor of shape [steps, numDirections, batchSize, hiddenSize] containing every output from each time step in the temporal sequence.
operand | allowed data types | allowed ranks |
---|---|---|
input | "float32", "float16" | 3 |
weight | same as input | 3 |
recurrentWeight | same as input | 3 |
bias | same as input | 2 |
recurrentBias | same as input | 2 |
peepholeWeight | same as input | 2 |
initialHiddenState | same as input | 3 |
initialCellState | same as input | 3 |
outputs[0] | same as input | 3 |
outputs[1] | same as input | 3 |
outputs[2] if returnSequence is true | same as input | 4 |
MLLstmSupportLimits has the following members:
- input, of type MLTensorLimits
  MLTensorLimits for input operand.
- weight, of type MLTensorLimits
  MLTensorLimits for weight operand.
- recurrentWeight, of type MLTensorLimits
  MLTensorLimits for recurrentWeight operand.
- bias, of type MLTensorLimits
  MLTensorLimits for bias operand.
- recurrentBias, of type MLTensorLimits
  MLTensorLimits for recurrentBias operand.
- peepholeWeight, of type MLTensorLimits
  MLTensorLimits for peepholeWeight operand.
- initialHiddenState, of type MLTensorLimits
  MLTensorLimits for initialHiddenState operand.
- initialCellState, of type MLTensorLimits
  MLTensorLimits for initialCellState operand.
- outputs, of type MLDataTypeLimits
  MLDataTypeLimits for all the output operands.
MLOpSupportLimits has the following member for lstm():
- lstm, of type MLLstmSupportLimits
  Support limits for operator lstm().
The lstm(input, weight, recurrentWeight, steps, hiddenSize, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and any of input, weight, recurrentWeight, options.bias (if it exists), options.recurrentBias (if it exists), options.peepholeWeight (if it exists), options.initialHiddenState (if it exists), and options.initialCellState (if it exists) returns false, then throw a TypeError.
- Let numDirections be 2 if options.direction is "both", or 1 otherwise.
- If the dataType of any of input, weight or recurrentWeight is not one of its allowed data types (according to this table), then throw a TypeError.
- If the rank of any of input, weight or recurrentWeight is not its allowed rank, then throw a TypeError.
- If input’s shape[0] is not equal to steps, then throw a TypeError.
- Let batchSize be input’s shape[1].
- Let inputSize be input’s shape[2].
- If weight’s shape is not equal to « numDirections, 4 * hiddenSize, inputSize », then throw a TypeError.
- If recurrentWeight’s shape is not equal to « numDirections, 4 * hiddenSize, hiddenSize », then throw a TypeError.
- If hiddenSize * 8 is not a valid dimension, then throw a TypeError.
  Why hiddenSize * 8? Some underlying platforms operate on a single bias tensor which is a concatenation of bias and recurrentBias. Therefore, 4 * hiddenSize + 4 * hiddenSize also needs to be a valid dimension.
- If options.bias exists, then:
  - If its dataType is not one of its allowed data types (according to this table), then throw a TypeError.
  - If its shape is not equal to « numDirections, 4 * hiddenSize », then throw a TypeError.
- If options.recurrentBias exists, then:
  - If its dataType is not one of its allowed data types (according to this table), then throw a TypeError.
  - If its shape is not equal to « numDirections, 4 * hiddenSize », then throw a TypeError.
- If options.peepholeWeight exists, then:
  - If its dataType is not one of its allowed data types (according to this table), then throw a TypeError.
  - If its shape is not equal to « numDirections, 3 * hiddenSize », then throw a TypeError.
- If options.initialHiddenState exists, then:
  - If its dataType is not one of its allowed data types (according to this table), then throw a TypeError.
  - If its shape is not equal to « numDirections, batchSize, hiddenSize », then throw a TypeError.
- If options.initialCellState exists, then:
  - If its dataType is not one of its allowed data types (according to this table), then throw a TypeError.
  - If its shape is not equal to « numDirections, batchSize, hiddenSize », then throw a TypeError.
- If options.activations exists, then:
  - Let activations be a clone of options.activations.
- Otherwise:
  - Let activations be « "sigmoid", "tanh", "tanh" ».
- Calculate the output shape:
  - Let desc be the result of creating an MLOperandDescriptor given input’s dataType and « numDirections, batchSize, hiddenSize ».
  - If options.returnSequence is true, then:
    - Let desc2 be the result of creating an MLOperandDescriptor given input’s dataType and « steps, numDirections, batchSize, hiddenSize ».
- Make graph connections:
  - Let operator be an operator for the "lstm" operation, given weight, recurrentWeight, steps, hiddenSize and options.
  - Let output0 be the result of creating an MLOperand given this and desc.
  - Let output1 be the result of creating an MLOperand given this and desc.
  - If options.returnSequence is true, then:
    - Let output2 be the result of creating an MLOperand given this and desc2.
    - Let output be the list « output0, output1, output2 ».
    - Set output0.[[operator]], output1.[[operator]] and output2.[[operator]] to operator.
  - Otherwise:
    - Let output be the list « output0, output1 ».
    - Set output0.[[operator]] and output1.[[operator]] to operator.
  - Set operator’s inputs to input, weight, and recurrentWeight.
  - If options.bias exists, then add it to operator’s inputs.
  - If options.recurrentBias exists, then add it to operator’s inputs.
  - If options.peepholeWeight exists, then add it to operator’s inputs.
  - If options.initialHiddenState exists, then add it to operator’s inputs.
  - If options.initialCellState exists, then add it to operator’s inputs.
  - Set operator’s activation functions to a clone of activations.
  - Set operator’s output to output.
- Return output.
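The output shape calculation in the steps above can be sketched as a small standalone function. This is an illustration that works on plain shape arrays and does not call the WebNN API. The helper name `lstmOutputShapes` is hypothetical, and it performs none of the validation the method steps require.

```javascript
// Hypothetical helper mirroring the "Calculate the output shape" step:
// given the input shape [steps, batchSize, inputSize], returns the shapes of
// the operands that lstm() would produce. No validation is performed.
function lstmOutputShapes(inputShape, hiddenSize, options = {}) {
  const [steps, batchSize] = inputShape;
  const numDirections = options.direction === 'both' ? 2 : 1;
  // The hidden state and the cell state share the same 3-D shape.
  const shapes = [
    [numDirections, batchSize, hiddenSize],
    [numDirections, batchSize, hiddenSize],
  ];
  if (options.returnSequence) {
    // The optional third output holds the hidden state of every time step.
    shapes.push([steps, numDirections, batchSize, hiddenSize]);
  }
  return shapes;
}

console.log(lstmOutputShapes([5, 2, 8], 16, { direction: 'both', returnSequence: true }));
console.log(lstmOutputShapes([5, 2, 8], 16));
```

The returned list has two entries unless returnSequence is true, in which case a third 4-D entry is appended, matching the two branches of the graph connection steps.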
Using a squeeze() helper, the behavior of this operation can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function lstm(builder, input, weight, recurrentWeight, steps, hiddenSize, options) {
  const batchSize = input.shape[1];
  const inputSize = input.shape[2];
  const direction = options.direction || 'forward';
  const numDirections = (direction == 'both' ? 2 : 1);
  let hiddenState = options.initialHiddenState;
  let cellState = options.initialCellState;

  if (!hiddenState) {
    const desc = {
      dataType: 'float32',
      shape: [numDirections, batchSize, hiddenSize]
    };
    const totalSize = numDirections * batchSize * hiddenSize;
    hiddenState = builder.constant(desc, new Float32Array(totalSize).fill(0));
  }

  if (!cellState) {
    const desc = {
      dataType: 'float32',
      shape: [numDirections, batchSize, hiddenSize]
    };
    const totalSize = numDirections * batchSize * hiddenSize;
    cellState = builder.constant(desc, new Float32Array(totalSize).fill(0));
  }

  let currentWeight = [];
  let currentRecurrentWeight = [];
  let currentBias = [];
  let currentRecurrentBias = [];
  let currentPeepholeWeight = [];

  let forwardSequence = null;
  let backwardSequence = null;
  let outputHidden = null;
  let outputCell = null;

  for (let dir = 0; dir < numDirections; ++dir) {
    currentWeight.push(squeeze(
      builder,
      builder.slice(weight, [dir, 0, 0], [1, 4 * hiddenSize, inputSize])));
    currentRecurrentWeight.push(squeeze(
      builder,
      builder.slice(recurrentWeight, [dir, 0, 0], [1, 4 * hiddenSize, hiddenSize])));
    currentBias.push(options.bias ?
      (squeeze(builder, builder.slice(options.bias, [dir, 0], [1, 4 * hiddenSize]))) :
      null);
    currentRecurrentBias.push(options.recurrentBias ?
      (squeeze(builder, builder.slice(options.recurrentBias, [dir, 0], [1, 4 * hiddenSize]))) :
      null);
    currentPeepholeWeight.push(options.peepholeWeight ?
      (squeeze(builder, builder.slice(options.peepholeWeight, [dir, 0], [1, 3 * hiddenSize]))) :
      null);

    let currentHidden = squeeze(
      builder,
      builder.slice(hiddenState, [dir, 0, 0], [1, batchSize, hiddenSize]));
    let currentCell = squeeze(
      builder,
      builder.slice(cellState, [dir, 0, 0], [1, batchSize, hiddenSize]));

    for (let step = 0; step < steps; ++step) {
      const slice = (dir == 1 || direction == 'backward' ? steps - step - 1 : step);
      const currentInput = squeeze(
        builder,
        builder.slice(input, [slice, 0, 0], [1, batchSize, inputSize]));

      [currentHidden, currentCell] = builder.lstmCell(
        currentInput,
        currentWeight[dir],
        currentRecurrentWeight[dir],
        currentHidden,
        currentCell,
        hiddenSize,
        {
          bias: currentBias[dir],
          recurrentBias: currentRecurrentBias[dir],
          peepholeWeight: currentPeepholeWeight[dir],
          layout: options.layout,
          activations: options.activations
        });

      if (options.returnSequence) {
        // Expand currentHidden of 2D([batchSize, hiddenSize])
        // to 4D([steps, numDirections, batchSize, hiddenSize])
        const expandedHiddenAs4D =
          builder.reshape(currentHidden, [1, 1, batchSize, hiddenSize]);

        if (direction == 'forward' || (dir == 0 && direction == 'both')) {
          forwardSequence = forwardSequence ?
            builder.concat([forwardSequence, expandedHiddenAs4D], 0) :
            expandedHiddenAs4D;
        } else if (direction == 'backward' || (dir == 1 && direction == 'both')) {
          backwardSequence = backwardSequence ?
            builder.concat([expandedHiddenAs4D, backwardSequence], 0) :
            expandedHiddenAs4D;
        }
      }
    }

    // Expand currentHidden of 2D([batchSize, hiddenSize])
    // to 3D([numDirections, batchSize, hiddenSize])
    const expandedHiddenAs3D =
      builder.reshape(currentHidden, [1, batchSize, hiddenSize]);
    outputHidden = outputHidden ?
      builder.concat([outputHidden, expandedHiddenAs3D], 0) :
      expandedHiddenAs3D;

    // Expand currentCell of 2D([batchSize, hiddenSize])
    // to 3D([numDirections, batchSize, hiddenSize])
    const expandedCellAs3D =
      builder.reshape(currentCell, [1, batchSize, hiddenSize]);
    outputCell = outputCell ?
      builder.concat([outputCell, expandedCellAs3D], 0) :
      expandedCellAs3D;
  }

  if (options.returnSequence) {
    let outputSequence = null;
    if (direction == 'forward') {
      outputSequence = forwardSequence;
    } else if (direction == 'backward') {
      outputSequence = backwardSequence;
    } else if (direction == 'both') {
      // Concat along axis 1 (numDirections dimension)
      outputSequence = builder.concat([forwardSequence, backwardSequence], 1);
    }
    return [outputHidden, outputCell, outputSequence];
  } else {
    return [outputHidden, outputCell];
  }
}
7.9.34. lstmCell
A single time step of the Long Short-Term Memory [LSTM] recurrent network using a cell state, an input, output, and forget gate to compute the cell state and the hidden state of the next time step that rolls into the output across the temporal sequence of the network.

dictionary MLLstmCellOptions : MLOperatorOptions {
  MLOperand bias;
  MLOperand recurrentBias;
  MLOperand peepholeWeight;
  MLLstmWeightLayout layout = "iofg";
  sequence<MLRecurrentNetworkActivation> activations;
};

partial interface MLGraphBuilder {
  sequence<MLOperand> lstmCell(MLOperand input,
                               MLOperand weight,
                               MLOperand recurrentWeight,
                               MLOperand hiddenState,
                               MLOperand cellState,
                               [EnforceRange] unsigned long hiddenSize,
                               optional MLLstmCellOptions options = {});
};

dictionary MLLstmCellSupportLimits {
  MLTensorLimits input;
  MLTensorLimits weight;
  MLTensorLimits recurrentWeight;
  MLTensorLimits hiddenState;
  MLTensorLimits cellState;
  MLTensorLimits bias;
  MLTensorLimits recurrentBias;
  MLTensorLimits peepholeWeight;
  MLDataTypeLimits outputs;
};

partial dictionary MLOpSupportLimits {
  MLLstmCellSupportLimits lstmCell;
};
MLLstmCellOptions has the following members:
- bias, of type MLOperand
  The 1-D input bias tensor of shape [4 * hiddenSize]. The ordering of the bias vectors in the first dimension of the tensor shape is specified according to layout.
- recurrentBias, of type MLOperand
  The 1-D recurrent bias tensor of shape [4 * hiddenSize]. The ordering of the bias vectors in the first dimension of the tensor shape is specified according to layout.
- peepholeWeight, of type MLOperand
  The 1-D weight tensor for peepholes of shape [3 * hiddenSize]. The pack ordering of the weight vectors is for the input (i), output (o), and forget (f) gate, respectively.
- layout, of type MLLstmWeightLayout, defaulting to "iofg"
  The ordering of the weight and bias vectors for the internal gates of LSTM, specifically the input (i), output (o), forget (f), and cell (g) gate, as indicated in the first dimension of the weight and bias tensor shapes.
- activations, of type sequence<MLRecurrentNetworkActivation>
  A list of three activation functions: the first one is used for the input (i), forget (f), and output (o) gate, the second one is used for the cell (g) gate, and the last is used for filtering the output cell state before combining it with the result of the output gate to form the output hidden state. When not specified, defaults to a sequence of the "sigmoid", "tanh", and "tanh" functions, respectively.

Arguments:
- input: an MLOperand. The input 2-D tensor of shape [batchSize, inputSize].
- weight: an MLOperand. The 2-D input weight tensor of shape [4 * hiddenSize, inputSize]. The ordering of the weight vectors in the first dimension of the tensor shape is specified according to layout.
- recurrentWeight: an MLOperand. The 2-D recurrent weight tensor of shape [4 * hiddenSize, hiddenSize]. The ordering of the weight vectors in the first dimension of the tensor shape is specified according to layout.
- hiddenState: an MLOperand. The 2-D input hidden state tensor of shape [batchSize, hiddenSize].
- cellState: an MLOperand. The 2-D input cell state tensor of shape [batchSize, hiddenSize].
- hiddenSize: an unsigned long scalar. The value of the second dimension of the output tensor shape. It indicates the number of features in the hidden state.
- options: an optional MLLstmCellOptions. The optional parameters of the operation.

Returns: sequence<MLOperand>. The first element is the output hidden state of the current time step of the recurrent network. The following element is the output cell state. Both elements are 2-D tensors of shape [batchSize, hiddenSize].
operand | allowed data types | allowed ranks |
---|---|---|
input | "float32", "float16" | 2 |
weight | same as input | 2 |
recurrentWeight | same as input | 2 |
hiddenState | same as input | 2 |
cellState | same as input | 2 |
bias | same as input | 1 |
recurrentBias | same as input | 1 |
peepholeWeight | same as input | 1 |
outputs[0] | same as input | 2 |
outputs[1] | same as input | 2 |
MLLstmCellSupportLimits has the following members:
- input, of type MLTensorLimits
  MLTensorLimits for input operand.
- weight, of type MLTensorLimits
  MLTensorLimits for weight operand.
- recurrentWeight, of type MLTensorLimits
  MLTensorLimits for recurrentWeight operand.
- hiddenState, of type MLTensorLimits
  MLTensorLimits for hiddenState operand.
- cellState, of type MLTensorLimits
  MLTensorLimits for cellState operand.
- bias, of type MLTensorLimits
  MLTensorLimits for bias operand.
- recurrentBias, of type MLTensorLimits
  MLTensorLimits for recurrentBias operand.
- peepholeWeight, of type MLTensorLimits
  MLTensorLimits for peepholeWeight operand.
- outputs, of type MLDataTypeLimits
  MLDataTypeLimits for all the output operands.

MLOpSupportLimits has the following member for lstmCell():
- lstmCell, of type MLLstmCellSupportLimits
  Support limits for operator lstmCell().
The lstmCell(input, weight, recurrentWeight, hiddenState, cellState, hiddenSize, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and any of input, weight, recurrentWeight, hiddenState, cellState, options.bias (if it exists), options.recurrentBias (if it exists), and options.peepholeWeight (if it exists) returns false, then throw a TypeError.
- If the dataType of any of input, weight, recurrentWeight, hiddenState or cellState is not one of its allowed data types (according to this table), then throw a TypeError.
- If the rank of any of input, weight, recurrentWeight, hiddenState or cellState is not its allowed rank, then throw a TypeError.
- Let batchSize be input’s shape[0].
- Let inputSize be input’s shape[1].
- If weight’s shape is not equal to « 4 * hiddenSize, inputSize », then throw a TypeError.
- If recurrentWeight’s shape is not equal to « 4 * hiddenSize, hiddenSize », then throw a TypeError.
- If hiddenState’s shape is not equal to « batchSize, hiddenSize », then throw a TypeError.
- If cellState’s shape is not equal to « batchSize, hiddenSize », then throw a TypeError.
- If hiddenSize * 8 is not a valid dimension, then throw a TypeError.
  Why hiddenSize * 8? Some underlying platforms operate on a single bias tensor which is a concatenation of bias and recurrentBias. Therefore, 4 * hiddenSize + 4 * hiddenSize also needs to be a valid dimension.
- If options.bias exists, then:
  - If its dataType is not one of its allowed data types (according to this table), then throw a TypeError.
  - If its shape is not equal to « 4 * hiddenSize », then throw a TypeError.
- If options.recurrentBias exists, then:
  - If its dataType is not one of its allowed data types (according to this table), then throw a TypeError.
  - If its shape is not equal to « 4 * hiddenSize », then throw a TypeError.
- If options.peepholeWeight exists, then:
  - If its dataType is not one of its allowed data types (according to this table), then throw a TypeError.
  - If its shape is not equal to « 3 * hiddenSize », then throw a TypeError.
- If options.activations exists, then:
  - Let activations be a clone of options.activations.
- Otherwise:
  - Let activations be the default sequence « "sigmoid", "tanh", "tanh" » described for the activations member.
- Let desc be a new MLOperandDescriptor.
- Make graph connections:
  - Let output0 be the result of creating an MLOperand given this and desc.
  - Let output1 be the result of creating an MLOperand given this and desc.
  - Let output be the list « output0, output1 ».
  - Let operator be an operator for the "lstmCell" operation, given weight, recurrentWeight, hiddenState, cellState, hiddenSize and options.
  - Set output0.[[operator]] and output1.[[operator]] to operator.
  - Set operator’s inputs to input, weight, recurrentWeight, hiddenState, and cellState.
  - If options.bias exists, then add it to operator’s inputs.
  - If options.recurrentBias exists, then add it to operator’s inputs.
  - If options.peepholeWeight exists, then add it to operator’s inputs.
  - Set operator’s activation functions to a clone of activations.
  - Set operator’s output to output.
- Return output.
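The shape checks in the steps above can be sketched in plain JavaScript. This is a minimal illustration, not part of the API: checkLstmCellShapes is a hypothetical helper that takes shapes as plain arrays, and it deliberately omits the data type, rank-by-table, and valid dimension checks.

```javascript
// Hypothetical helper (not a WebNN API): mirrors the shape comparisons in the
// lstmCell() method steps. Shapes are plain arrays of numbers.
function checkLstmCellShapes(shapes, hiddenSize) {
  const {input, weight, recurrentWeight, hiddenState, cellState,
         bias, recurrentBias, peepholeWeight} = shapes;

  const sameShape = (a, b) =>
      a.length === b.length && a.every((d, i) => d === b[i]);
  const expectShape = (name, actual, expected) => {
    if (!sameShape(actual, expected)) {
      throw new TypeError(`${name}: expected [${expected}], got [${actual}]`);
    }
  };

  if (input.length !== 2) throw new TypeError('input must be a 2-D tensor');
  const [batchSize, inputSize] = input;

  expectShape('weight', weight, [4 * hiddenSize, inputSize]);
  expectShape('recurrentWeight', recurrentWeight, [4 * hiddenSize, hiddenSize]);
  expectShape('hiddenState', hiddenState, [batchSize, hiddenSize]);
  expectShape('cellState', cellState, [batchSize, hiddenSize]);

  // Optional operands are only checked when they are provided.
  if (bias) expectShape('bias', bias, [4 * hiddenSize]);
  if (recurrentBias) expectShape('recurrentBias', recurrentBias, [4 * hiddenSize]);
  if (peepholeWeight) expectShape('peepholeWeight', peepholeWeight, [3 * hiddenSize]);

  // Both outputs (hidden state and cell state) share this shape.
  return [batchSize, hiddenSize];
}
```

For example, with batchSize 2, inputSize 3 and hiddenSize 4, the weight tensor has to be [16, 3] and the peephole weight, if given, [12].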
The behavior of this operation, when the weight layout is the default "iofg" layout and the activation functions are sigmoid() for the input/forget/output gates and tanh() for both the cell gate and the cell state’s filter for the output hidden state, can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function lstmCell(builder, input, weight, recurrentWeight, hiddenState, cellState, hiddenSize, options) {
  const zero = builder.constant(input.dataType, 0);
  const inputSize = input.shape[1];

  // input gate (i)
  let i = builder.sigmoid(
    builder.add(
      builder.mul(
        cellState,
        (options.peepholeWeight ?
            builder.slice(options.peepholeWeight, [0], [hiddenSize]) : zero)),
      builder.add(
        builder.add(
          (options.bias ? builder.slice(options.bias, [0], [hiddenSize]) : zero),
          (options.recurrentBias ?
              builder.slice(options.recurrentBias, [0], [hiddenSize]) : zero)),
        builder.add(
          builder.matmul(
            input,
            builder.transpose(builder.slice(weight, [0, 0], [hiddenSize, inputSize]))),
          builder.matmul(
            hiddenState,
            builder.transpose(
                builder.slice(recurrentWeight, [0, 0], [hiddenSize, hiddenSize])))))));

  // forget gate (f)
  let f = builder.sigmoid(
    builder.add(
      builder.mul(
        cellState,
        (options.peepholeWeight ?
            builder.slice(options.peepholeWeight, [2 * hiddenSize], [hiddenSize]) : zero)),
      builder.add(
        builder.add(
          (options.bias ? builder.slice(options.bias, [2 * hiddenSize], [hiddenSize]) : zero),
          (options.recurrentBias ?
              builder.slice(options.recurrentBias, [2 * hiddenSize], [hiddenSize]) : zero)),
        builder.add(
          builder.matmul(
            input,
            builder.transpose(
                builder.slice(weight, [2 * hiddenSize, 0], [hiddenSize, inputSize]))),
          builder.matmul(
            hiddenState,
            builder.transpose(
                builder.slice(recurrentWeight, [2 * hiddenSize, 0], [hiddenSize, hiddenSize])))))));

  // cell gate (g)
  let g = builder.tanh(
    builder.add(
      builder.add(
        (options.bias ? builder.slice(options.bias, [3 * hiddenSize], [hiddenSize]) : zero),
        (options.recurrentBias ?
            builder.slice(options.recurrentBias, [3 * hiddenSize], [hiddenSize]) : zero)),
      builder.add(
        builder.matmul(
          input,
          builder.transpose(
              builder.slice(weight, [3 * hiddenSize, 0], [hiddenSize, inputSize]))),
        builder.matmul(
          hiddenState,
          builder.transpose(
              builder.slice(recurrentWeight, [3 * hiddenSize, 0], [hiddenSize, hiddenSize]))))));

  // output gate (o)
  let o = builder.sigmoid(
    builder.add(
      builder.mul(
        cellState,
        (options.peepholeWeight ?
            builder.slice(options.peepholeWeight, [hiddenSize], [hiddenSize]) : zero)),
      builder.add(
        builder.add(
          (options.bias ? builder.slice(options.bias, [hiddenSize], [hiddenSize]) : zero),
          (options.recurrentBias ?
              builder.slice(options.recurrentBias, [hiddenSize], [hiddenSize]) : zero)),
        builder.add(
          builder.matmul(
            input,
            builder.transpose(
                builder.slice(weight, [hiddenSize, 0], [hiddenSize, inputSize]))),
          builder.matmul(
            hiddenState,
            builder.transpose(
                builder.slice(recurrentWeight, [hiddenSize, 0], [hiddenSize, hiddenSize])))))));

  // output cell state (ct)
  let ct = builder.add(builder.mul(f, cellState), builder.mul(i, g));

  // output hidden state (ht)
  let ht = builder.mul(o, builder.tanh(ct));

  return [ht, ct];
}
7.9.35. matmul
Compute the matrix product of two input tensors.

partial interface MLGraphBuilder {
  MLOperand matmul(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});
};

partial dictionary MLOpSupportLimits {
  MLBinarySupportLimits matmul;
};

Arguments:
- a: an MLOperand. The first input tensor which is at least 2-D.
- b: an MLOperand. The second input tensor which is at least 2-D.
- options: an MLOperatorOptions. Specifies the optional parameters of the operation.

Returns: an MLOperand. The output tensor that contains the matrix product of two input tensors.
- If both a and b are 2-dimensional, they are multiplied like conventional matrices and produce a 2-dimensional tensor as the output.
- If either a or b is N-dimensional where N > 2, it is treated as a stack of matrices with dimensions corresponding to the last two indices. The matrix multiplication will be broadcast according to [numpy-broadcasting-rule]. The shapes of a and b, except the last two dimensions, must be bidirectionally broadcastable. The output is an N-dimensional tensor whose rank is the maximum rank of the input tensors. For each dimension, except the last two, of the output tensor, its size is the maximum size along that dimension of the input tensors.
operand | allowed data types | allowed ranks |
---|---|---|
a | "float32", "float16" | 2 or greater |
b | same as a | 2 or greater |
output | same as a | maximum of a’s rank and b’s rank |
MLOpSupportLimits has the following member for matmul():
- matmul, of type MLBinarySupportLimits
  Support limits for operator matmul().
The matmul(a, b, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and any of a and b returns false, then throw a TypeError.
- If the dataType of any of a or b is not one of its allowed data types (according to this table), then throw a TypeError.
- Calculate the output shape:
  - Let rankA be a’s rank.
  - Let rankB be b’s rank.
  - If either rankA or rankB is less than 2, then throw a TypeError.
  - Let colsA be shapeA[rankA - 1].
  - Let rowsA be shapeA[rankA - 2].
  - Let colsB be shapeB[rankB - 1].
  - Let rowsB be shapeB[rankB - 2].
  - Let batchShapeA be a clone of shapeA with the spatial dimensions (last 2 items) removed.
  - Let batchShapeB be a clone of shapeB with the spatial dimensions (last 2 items) removed.
  - Let outputShape be the result of bidirectionally broadcasting batchShapeA and batchShapeB. If that returns failure, then throw a TypeError.
  - Append « rowsA, colsB » to outputShape.
- Let desc be the result of creating an MLOperandDescriptor given a’s dataType and outputShape.
- Make graph connections:
  - Let output be the result of creating an MLOperand given this and desc.
  - Let operator be an operator for the "matmul" operation, given options.
  - Set output.[[operator]] to operator.
  - Set operator’s inputs to a and b.
  - Set operator’s output to output.
- Return output.
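The output shape calculation above can be sketched in plain JavaScript on shape arrays. Both function names are hypothetical helpers, not part of the API. The check that the inner dimensions agree (colsA equals rowsB) is not spelled out in the steps as listed here; it is included as an assumption because matrix multiplication is otherwise undefined.

```javascript
// Hypothetical helper: bidirectional (NumPy-style) broadcast of two shapes.
function broadcastShapes(shapeA, shapeB) {
  const rank = Math.max(shapeA.length, shapeB.length);
  const out = new Array(rank);
  for (let i = 0; i < rank; ++i) {
    // Align from the trailing dimension; a missing dimension counts as 1.
    const a = shapeA[shapeA.length - 1 - i] ?? 1;
    const b = shapeB[shapeB.length - 1 - i] ?? 1;
    if (a !== b && a !== 1 && b !== 1) {
      throw new TypeError(`shapes are not broadcastable: ${a} vs ${b}`);
    }
    out[rank - 1 - i] = (a === 1 ? b : a);
  }
  return out;
}

// Hypothetical helper: output shape of matmul for two operand shapes.
function matmulOutputShape(shapeA, shapeB) {
  if (shapeA.length < 2 || shapeB.length < 2) {
    throw new TypeError('both operands must be at least 2-D');
  }
  const [rowsA, colsA] = shapeA.slice(-2);
  const [rowsB, colsB] = shapeB.slice(-2);
  // Assumption: inner dimensions must agree for the product to be defined.
  if (colsA !== rowsB) {
    throw new TypeError(`inner dimensions differ: ${colsA} vs ${rowsB}`);
  }
  // Broadcast everything except the last two (matrix) dimensions.
  const batchShape = broadcastShapes(shapeA.slice(0, -2), shapeB.slice(0, -2));
  return [...batchShape, rowsA, colsB];
}
```

With these helpers, shapes [2, 3] and [3, 4] give [2, 4], and shapes [5, 1, 2, 3] and [4, 3, 6] give [5, 4, 2, 6], since the batch shapes [5, 1] and [4] broadcast to [5, 4].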
7.9.36. pad
Inflate the tensor with constant or mirrored values on the edges.

enum MLPaddingMode {
  "constant",
  "edge",
  "reflection"
};

dictionary MLPadOptions : MLOperatorOptions {
  MLPaddingMode mode = "constant";
  MLNumber value = 0;
};

partial interface MLGraphBuilder {
  MLOperand pad(MLOperand input,
                sequence<[EnforceRange] unsigned long> beginningPadding,
                sequence<[EnforceRange] unsigned long> endingPadding,
                optional MLPadOptions options = {});
};

partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits pad;
};
MLPadOptions has the following members:
- mode, of type MLPaddingMode, defaulting to "constant"
  The different ways to pad the tensor.
- value, of type MLNumber, defaulting to 0
  The padding value when mode is set to "constant".

Arguments:
- input: an MLOperand. The input tensor.
- beginningPadding: sequence<unsigned long>. The number of padding values to add at the beginning of each input dimension, of length N where N is the rank of the input tensor. For each dimension d of input, beginningPadding[d] indicates how many values to add before the content in that dimension.
- endingPadding: sequence<unsigned long>. The number of padding values to add at the ending of each input dimension, of length N where N is the rank of the input tensor. For each dimension d of input, endingPadding[d] indicates how many values to add after the content in that dimension.
- options: an optional MLPadOptions. The optional parameters of the operation.

Returns: an MLOperand. The padded output tensor. Each dimension of the output tensor can be calculated as follows:
output size = beginning padding + input size + ending padding
operand | allowed data types | allowed ranks |
---|---|---|
input | any | N |
output | same as input | same as input |
MLOpSupportLimits has the following member for pad():
- pad, of type MLSingleInputSupportLimits
  Support limits for operator pad().
The pad(input, beginningPadding, endingPadding, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- If beginningPadding’s size and endingPadding’s size are not both equal to input’s rank, then throw a TypeError.
- Let desc be a copy of input.[[descriptor]].
- Let outputShape be a copy of input’s shape.
- For each index in the range 0 to outputShape’s rank, exclusive:
  - Switch on options.mode:
    - "constant": Do nothing.
    - "edge": Do nothing.
    - "reflection":
  - Add to outputShape[index] the value of beginningPadding[index].
  - Add to outputShape[index] the value of endingPadding[index].
- If any item in outputShape is not a valid dimension, then throw a TypeError.
- Set options.value to the result of casting options.value to input’s dataType.
- Set desc.shape to outputShape.
- Make graph connections:
  - Let output be the result of creating an MLOperand given this and desc.
  - Let operator be an operator for the "padding" operation, given beginningPadding, endingPadding and options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
Examples for constant, edge, and reflection padding:
// input: [[1,2,3], [4,5,6]]
const input = builder.constant(
    {dataType: 'float32', shape: [2, 3]}, new Float32Array([1, 2, 3, 4, 5, 6]));
const beginningPadding = [1, 2];
const endingPadding = [1, 2];

// "constant" padded:
// [[0,0,0,0,0,0,0],
//  [0,0,1,2,3,0,0],
//  [0,0,4,5,6,0,0],
//  [0,0,0,0,0,0,0]]
builder.pad(input, beginningPadding, endingPadding);

// "edge" padded:
// [[1,1,1,2,3,3,3],
//  [1,1,1,2,3,3,3],
//  [4,4,4,5,6,6,6],
//  [4,4,4,5,6,6,6]]
builder.pad(input, beginningPadding, endingPadding, {mode: 'edge'});

// "reflection" padded:
// [[6,5,4,5,6,5,4],
//  [3,2,1,2,3,2,1],
//  [6,5,4,5,6,5,4],
//  [3,2,1,2,3,2,1]]
builder.pad(input, beginningPadding, endingPadding, {mode: 'reflection'});
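To make the three modes concrete, the following is a minimal reference sketch in plain JavaScript that reproduces the padded results listed above for a 2-D input given as nested arrays. The names sourceIndex and pad2d are hypothetical and not part of the API. How reflection behaves when the padding exceeds the input size minus one is an assumption of this sketch (it keeps mirroring periodically).

```javascript
// Map an output coordinate (already shifted by the beginning padding) back to
// an input index for one dimension. Returns -1 when the position should take
// the constant padding value.
function sourceIndex(i, size, mode) {
  if (i >= 0 && i < size) return i;
  switch (mode) {
    case 'constant':
      return -1;
    case 'edge':
      // Clamp to the nearest edge element.
      return Math.min(Math.max(i, 0), size - 1);
    case 'reflection': {
      if (size === 1) return 0;
      // Mirror around the first and last elements without repeating them.
      const period = 2 * (size - 1);
      const m = ((i % period) + period) % period;
      return m < size ? m : period - m;
    }
    default:
      throw new TypeError(`unknown padding mode: ${mode}`);
  }
}

// Hypothetical 2-D reference: rows is an array of equally sized arrays.
function pad2d(rows, beginningPadding, endingPadding, mode = 'constant', value = 0) {
  const height = rows.length;
  const width = rows[0].length;
  // output size = beginning padding + input size + ending padding
  const outHeight = beginningPadding[0] + height + endingPadding[0];
  const outWidth = beginningPadding[1] + width + endingPadding[1];
  const output = [];
  for (let y = 0; y < outHeight; ++y) {
    const sy = sourceIndex(y - beginningPadding[0], height, mode);
    const row = [];
    for (let x = 0; x < outWidth; ++x) {
      const sx = sourceIndex(x - beginningPadding[1], width, mode);
      row.push(sy < 0 || sx < 0 ? value : rows[sy][sx]);
    }
    output.push(row);
  }
  return output;
}
```

Calling pad2d([[1, 2, 3], [4, 5, 6]], [1, 2], [1, 2], 'reflection') yields the 4 x 7 "reflection" result shown in the example above.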
7.9.37. Pooling operations
Compute a pooling operation across all the elements within the moving window over the input tensor.

dictionary MLPool2dOptions : MLOperatorOptions {
  sequence<[EnforceRange] unsigned long> windowDimensions;
  sequence<[EnforceRange] unsigned long> padding;
  sequence<[EnforceRange] unsigned long> strides;
  sequence<[EnforceRange] unsigned long> dilations;
  MLInputOperandLayout layout = "nchw";
  MLRoundingType roundingType = "floor";
  sequence<[EnforceRange] unsigned long> outputSizes;
};

partial interface MLGraphBuilder {
  MLOperand averagePool2d(MLOperand input, optional MLPool2dOptions options = {});
  MLOperand l2Pool2d(MLOperand input, optional MLPool2dOptions options = {});
  MLOperand maxPool2d(MLOperand input, optional MLPool2dOptions options = {});
};

partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits averagePool2d;
  MLSingleInputSupportLimits l2Pool2d;
  MLSingleInputSupportLimits maxPool2d;
};
MLPool2dOptions has the following members:
- windowDimensions, of type sequence<[EnforceRange] unsigned long>
  A list of length 2: [windowHeight, windowWidth]. Specifies the dimensions of the sliding window. The default value is the height and width dimensions of the input shape.
- padding, of type sequence<[EnforceRange] unsigned long>
  A list of length 4: [beginningHeight, endingHeight, beginningWidth, endingWidth]. Specifies the additional rows and columns added to the beginning and ending of each spatial dimension of the convolution input. The default value is [0,0,0,0].
- strides, of type sequence<[EnforceRange] unsigned long>
  A list of length 2: [strideHeight, strideWidth]. Specifies the stride of the sliding window for each spatial dimension of the convolution input. The default value is [1,1].
- dilations, of type sequence<[EnforceRange] unsigned long>
  A list of length 2: [dilationHeight, dilationWidth]. Specifies the dilation factor for each spatial dimension applied on the convolution filter (kernel). The default value is [1,1].
- layout, of type MLInputOperandLayout, defaulting to "nchw"
  Specifies the layout format of the input and output tensor as follows:
- roundingType, of type MLRoundingType, defaulting to "floor"
  The rounding function used to compute the output shape.
- outputSizes, of type sequence<[EnforceRange] unsigned long>
  A list of length 2: [outputHeight, outputWidth]. Specifies the sizes of the two spatial dimensions of the output tensor. The spatial dimensions of the output tensor can be calculated for a single dimension via:
  output size = ((input size - filter size + beginning padding + ending padding) / stride) + 1
  Then the caller either applies a floor or ceiling depending on whether partial window results are desired. When the output sizes are explicitly specified, the roundingType is ignored. If not specified, the output sizes are automatically computed.
Arguments:
- input: an MLOperand. The input 4-D tensor. The logical shape is interpreted according to the value of layout.
- options: an optional MLPool2dOptions. The optional parameters of the operation.

Returns: an MLOperand. The output 4-D tensor that contains the result of the reduction. The logical shape is interpreted according to the value of layout. More specifically, if the roundingType is "floor", the spatial dimensions of the output tensor can be calculated as follows:
output size = floor(1 + (input size - filter size + beginning padding + ending padding) / stride)
or if roundingType is "ceil":
output size = ceil(1 + (input size - filter size + beginning padding + ending padding) / stride)
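The two formulas above can be sketched for a single spatial dimension as follows. poolOutputSize is a hypothetical helper name, not part of the API, and like the formulas as written it does not account for dilations.

```javascript
// Hypothetical helper: output size of one spatial dimension of a pooling
// operation, following the floor/ceil formulas given above.
function poolOutputSize(inputSize, filterSize, beginningPadding, endingPadding,
                        stride, roundingType = 'floor') {
  const unrounded =
      1 + (inputSize - filterSize + beginningPadding + endingPadding) / stride;
  return roundingType === 'ceil' ? Math.ceil(unrounded) : Math.floor(unrounded);
}
```

For an input size of 8, a window of 3, no padding and a stride of 2, the unrounded value is 3.5, so "floor" drops the final partial window (3 outputs) while "ceil" keeps it (4 outputs).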
operand | allowed data types | allowed ranks |
---|---|---|
input | specified as part of operation steps | 4 |
output | same as input | 4 |
MLOpSupportLimits has the following members for pooling operations:
- averagePool2d, of type MLSingleInputSupportLimits
  Support limits for operator averagePool2d().
- l2Pool2d, of type MLSingleInputSupportLimits
  Support limits for operator l2Pool2d().
- maxPool2d, of type MLSingleInputSupportLimits
  Support limits for operator maxPool2d().

// 'global' max pooling
builder.maxPool2d(input);
To create a pooling operation given string op, MLOperand input, MLPool2dOptions options, and optional list allowedDataTypes, run the following steps:
- Assert: op is one of "averagePool2d", "l2Pool2d", "maxPool2d".
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- If allowedDataTypes is given and it does not contain input’s dataType, then throw a TypeError.
- Switch on options.layout:
- If options.windowDimensions does not exist, then set options.windowDimensions to « inputHeight, inputWidth ».
- If options.windowDimensions’s size is not 2, then throw a TypeError.
- If any item in options.windowDimensions is equal to 0, then throw a TypeError.
- If options.outputSizes exists, or if options.padding does not exist, then set options.padding to the list « 0, 0, 0, 0 ».
- If options.padding’s size is not 4, then throw a TypeError.
- If options.strides does not exist, then set options.strides to the list « 1, 1 ».
- If options.strides’s size is not 2, then throw a TypeError.
- If any item in options.strides is 0, then throw a TypeError.
- If options.outputSizes exists, then:
- If options.dilations does not exist, then set options.dilations to the list « 1, 1 ».
- If options.dilations’s size is not 2, then throw a TypeError.
- If any item in options.dilations is 0, then throw a TypeError.
- Let desc be a copy of input.[[descriptor]].
- Calculate the output shape:
  - Let « windowHeight, windowWidth » be options.windowDimensions.
  - Let « calculatedOutputHeight, calculatedOutputWidth » be the result of calculating conv2d output sizes given inputHeight, inputWidth, windowHeight, windowWidth, options.padding, options.strides, and options.dilations.
  - If options.outputSizes exists, then:
    - Let « outputHeight, outputWidth » be options.outputSizes.
    - If neither outputHeight equals floor(calculatedOutputHeight) and outputWidth equals floor(calculatedOutputWidth), nor outputHeight equals ceil(calculatedOutputHeight) and outputWidth equals ceil(calculatedOutputWidth), then throw a TypeError.
  - Otherwise:
    - Let « outputHeight, outputWidth » be « calculatedOutputHeight, calculatedOutputWidth ».
    - Switch on options.roundingType:
      - "floor": Set outputWidth to floor(outputWidth). Set outputHeight to floor(outputHeight).
      - "ceil": Set outputWidth to ceiling(outputWidth). Set outputHeight to ceiling(outputHeight).
  - If either outputHeight or outputWidth is not a valid dimension, then throw a TypeError.
  - Switch on options.layout:
  - Set desc.shape to outputShape.
- Make graph connections:
  - Let output be the result of creating an MLOperand given this and desc.
  - Let operator be an operator for the op operation, given options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
The following pooling algorithms are supported.

The averagePool2d(input, options) method steps are:
- Let output be the result of creating a pooling operation given "averagePool2d", input, options, and « "float32", "float16" ».
- Return output.

The l2Pool2d(input, options) method steps are:
- Let output be the result of creating a pooling operation given "l2Pool2d", input, options, and « "float32", "float16" ».
- Return output.

The maxPool2d(input, options) method steps are:
- Let output be the result of creating a pooling operation given "maxPool2d", input and options.
- Return output.
7.9.37.1. averagePool2d
Calculate the average value for patches of a feature map, and use it to create a pooled feature map. See § 7.9.37 Pooling operations for more detail.
7.9.37.2. l2Pool2d
Apply the L2 norm function to a region of the input feature map. The L2 norm is the square root of the sum of the squares of its elements. See § 7.9.37 Pooling operations for more detail.
7.9.37.3. maxPool2d
Calculate the maximum value for patches of a feature map, and use it to create a pooled feature map. See § 7.9.37 Pooling operations for more detail.
7.9.38. prelu
Calculate the parametric version of the rectified linear function (Parametric ReLU) on the input tensor element-wise. Parametric ReLU is a type of leaky ReLU that, instead of having a fixed scalar slope like 0.01, makes the slope (coefficient of leakage) a parameter that is learned during the model training phase. The calculation follows the expression max(0, x) + slope * min(0, x).
The operation will be broadcast according to [numpy-broadcasting-rule] . The input tensors must be bidirectionally broadcastable . The rank of the output tensor is the maximum rank of the input tensors. For each dimension of the output tensor, its size is the maximum size along that dimension of the input tensors.
partial interface MLGraphBuilder {
  MLOperand prelu(MLOperand input, MLOperand slope, optional MLOperatorOptions options = {});
};

dictionary MLPreluSupportLimits {
  MLTensorLimits input;
  MLTensorLimits slope;
  MLDataTypeLimits output;
};

partial dictionary MLOpSupportLimits {
  MLPreluSupportLimits prelu;
};

Arguments:
- input: an MLOperand. The input tensor.
- slope: an MLOperand. The slope tensor. Its shape must be bidirectionally broadcastable to the shape of input.
- options: an MLOperatorOptions. Specifies the optional parameters of the operation.
Returns:
operand | allowed data types | allowed ranks |
---|---|---|
input | "float32", "float16", "int64", "int32", "int8" | N |
slope | same as input | N |
output | same as input | maximum of input’s rank and slope’s rank |
MLPreluSupportLimits has the following members:
- input, of type MLTensorLimits
  MLTensorLimits for input operand.
- slope, of type MLTensorLimits
  MLTensorLimits for slope operand.
- output, of type MLDataTypeLimits
  MLDataTypeLimits for output operand.

MLOpSupportLimits has the following member for prelu():
- prelu, of type MLPreluSupportLimits
  Support limits for operator prelu().
The prelu(input, slope, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and any of input and slope returns false, then throw a TypeError.
- If the dataType of any of input or slope is not one of its allowed data types (according to this table), then throw a TypeError.
- Let outputShape be the result of bidirectionally broadcasting slope’s shape and input’s shape.
- Let descriptor be the result of creating an MLOperandDescriptor given input’s dataType and outputShape.
- Make graph connections:
  - Let output be the result of creating an MLOperand given this and descriptor.
  - Let operator be an operator for the "prelu" operation, given slope and options.
  - Set output.[[operator]] to operator.
  - Set operator’s inputs to input and slope.
  - Set operator’s output to output.
- Return output.
The behavior of this operation can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function prelu(builder, input, slope) {
  return builder.add(
      builder.max(builder.constant(input.dataType, 0), input),
      builder.mul(slope, builder.min(builder.constant(input.dataType, 0), input)));
}
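As a numeric illustration of the expression max(0, x) + slope * min(0, x), here is a small sketch on plain JavaScript arrays. preluReference is a hypothetical name, not part of the API; it only handles a scalar slope or a slope array of the same length as the input, not general broadcasting.

```javascript
// Hypothetical element-wise reference for prelu on flat arrays.
// slope is either a single number or an array with one slope per element.
function preluReference(input, slope) {
  return input.map((x, i) => {
    const s = (typeof slope === 'number') ? slope : slope[i];
    // Positive values pass through; negative values are scaled by the slope.
    return Math.max(0, x) + s * Math.min(0, x);
  });
}
```

With a slope of 0.5, the input [-2, -1, 0, 3] maps to [-1, -0.5, 0, 3]: only the negative entries are scaled.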
7.9.39. Reduction operations
Reduce the input tensor along all dimensions, or along the axes specified in the axes array parameter. For each specified axis, the dimension with that index is reduced, i.e. the resulting tensor will not contain it, unless keepDimensions is specified. The values of the resulting tensor are calculated using the specified reduction function that takes as parameters all the input values across the reduced dimensions.

dictionary MLReduceOptions : MLOperatorOptions {
  sequence<[EnforceRange] unsigned long> axes;
  boolean keepDimensions = false;
};

partial interface MLGraphBuilder {
  MLOperand reduceL1(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceL2(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceLogSum(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceLogSumExp(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceMax(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceMean(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceMin(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceProduct(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceSum(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceSumSquare(MLOperand input, optional MLReduceOptions options = {});
};

partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits reduceL1;
  MLSingleInputSupportLimits reduceL2;
  MLSingleInputSupportLimits reduceLogSum;
  MLSingleInputSupportLimits reduceLogSumExp;
  MLSingleInputSupportLimits reduceMax;
  MLSingleInputSupportLimits reduceMean;
  MLSingleInputSupportLimits reduceMin;
  MLSingleInputSupportLimits reduceProduct;
  MLSingleInputSupportLimits reduceSum;
  MLSingleInputSupportLimits reduceSumSquare;
};
MLReduceOptions has the following members:
- axes, of type sequence<[EnforceRange] unsigned long>
  The dimensions to reduce, which also specifies which of the values in the input tensor are used with the reduction function. The axes in the list must be in the range [0, N-1] where N is the rank of the input tensor.
  If not present, all dimensions are reduced. The input values for the reduction function are all of the values in the input tensor.
  If present and not empty, the input values for the reduction function are all the values for the specified dimensions of the input tensor.
  If present and empty, no dimensions are reduced, and the shape of the output tensor is the same as the shape of the input tensor; the reduction function is applied to each value in the tensor individually.
- keepDimensions, of type boolean, defaulting to false
  If true, the output has the same rank as the input, setting any reduced dimensions to size 1.
Arguments:
- input: an MLOperand. The input tensor.
- options: an optional MLReduceOptions. The optional parameters of the operation.
Returns: an MLOperand. The reduced output tensor. If the input operand is a scalar, the reduction function is applied to the scalar value, and the output is also a scalar.
operand | allowed data types | allowed ranks |
---|---|---|
input | specified as part of operation steps | N |
output | same as input | 0 to input’s rank, depending on axes and keepDimensions |
MLOpSupportLimits has the following members for reduction operations:
- reduceL1, of type MLSingleInputSupportLimits
  Support limits for operator reduceL1().
- reduceL2, of type MLSingleInputSupportLimits
  Support limits for operator reduceL2().
- reduceLogSum, of type MLSingleInputSupportLimits
  Support limits for operator reduceLogSum().
- reduceLogSumExp, of type MLSingleInputSupportLimits
  Support limits for operator reduceLogSumExp().
- reduceMax, of type MLSingleInputSupportLimits
  Support limits for operator reduceMax().
- reduceMean, of type MLSingleInputSupportLimits
  Support limits for operator reduceMean().
- reduceMin, of type MLSingleInputSupportLimits
  Support limits for operator reduceMin().
- reduceProduct, of type MLSingleInputSupportLimits
  Support limits for operator reduceProduct().
- reduceSum, of type MLSingleInputSupportLimits
  Support limits for operator reduceSum().
- reduceSumSquare, of type MLSingleInputSupportLimits
  Support limits for operator reduceSumSquare().
- L1: Compute the L1 norm, the sum of the absolute value of the input values.
- L2: Compute the L2 norm, the square root of the sum of the square of the input values.
- LogSum: Compute the log value of the sum of the input values.
- LogSumExp: Compute the log value of the sum of the exponent of the input values.
- Max: Compute the maximum value of the input values.
- Mean: Compute the average value of the input values.
- Min: Compute the minimum value of the input values.
- Product: Compute the product of the input values.
- Sum: Compute the sum of the input values.
- SumSquare: Compute the sum of the square of the input values.
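The reduction functions listed above can be illustrated on a plain JavaScript array of numbers. This is only a sketch for intuition: the `reductions` object and its property names are hypothetical, it does not use MLOperand or MLGraphBuilder, and it ignores data type and precision concerns that a real implementation has to handle.

```javascript
// Hypothetical reference reductions over a plain array of numbers.
// Each function receives all input values across the reduced dimensions.
const reductions = {
  L1: (v) => v.reduce((acc, x) => acc + Math.abs(x), 0),
  L2: (v) => Math.sqrt(v.reduce((acc, x) => acc + x * x, 0)),
  LogSum: (v) => Math.log(v.reduce((acc, x) => acc + x, 0)),
  LogSumExp: (v) => Math.log(v.reduce((acc, x) => acc + Math.exp(x), 0)),
  Max: (v) => v.reduce((acc, x) => Math.max(acc, x), -Infinity),
  Mean: (v) => v.reduce((acc, x) => acc + x, 0) / v.length,
  Min: (v) => v.reduce((acc, x) => Math.min(acc, x), Infinity),
  Product: (v) => v.reduce((acc, x) => acc * x, 1),
  Sum: (v) => v.reduce((acc, x) => acc + x, 0),
  SumSquare: (v) => v.reduce((acc, x) => acc + x * x, 0),
};

const values = [3, -4];
console.log(reductions.L1(values));        // 7
console.log(reductions.L2(values));        // 5
console.log(reductions.SumSquare(values)); // 25
```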
To calculate reduction output sizes, given a list of unsigned integers inputShape, an optional list of unsigned integers axes, and boolean keepDimensions, perform the following steps. They return a new list of unsigned integers, or failure.
- Let inputRank be inputShape’s size.
- If axes is not given, then let axes be the range 0 to inputRank, exclusive.
- Otherwise, if axes contains duplicate values, or if any of its items is not in the range 0 to inputRank, exclusive, then return failure.
- If keepDimensions is true, then:
- Otherwise:
- Return outputShape.
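The output size calculation can be sketched in plain JavaScript. The function name `calculateReductionOutputSizes` and the use of `null` for failure are illustrative choices. The two keepDimensions branches are not spelled out in the steps above, so they are assumptions inferred from the keepDimensions member description: reduced dimensions become size 1 when kept, and are removed otherwise.

```javascript
// Sketch of "calculate reduction output sizes". Returns null on failure.
function calculateReductionOutputSizes(inputShape, axes, keepDimensions) {
  const inputRank = inputShape.length;
  if (axes === undefined) {
    // No axes given: reduce every dimension.
    axes = Array.from({ length: inputRank }, (_, i) => i);
  } else {
    const seen = new Set();
    for (const axis of axes) {
      const inRange = Number.isInteger(axis) && axis >= 0 && axis < inputRank;
      if (!inRange || seen.has(axis)) {
        return null; // duplicate or out-of-range axis
      }
      seen.add(axis);
    }
  }
  if (keepDimensions) {
    // Assumption: reduced dimensions are kept with size 1.
    return inputShape.map((size, index) => (axes.includes(index) ? 1 : size));
  }
  // Assumption: reduced dimensions are dropped from the shape.
  return inputShape.filter((_, index) => !axes.includes(index));
}

console.log(calculateReductionOutputSizes([2, 3, 4], [1], true));  // [2, 1, 4]
console.log(calculateReductionOutputSizes([2, 3, 4], [1], false)); // [2, 4]
```

An empty axes list leaves the shape unchanged, and omitting axes with keepDimensions false yields an empty list, which corresponds to a scalar output.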
To create reduction operation given string op, MLOperand input, MLReduceOptions options, and optional list allowedDataTypes, run the following steps:
- Assert: op is one of "reduceL1", "reduceL2", "reduceLogSum", "reduceLogSumExp", "reduceMax", "reduceMean", "reduceMin", "reduceProduct", "reduceSum", "reduceSumSquare".
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- If allowedDataTypes is given and it does not contain input’s dataType, then throw a TypeError.
- Let outputShape be the result of calculating reduction output sizes given input’s shape, options.axes (if it exists), and options.keepDimensions. If that returns failure, then throw a TypeError.
- Let desc be the result of creating an MLOperandDescriptor given input’s dataType and outputShape.
- Make graph connections:
  - Let output be the result of creating an MLOperand given this and desc.
  - Let operator be an operator for the op operation, given options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
The following reduction algorithms are supported.
The reduceL1(input, options) method steps are:
The reduceL2(input, options) method steps are:
- Let output be the result of creating reduction operation given "reduceL2", input, options, and « "float32", "float16" ».
- Return output.
The reduceLogSum(input, options) method steps are:
- Let output be the result of creating reduction operation given "reduceLogSum", input, options, and « "float32", "float16" ».
- Return output.
The reduceLogSumExp(input, options) method steps are:
- Let output be the result of creating reduction operation given "reduceLogSumExp", input, options, and « "float32", "float16" ».
- Return output.
The reduceMax(input, options) method steps are:
- Let output be the result of creating reduction operation given "reduceMax", input and options.
- Return output.
The reduceMean(input, options) method steps are:
- Let output be the result of creating reduction operation given "reduceMean", input, options, and « "float32", "float16" ».
- Return output.
The reduceMin(input, options) method steps are:
- Let output be the result of creating reduction operation given "reduceMin", input and options.
- Return output.
The reduceProduct(input, options) method steps are:
The reduceSum(input, options) method steps are:
The behavior of several reduction operations can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function reduceLogSum(builder, input, options) {
  return builder.log(builder.reduceSum(input, options));
}

function reduceLogSumExp(builder, input, options) {
  return builder.log(builder.reduceSum(builder.exp(input), options));
}

function reduceSumSquare(builder, input, options) {
  // pow() takes an operand exponent, so wrap the scalar 2 in a constant.
  return builder.reduceSum(
      builder.pow(input, builder.constant(input.dataType, 2)), options);
}
Some underlying platforms do not support keepDimensions directly. This does not affect the underlying tensor data, only the shape. For example, if the input shape is [2, 3, 4], the axis is 1, and keepDimensions is true, the expected output shape is [2, 1, 4]. If the underlying platform never keeps reduced dimensions it will produce an output shape of [2, 4]. The implementation can introduce a no-op reshape to [2, 1, 4]. A similar no-op reshape can be introduced if keepDimensions is false but the underlying platform always keeps reduced dimensions.
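The no-op reshape described above can be sketched in the same style as the other emulation snippets. The helper name `reduceSumKeepingDimensions` is hypothetical, and the sketch assumes a platform whose reduction always drops reduced dimensions, modelled here by calling reduceSum() with keepDimensions set to false and then reshaping.

```javascript
// Sketch: emulate keepDimensions = true on top of a reduction that drops
// the reduced dimensions, by reshaping the result back to the kept shape.
function reduceSumKeepingDimensions(builder, input, axes) {
  const reduced = builder.reduceSum(input, { axes, keepDimensions: false });
  // Reduced dimensions get size 1; all other dimensions keep their size.
  const keptShape = input.shape.map((size, i) => (axes.includes(i) ? 1 : size));
  return builder.reshape(reduced, keptShape);
}
```

For an input of shape [2, 3, 4] and axes [1], the intermediate result has shape [2, 4] and the reshape restores [2, 1, 4] without touching the tensor data.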
7.9.40. relu
Compute the rectified linear function of the input tensor.
partial interface MLGraphBuilder {
  MLOperand relu(MLOperand input, optional MLOperatorOptions options = {});
};

partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits relu;
};
- input: an MLOperand. The input tensor.
- options: an MLOperatorOptions. Specifies the optional parameters of the operation.
Returns:
operand | allowed data types | allowed ranks |
---|---|---|
input | "float32", "float16", "int64", "int32", "int8" | N |
output | same as input | same as input |
MLOpSupportLimits has the following member for relu():
- relu, of type MLSingleInputSupportLimits
  Support limits for operator relu().
The relu(input, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- If input’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- Make graph connections:
  - Let output be the result of copying an MLOperand given input.
  - Let operator be an operator for the "relu" operation, given options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
The behavior of this operation can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function relu(builder, input) {
  return builder.max(builder.constant(input.dataType, 0), input);
}
7.9.41. resample2d
Resample the tensor values from the source to the destination dimensions according to the axes and scaling factors.
enum MLInterpolationMode {
  "nearest-neighbor",
  "linear"
};

dictionary MLResample2dOptions : MLOperatorOptions {
  MLInterpolationMode mode = "nearest-neighbor";
  sequence<float> scales;
  sequence<[EnforceRange] unsigned long> sizes;
  sequence<[EnforceRange] unsigned long> axes;
};

partial interface MLGraphBuilder {
  MLOperand resample2d(MLOperand input, optional MLResample2dOptions options = {});
};

partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits resample2d;
};
- input: an MLOperand. The input 4-D tensor.
- options: an optional MLResample2dOptions. The optional parameters of the operation.
Returns: an MLOperand. The output 4-D tensor.
MLResample2dOptions has the following members:
- mode, of type MLInterpolationMode, defaulting to "nearest-neighbor"
  The interpolation algorithm used to fill the output tensor values.
  Both algorithms start with these inputs, computed for each spatial axis (based on axes), where inputSize is given by the input tensor’s shape, outputSize is given by sizes or scales, and outputCoordinate identifies the element in the output tensor being computed.
    scale = outputSize / inputSize
    unclampedCoordinate = (outputCoordinate + 0.5) / scale - 0.5
    inputCoordinate = clamp(unclampedCoordinate, 0, inputSize - 1)
  For a given outputCoordinate.x and outputCoordinate.y location in the output tensor, the above equations give a rational inputCoordinate.x and inputCoordinate.y.
  - nearest-neighbor
    The inputCoordinate.x and inputCoordinate.y computed above are used as inputs to a nearest-neighbor sampling algorithm to compute the output tensor value as follows:
      x = ceil(inputCoordinate.x - 0.5)
      y = ceil(inputCoordinate.y - 0.5)
      output tensor value = input tensor value at (x, y)
  - linear
    The inputCoordinate.x and inputCoordinate.y computed above are used as inputs to a bilinear sampling algorithm to compute the output tensor value as follows:
      x0 = floor(inputCoordinate.x)
      x1 = ceil(inputCoordinate.x)
      y0 = floor(inputCoordinate.y)
      y1 = ceil(inputCoordinate.y)
      vx0y0 = input tensor value at (x0, y0)
      vx1y0 = input tensor value at (x1, y0)
      vx0y1 = input tensor value at (x0, y1)
      vx1y1 = input tensor value at (x1, y1)
      tx = inputCoordinate.x - x0
      ty = inputCoordinate.y - y0
      vy0 = vx0y0 * (1 - tx) + vx1y0 * tx
      vy1 = vx0y1 * (1 - tx) + vx1y1 * tx
      output tensor value = vy0 * (1 - ty) + vy1 * ty
- scales, of type sequence<float>
  A list of length 2. Specifies the scaling factor for each input dimension from axes: [scaleForFirstAxis, scaleForSecondAxis]. The default value is [1.0, 1.0].
- sizes, of type sequence<[EnforceRange] unsigned long>
  A list of length 2. Specifies the target sizes for each input dimension from axes: [sizeForFirstAxis, sizeForSecondAxis]. When sizes is specified, scales is ignored, since the scaling factor values are derived from the target sizes of the input.
- axes, of type sequence<[EnforceRange] unsigned long>
  A list of length 2. Specifies the two dimensions of the input tensor to which the interpolation algorithm applies. The default value is [2, 3].
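The coordinate mapping and the nearest-neighbor rule can be sketched for a single axis in plain JavaScript. The helper names `inputCoordinateFor` and `nearestNeighborIndex` are hypothetical and exist only for this illustration; the arithmetic follows the equations given for mode.

```javascript
// Map an output coordinate on one axis to a (possibly fractional)
// input coordinate, clamped to the valid input range.
function inputCoordinateFor(outputCoordinate, inputSize, outputSize) {
  const scale = outputSize / inputSize;
  const unclamped = (outputCoordinate + 0.5) / scale - 0.5;
  return Math.min(Math.max(unclamped, 0), inputSize - 1);
}

// Nearest-neighbor sampling picks ceil(inputCoordinate - 0.5).
function nearestNeighborIndex(outputCoordinate, inputSize, outputSize) {
  return Math.ceil(inputCoordinateFor(outputCoordinate, inputSize, outputSize) - 0.5);
}

// Upsampling an axis of size 4 to size 8: each input element is used twice.
const picks = [];
for (let o = 0; o < 8; o++) {
  picks.push(nearestNeighborIndex(o, 4, 8));
}
console.log(JSON.stringify(picks)); // [0,0,1,1,2,2,3,3]
```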
operand | allowed data types | allowed ranks |
---|---|---|
input | "float32", "float16" | 4 |
output | same as input | 4 |
MLOpSupportLimits has the following member for resample2d():
- resample2d, of type MLSingleInputSupportLimits
  Support limits for operator resample2d().
The resample2d(input, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- If input’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- If input’s rank is not its allowed rank, then throw a TypeError.
- If options.scales does not exist, then set it to the list « 1.0, 1.0 ».
- Otherwise, if any of its items is less than or equal to 0, or if its size is not 2, then throw a TypeError.
- If options.sizes exists, and if its size is not 2, or if any of its items is 0, then throw a TypeError.
- If options.axes does not exist, then set it to the list « 2, 3 ».
- Otherwise, if options.axes contains duplicate values, or if any of its items is not in the range 0 to input’s rank, exclusive, then throw a TypeError.
- Calculate the output shape:
  - Let inputDescriptor be input.[[descriptor]].
  - For each index in the range 0 to options.axes’s size, exclusive:
  - Let desc be the result of creating an MLOperandDescriptor given inputDescriptor.dataType and outputShape.
- Make graph connections:
  - Let output be the result of creating an MLOperand given this and desc.
  - Let operator be an operator for the "resample2d" operation, given options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
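The output shape calculation can be sketched as follows. The per-axis sub-steps are not shown in the steps above, so this sketch makes assumptions: outputShape starts as a copy of the input shape, sizes takes precedence over scales (as stated for the sizes member), and a scaled size is rounded down with floor. The function name `resample2dOutputShape` is hypothetical, and validation of the resulting dimensions is omitted.

```javascript
// Sketch of the resample2d output shape, under the assumptions stated above.
function resample2dOutputShape(inputShape, options = {}) {
  const { scales = [1.0, 1.0], sizes, axes = [2, 3] } = options;
  const outputShape = [...inputShape];
  for (let index = 0; index < axes.length; index++) {
    const axis = axes[index];
    // Assumption: explicit sizes win; otherwise scale and round down.
    outputShape[axis] = sizes !== undefined
      ? sizes[index]
      : Math.floor(inputShape[axis] * scales[index]);
  }
  return outputShape;
}

console.log(resample2dOutputShape([1, 1, 4, 4], { sizes: [8, 8] }));                  // [1, 1, 8, 8]
console.log(resample2dOutputShape([1, 4, 4, 3], { scales: [2, 2], axes: [1, 2] }));   // [1, 8, 8, 3]
```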
Example: linear resampling from the following [4, 4] input tensor (considering only spatial dimensions):
[  0   1   2   3 ]
[  0   1   2   3 ]
[ 12  13  14  15 ]
[ 12  13  14  15 ]
For an [8, 8] output tensor, the expected values are:
[  0   0.25   0.75   1.25   1.75   2.25   2.75   3 ]
[  0   0.25   0.75   1.25   1.75   2.25   2.75   3 ]
[  0   0.25   0.75   1.25   1.75   2.25   2.75   3 ]
[  3   3.25   3.75   4.25   4.75   5.25   5.75   6 ]
[  9   9.25   9.75  10.25  10.75  11.25  11.75  12 ]
[ 12  12.25  12.75  13.25  13.75  14.25  14.75  15 ]
[ 12  12.25  12.75  13.25  13.75  14.25  14.75  15 ]
[ 12  12.25  12.75  13.25  13.75  14.25  14.75  15 ]
This has the convenient properties that the sampling is evenly distributed, symmetric, robust to image mirroring, and the corner values are aligned.
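The linear example can be reproduced with a short plain JavaScript sketch that applies the bilinear equations to a 2-D nested array. The function name `resampleLinear2d` is hypothetical, and the sketch covers only the two spatial dimensions rather than a full 4-D tensor.

```javascript
// Bilinear resampling of a 2-D array (rows of numbers) to a new size,
// following the coordinate and interpolation equations for mode "linear".
function resampleLinear2d(input, outputHeight, outputWidth) {
  const inputHeight = input.length;
  const inputWidth = input[0].length;
  const coordinate = (out, inSize, outSize) => {
    const scale = outSize / inSize;
    const unclamped = (out + 0.5) / scale - 0.5;
    return Math.min(Math.max(unclamped, 0), inSize - 1);
  };
  const output = [];
  for (let oy = 0; oy < outputHeight; oy++) {
    const y = coordinate(oy, inputHeight, outputHeight);
    const y0 = Math.floor(y);
    const y1 = Math.ceil(y);
    const ty = y - y0;
    const row = [];
    for (let ox = 0; ox < outputWidth; ox++) {
      const x = coordinate(ox, inputWidth, outputWidth);
      const x0 = Math.floor(x);
      const x1 = Math.ceil(x);
      const tx = x - x0;
      // Interpolate along x on both rows, then along y between them.
      const vy0 = input[y0][x0] * (1 - tx) + input[y0][x1] * tx;
      const vy1 = input[y1][x0] * (1 - tx) + input[y1][x1] * tx;
      row.push(vy0 * (1 - ty) + vy1 * ty);
    }
    output.push(row);
  }
  return output;
}

const resampled = resampleLinear2d(
    [[0, 1, 2, 3], [0, 1, 2, 3], [12, 13, 14, 15], [12, 13, 14, 15]], 8, 8);
console.log(resampled[3]); // [3, 3.25, 3.75, 4.25, 4.75, 5.25, 5.75, 6]
```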
7.9.42. reshape
Alter the shape of a tensor to a new shape. Reshape does not copy or change the content of the tensor. It just changes the tensor’s logical shape for the subsequent operations.
partial interface MLGraphBuilder {
  MLOperand reshape(MLOperand input, sequence<[EnforceRange] unsigned long> newShape, optional MLOperatorOptions options = {});
};

partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits reshape;
};
- input: an MLOperand. The input tensor.
- newShape: sequence<unsigned long>. The shape of the output tensor. The number of elements implied by newShape must be the same as the number of elements in the input tensor.
- options: an MLOperatorOptions. Specifies the optional parameters of the operation.
Returns: an MLOperand. The output tensor. The values of the output tensor are the same as values of the input tensor. The shape of the output tensor is specified by newShape.
operand | allowed data types | allowed ranks |
---|---|---|
input | any | N |
output | same as input | newShape’s size |
MLOpSupportLimits has the following member for reshape():
- reshape, of type MLSingleInputSupportLimits
  Support limits for operator reshape().
The reshape(input, newShape, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- Let outputShape be an empty array of unsigned long.
- If newShape’s size is 0, then set outputShape to an empty list for a scalar.
- If any item in newShape is not a valid dimension, then throw a TypeError.
- Let inputElementCount be the product of all items in input’s shape. Empty dimensions yield an inputElementCount of 1.
- If the product of all values in newShape is not equal to inputElementCount, then throw a TypeError.
- Let desc be a copy of input.[[descriptor]].
- Set desc.shape to newShape.
- Make graph connections:
  - Let output be the result of creating an MLOperand given this and desc.
  - Let operator be an operator for the "reshape" operation, given options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
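The element count check in these steps can be sketched in plain JavaScript. The function name `validateReshape` is hypothetical, and the check that each item is a valid dimension is omitted for brevity.

```javascript
// Sketch of the reshape element-count validation. An empty shape (a scalar)
// has an element count of 1 because the product over no items is 1.
function validateReshape(inputShape, newShape) {
  const elementCount = (shape) => shape.reduce((product, size) => product * size, 1);
  if (elementCount(newShape) !== elementCount(inputShape)) {
    throw new TypeError("newShape must describe the same number of elements as the input");
  }
  return [...newShape];
}

console.log(validateReshape([2, 3, 4], [6, 4])); // [6, 4]
console.log(validateReshape([1], []));           // [] (reshape to a scalar)
```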
7.9.43. reverse
Reverse a tensor along the given axes.
dictionary MLReverseOptions : MLOperatorOptions {
  sequence<[EnforceRange] unsigned long> axes;
};

partial interface MLGraphBuilder {
  MLOperand reverse(MLOperand input, optional MLReverseOptions options = {});
};

partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits reverse;
};
MLReverseOptions has the following members:
- axes, of type sequence<[EnforceRange] unsigned long>
  The indices to the input dimensions to reverse. When this member is not present, it is treated as if all dimensions are reversed. If explicitly passed as empty, no dimensions are reversed.
Arguments:
- input: an MLOperand. The input tensor.
- options: an optional MLReverseOptions. Specifies the optional parameters of the operation.
Returns:
operand | allowed data types | allowed ranks |
---|---|---|
input | any | N |
output | same as input | same as input |
MLOpSupportLimits has the following member for reverse():
- reverse, of type MLSingleInputSupportLimits
  Support limits for operator reverse().
The reverse(input, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- If input’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- Let inputRank be input’s rank.
- If options.axes does not exist, then let axes be the range 0 to inputRank, exclusive.
- Otherwise, let axes be options.axes; if axes contains duplicate values, or if any of its items is not in the range 0 to inputRank, exclusive, then throw a TypeError.
- Make graph connections:
  - Let output be the result of copying an MLOperand given input.
  - Let operator be an operator for the "reverse" operation, given options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
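The effect of reverse() can be illustrated on a flat, row-major array in plain JavaScript. The function `reverseData` is a hypothetical reference sketch, not part of the API; it assumes row-major element order and performs no validation of axes.

```javascript
// Reverse a row-major tensor (flat data + shape) along the given axes.
// When axes is undefined, every dimension is reversed.
function reverseData(data, shape, axes) {
  const rank = shape.length;
  if (axes === undefined) {
    axes = shape.map((_, i) => i);
  }
  // Row-major strides: the last dimension is contiguous.
  const strides = new Array(rank).fill(1);
  for (let i = rank - 2; i >= 0; i--) {
    strides[i] = strides[i + 1] * shape[i + 1];
  }
  const output = new Array(data.length);
  for (let flat = 0; flat < data.length; flat++) {
    let remainder = flat;
    let source = 0;
    for (let d = 0; d < rank; d++) {
      const coordinate = Math.floor(remainder / strides[d]);
      remainder -= coordinate * strides[d];
      // Mirror the coordinate on reversed axes only.
      const sourceCoordinate = axes.includes(d) ? shape[d] - 1 - coordinate : coordinate;
      source += sourceCoordinate * strides[d];
    }
    output[flat] = data[source];
  }
  return output;
}

// [[1, 2, 3], [4, 5, 6]] reversed along axis 1 is [[3, 2, 1], [6, 5, 4]].
console.log(reverseData([1, 2, 3, 4, 5, 6], [2, 3], [1])); // [3, 2, 1, 6, 5, 4]
```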
7.9.44. scatterElements
Scatter values from the updates tensor atop a copy of the input tensor along an axis according to the indices.
dictionary MLScatterOptions : MLOperatorOptions {
  [EnforceRange] unsigned long axis = 0;
};

partial interface MLGraphBuilder {
  MLOperand scatterElements(MLOperand input, MLOperand indices, MLOperand updates, optional MLScatterOptions options = {});
};

dictionary MLScatterSupportLimits {
  MLTensorLimits input;
  MLTensorLimits indices;
  MLTensorLimits updates;
  MLDataTypeLimits output;
};

partial dictionary MLOpSupportLimits {
  MLScatterSupportLimits scatterElements;
};
MLScatterOptions has the following members:
- axis, of type unsigned long, defaulting to 0
  The axis along which the scattered values are obtained. Its value must be in the range [0, N-1] where N is the rank of the input tensor.
Arguments:
- input: an MLOperand. The input N-D tensor to initialize the output with.
- indices: an MLOperand. The indices N-D tensor of the input values to scatter over. The values must be of type "int32", "uint32", or "int64", and must be in the range -N (inclusive) to N (exclusive) where N is the size of the input dimension indexed by options.axis, and a negative index means indexing from the end of the dimension.
- updates: an MLOperand. New values to replace atop the input, with the same shape as the indices.
- options: an optional MLScatterOptions. The optional parameters of the operation.
Returns: an MLOperand. The output N-D tensor of rank equal to input’s rank.
operand | allowed data types | allowed ranks |
---|---|---|
input | any | 1 to N |
indices | "int32", "uint32", "int64" | same as input |
updates | same as input | same as input |
output | same as input | same as input |
MLScatterSupportLimits has the following members:
- input, of type MLTensorLimits
  MLTensorLimits for input operand.
- indices, of type MLTensorLimits
  MLTensorLimits for indices operand.
- updates, of type MLTensorLimits
  MLTensorLimits for updates operand.
- output, of type MLDataTypeLimits
  MLDataTypeLimits for output operand.
MLOpSupportLimits has the following members for scatterElements():
- scatterElements, of type MLScatterSupportLimits
  Support limits for operator scatterElements().
The indices parameter to scatterElements() cannot be clamped to the allowed range when the graph is built because the inputs are not known until execution. Implementations can introduce clamp() in the compiled graph if the specified clamping behavior is not provided by the underlying platform. Similarly, if the underlying platform does not support negative indices, the implementation can introduce operations in the compiled graph to transform a negative index from the end of the dimension into a positive index.
The scatterElements(input, indices, updates, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and any of input, indices, and updates returns false, then throw a TypeError.
- If indices’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- If updates’s dataType is not equal to input’s dataType, then throw a TypeError.
- If the rank of any of input, indices, or updates is not its allowed rank, then throw a TypeError.
- Let axis be options.axis.
- If axis is greater than or equal to input’s rank, then throw a TypeError.
- Let indicesShapeExpected be a copy of input’s shape.
- Set indicesShapeExpected[axis] to indices’s shape[axis].
- If indices’s shape is not equal to indicesShapeExpected, then throw a TypeError.
- If updates’s shape is not equal to indices’s shape, then throw a TypeError.
- Make graph connections:
  - Let output be the result of copying an MLOperand given input.
  - Let operator be an operator for the "scatterElements" operation, given input, indices, updates, and options.
  - Set output.[[operator]] to operator.
  - Set operator’s inputs to input, indices, and updates.
  - Set operator’s output to output.
- Return output.
Examples of how scatterElements works in different slicing schemes.
// input of shape [4,3]:
//   [[ 0,  1,  2],
//    [10, 11, 12],
//    [20, 21, 22],
//    [30, 31, 32]]
// indices of shape [2,3]:
//   [[3, 1, 1],
//    [2, 0, 3]]
// updates of shape [2,3]:
//   [[-1, -2, -3],
//    [-4, -5, -6]]
// axis = 0 (default)
// output of shape [4,3]:
//   [[ 0, -5,  2],
//    [10, -2, -3],
//    [-4, 21, 22],
//    [-1, 31, -6]]
const input1 = builder.constant(
    {dataType: 'float32', shape: [4, 3]},
    new Float32Array([0, 1, 2, 10, 11, 12, 20, 21, 22, 30, 31, 32]));
const indices1 = builder.constant(
    {dataType: 'uint32', shape: [2, 3]},
    new Uint32Array([3, 1, 1, 2, 0, 3]));
// updates must share input's data type, so use Float32Array here.
const updates1 = builder.constant(
    {dataType: 'float32', shape: [2, 3]},
    new Float32Array([-1, -2, -3, -4, -5, -6]));
const output1 = builder.scatterElements(input1, indices1, updates1);

// input of shape [4,3]:
//   [[ 0,  1,  2],
//    [10, 11, 12],
//    [20, 21, 22],
//    [30, 31, 32]]
// indices of shape [4,1]:
//   [[2],
//    [1],
//    [0],
//    [2]]
// updates of shape [4,1]:
//   [[-1],
//    [-2],
//    [-3],
//    [-4]]
// axis = 1
// output of shape [4,3]:
//   [[ 0,  1, -1],
//    [10, -2, 12],
//    [-3, 21, 22],
//    [30, 31, -4]]
const indices2 = builder.constant(
    {dataType: 'uint32', shape: [4, 1]},
    new Uint32Array([2, 1, 0, 2]));
const updates2 = builder.constant(
    {dataType: 'float32', shape: [4, 1]},
    new Float32Array([-1, -2, -3, -4]));
const output2 = builder.scatterElements(input1, indices2, updates2, {axis: 1});

// input of shape [4,2,2]:
//   [[[  0,   1],
//     [ 10,  11]],
//    [[100, 101],
//     [110, 111]],
//    [[200, 201],
//     [210, 211]],
//    [[300, 301],
//     [310, 311]]]
// indices of shape [1,2,2]:
//   [[[0, 2],
//     [1, 3]]]
// updates of shape [1,2,2]:
//   [[[-1, -2],
//     [-3, -4]]]
// axis = 0
// output of shape [4,2,2]:
//   [[[ -1,   1],
//     [ 10,  11]],
//    [[100, 101],
//     [ -3, 111]],
//    [[200,  -2],
//     [210, 211]],
//    [[300, 301],
//     [310,  -4]]]
const inputData3 = new Float32Array(
    [0, 1, 10, 11, 100, 101, 110, 111, 200, 201, 210, 211, 300, 301, 310, 311]);
const input3 = builder.constant({dataType: 'float32', shape: [4, 2, 2]}, inputData3);
const indices3 = builder.constant(
    {dataType: 'uint32', shape: [1, 2, 2]},
    new Uint32Array([0, 2, 1, 3]));
const updates3 = builder.constant(
    {dataType: 'float32', shape: [1, 2, 2]},
    new Float32Array([-1, -2, -3, -4]));
const output3 = builder.scatterElements(input3, indices3, updates3, {axis: 0});
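The examples can be checked against a small reference sketch that operates on flat, row-major arrays rather than MLOperand objects. The function `scatterElementsReference` is hypothetical. It assumes in-range indices (negative values count from the end of the dimension), and when the same output location is written twice the later write wins, which is an assumption of this sketch rather than behavior stated above.

```javascript
// Reference scatterElements over flat row-major arrays.
function scatterElementsReference(input, inputShape, indices, indicesShape, updates, axis = 0) {
  const stridesOf = (shape) => {
    const strides = new Array(shape.length).fill(1);
    for (let i = shape.length - 2; i >= 0; i--) {
      strides[i] = strides[i + 1] * shape[i + 1];
    }
    return strides;
  };
  const inputStrides = stridesOf(inputShape);
  const indicesStrides = stridesOf(indicesShape);
  const output = Array.from(input); // start from a copy of the input
  for (let flat = 0; flat < indices.length; flat++) {
    let remainder = flat;
    let target = 0;
    for (let d = 0; d < inputShape.length; d++) {
      let coordinate = Math.floor(remainder / indicesStrides[d]);
      remainder -= coordinate * indicesStrides[d];
      if (d === axis) {
        // Along the scatter axis the coordinate comes from the indices tensor.
        coordinate = indices[flat];
        if (coordinate < 0) coordinate += inputShape[axis];
      }
      target += coordinate * inputStrides[d];
    }
    output[target] = updates[flat];
  }
  return output;
}

const scattered = scatterElementsReference(
    [0, 1, 2, 10, 11, 12, 20, 21, 22, 30, 31, 32], [4, 3],
    [3, 1, 1, 2, 0, 3], [2, 3],
    [-1, -2, -3, -4, -5, -6]);
console.log(scattered); // [0, -5, 2, 10, -2, -3, -4, 21, 22, -1, 31, -6]
```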
7.9.45. scatterND
Scatter slices of values from the updates tensor atop a copy of the input tensor according to the indices.
partial interface MLGraphBuilder {
  MLOperand scatterND(MLOperand input, MLOperand indices, MLOperand updates, optional MLOperatorOptions options = {});
};

partial dictionary MLOpSupportLimits {
  MLScatterSupportLimits scatterND;
};
- input: an MLOperand. The input N-D tensor to initialize the output with.
- indices: an MLOperand. The indices array contains entire coordinates into the output tensor, with the rightmost dimension holding the number of dimensions per coordinate. So an indices tensor of shape [10,1] holds 10 single-axis indices, and a shape of [4,3] holds 4 indices of 3D coordinates. The values must be of type "int32", "uint32", or "int64", and each must be in the range -N (inclusive) to N (exclusive) where N is the size of the corresponding output dimension, and a negative index means indexing from the end of the corresponding dimension.
- updates: an MLOperand. New values to replace atop the input.
- options: an optional MLOperatorOptions. The optional parameters of the operation.
Returns: an MLOperand. The output N-D tensor of rank equal to input’s rank.
operand | allowed data types | allowed ranks |
---|---|---|
input | any | 1 to N |
indices | "int32", "uint32", "int64" | 1 to N |
updates | same as input | input’s rank + indices’s rank - indices’s shape[-1] - 1 |
output | same as input | 1 to N |
MLOpSupportLimits has the following members for scatterND():
- scatterND, of type MLScatterSupportLimits
  Support limits for operator scatterND().
The indices parameter to scatterND() cannot be clamped to the allowed range when the graph is built because the inputs are not known until execution. Implementations can introduce clamp() in the compiled graph if the specified clamping behavior is not provided by the underlying platform. Similarly, if the underlying platform does not support negative indices, the implementation can introduce operations in the compiled graph to transform a negative index from the end of the dimension into a positive index.
The scatterND(input, indices, updates, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and any of input, indices, and updates returns false, then throw a TypeError.
- If indices’s dataType is not one of the allowed data types (according to this table), then throw a TypeError.
- If updates’s dataType is not equal to input’s dataType, then throw a TypeError.
- If the rank of any of input, indices, or updates is not its allowed rank, then throw a TypeError.
- Let inputShape be input’s shape and inputRank be input’s rank.
- Let indicesShape be indices’s shape and indicesRank be indices’s rank.
- Let indexableSize be indicesRank - 1.
- Let coordinateSize be indicesShape[indexableSize].
- If coordinateSize is greater than inputRank, then throw a TypeError.
- Let expectedUpdatesShape be an empty list.
- For each index in the range 0 to indexableSize, exclusive:
  - Append indicesShape[index] to expectedUpdatesShape.
- For each index in the range coordinateSize to inputRank, exclusive:
  - Append inputShape[index] to expectedUpdatesShape.
- If updates’s shape is not equal to expectedUpdatesShape, then throw a TypeError.
- Let outputShape be a copy of input’s shape.
- Let outputDesc be the result of creating an MLOperandDescriptor given input’s dataType and outputShape.
- Make graph connections:
  - Let output be the result of creating an MLOperand given this and outputDesc.
  - Let operator be an operator for the "scatterND" operation, given input, indices, updates, and options.
  - Set output.[[operator]] to operator.
  - Set operator’s inputs to input, indices, and updates.
  - Set operator’s output to output.
- Return output.
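The expected shape of updates follows directly from the steps above and can be sketched in plain JavaScript. The function name `expectedUpdatesShape` mirrors the variable in the steps but is otherwise a hypothetical helper.

```javascript
// Leading dimensions come from indices (all but its last dimension);
// trailing dimensions come from the part of input not addressed by a coordinate.
function expectedUpdatesShape(inputShape, indicesShape) {
  const indexableSize = indicesShape.length - 1;
  const coordinateSize = indicesShape[indexableSize];
  if (coordinateSize > inputShape.length) {
    throw new TypeError("coordinate size exceeds input rank");
  }
  return [
    ...indicesShape.slice(0, indexableSize),
    ...inputShape.slice(coordinateSize),
  ];
}

console.log(expectedUpdatesShape([3, 2], [2, 1]));    // [2, 2]
console.log(expectedUpdatesShape([2, 2, 2], [2, 2])); // [2, 2]
```

The resulting rank is input’s rank + indices’s rank - indices’s shape[-1] - 1, which matches the allowed rank listed for updates.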
Examples of how scatterND works in different slicing schemes.
// input of shape [8]:
//   [0, 1, 2, 3, 4, 5, 6, 7]
// indices of shape [4, 1]:
//   [[4],
//    [3],
//    [1],
//    [7]]
// updates of shape [4]:
//   [-1, -2, -3, -4]
// output of shape [8]:
//   [0, -3, 2, -2, -1, 5, 6, -4]
const input1 = builder.constant(
    {dataType: 'float32', shape: [8]},
    new Float32Array([0, 1, 2, 3, 4, 5, 6, 7]));
const indices1 = builder.constant(
    {dataType: 'uint32', shape: [4, 1]},
    new Uint32Array([4, 3, 1, 7]));
// updates must share input's data type, so use float32 throughout.
const updates1 = builder.constant(
    {dataType: 'float32', shape: [4]},
    new Float32Array([-1, -2, -3, -4]));
const output1 = builder.scatterND(input1, indices1, updates1);

// input of shape [2,2]:
//   [[0, 1],
//    [2, 3]]
// indices of shape [2,2]:
//   [[0, 0],
//    [1, 1]]
// updates of shape [2]:
//   [-1, -2]
// output of shape [2,2]:
//   [[-1,  1],  <= -1 written to output coordinate [0, 0]
//    [ 2, -2]]  <= -2 written to output coordinate [1, 1]
const input2 = builder.constant(
    {dataType: 'float32', shape: [2, 2]},
    new Float32Array([0, 1, 2, 3]));
const indices2 = builder.constant(
    {dataType: 'uint32', shape: [2, 2]},
    new Uint32Array([0, 0, 1, 1]));
const updates2 = builder.constant(
    {dataType: 'float32', shape: [2]},
    new Float32Array([-1, -2]));
const output2 = builder.scatterND(input2, indices2, updates2);

// input of shape [3,2]:
//   [[0, 1],
//    [2, 3],
//    [4, 5]]
// indices of shape [2,1]:
//   [[2],
//    [0]]
// updates of shape [2,2]:
//   [[-1, -2],
//    [-3, -4]]
// output of shape [3,2]:
//   [[-3, -4],  <= [-3, -4] written to output coordinates [0, *]
//    [ 2,  3],
//    [-1, -2]]  <= [-1, -2] written to output coordinates [2, *]
const input3 = builder.constant(
    {dataType: 'float32', shape: [3, 2]},
    new Float32Array([0, 1, 2, 3, 4, 5]));
const indices3 = builder.constant(
    {dataType: 'uint32', shape: [2, 1]},
    new Uint32Array([2, 0]));
const updates3 = builder.constant(
    {dataType: 'float32', shape: [2, 2]},
    new Float32Array([-1, -2, -3, -4]));
const output3 = builder.scatterND(input3, indices3, updates3);

// input of shape [2,2,2]:
//   [[[0, 1],
//     [2, 3]],
//    [[4, 5],
//     [6, 7]]]
// indices of shape [2,2]:
//   [[0, 1],
//    [1, 0]]
// updates of shape [2,2]:
//   [[-1, -2],
//    [-3, -4]]
// output of shape [2,2,2]:
//   [[[ 0,  1],
//     [-1, -2]],  <= [-1, -2] written to output coordinates [0, 1, *]
//    [[-3, -4],   <= [-3, -4] written to output coordinates [1, 0, *]
//     [ 6,  7]]]
const input4 = builder.constant(
    {dataType: 'float32', shape: [2, 2, 2]},
    new Float32Array([0, 1, 2, 3, 4, 5, 6, 7]));
const indices4 = builder.constant(
    {dataType: 'uint32', shape: [2, 2]},
    new Uint32Array([0, 1, 1, 0]));
const updates4 = builder.constant(
    {dataType: 'float32', shape: [2, 2]},
    new Float32Array([-1, -2, -3, -4]));
const output4 = builder.scatterND(input4, indices4, updates4);
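The examples can be checked with a small reference sketch over flat, row-major arrays. The function `scatterNDReference` is hypothetical and is not part of the API. It assumes in-range coordinates (negative values count from the end of the dimension) and lets later writes overwrite earlier ones, which is an assumption of this sketch.

```javascript
// Reference scatterND over flat row-major arrays.
function scatterNDReference(input, inputShape, indices, indicesShape, updates) {
  const coordinateSize = indicesShape[indicesShape.length - 1];
  const strides = new Array(inputShape.length).fill(1);
  for (let i = inputShape.length - 2; i >= 0; i--) {
    strides[i] = strides[i + 1] * inputShape[i + 1];
  }
  // Number of contiguous elements written for each coordinate.
  const sliceSize = inputShape.slice(coordinateSize).reduce((p, s) => p * s, 1);
  const coordinateCount = indices.length / coordinateSize;
  const output = Array.from(input); // start from a copy of the input
  for (let c = 0; c < coordinateCount; c++) {
    let offset = 0;
    for (let d = 0; d < coordinateSize; d++) {
      let coordinate = indices[c * coordinateSize + d];
      if (coordinate < 0) coordinate += inputShape[d];
      offset += coordinate * strides[d];
    }
    for (let k = 0; k < sliceSize; k++) {
      output[offset + k] = updates[c * sliceSize + k];
    }
  }
  return output;
}

const scatteredRows = scatterNDReference(
    [0, 1, 2, 3, 4, 5], [3, 2], [2, 0], [2, 1], [-1, -2, -3, -4]);
console.log(scatteredRows); // [-3, -4, 2, 3, -1, -2]
```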
7.9.46. sigmoid
Compute the sigmoid function of the input tensor. The calculation follows the expression 1 / (exp(-x) + 1).
partial interface MLGraphBuilder {
  MLOperand sigmoid(MLOperand input, optional MLOperatorOptions options = {});
};
partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits sigmoid;
};
Arguments:
- input: an MLOperand. The input tensor.
- options: an MLOperatorOptions. Specifies the optional parameters of the operation.
Returns: an MLOperand. The output tensor of the same shape as input.
operand | allowed data types | allowed ranks |
---|---|---|
input | "float32", "float16" | N |
output | same as input | same as input |
MLOpSupportLimits has the following member for sigmoid():
- sigmoid, of type MLSingleInputSupportLimits
  Support limits for operator sigmoid().
The sigmoid(input, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- If input’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- Make graph connections:
  - Let output be the result of copying an MLOperand given input.
  - Let operator be an operator for the "sigmoid" operation, given options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
The behavior of this operation can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function sigmoid(builder, input) {
  return builder.div(
    builder.constant(input.dataType, 1),
    builder.add(
      builder.exp(builder.neg(input)),
      builder.constant(input.dataType, 1)));
}
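As a quick sanity check of the expression 1 / (exp(-x) + 1), the following sketch evaluates it on plain JavaScript numbers. The helper name `sigmoidScalar` is hypothetical and is not part of the WebNN API; it only mirrors the formula on scalars rather than tensors.

```javascript
// Scalar reference for the expression 1 / (exp(-x) + 1).
// `sigmoidScalar` is a hypothetical helper, not a WebNN method.
function sigmoidScalar(x) {
  return 1 / (Math.exp(-x) + 1);
}

console.log(sigmoidScalar(0));                      // 0.5
console.log(sigmoidScalar(2) + sigmoidScalar(-2));  // approximately 1
```

The second line illustrates the symmetry sigmoid(x) + sigmoid(-x) = 1, which can be a convenient property to test an implementation against.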
7.9.47. slice
Produce a slice of the input tensor.
dictionary MLSliceOptions : MLOperatorOptions {
  sequence<[EnforceRange] unsigned long> strides;
};
partial interface MLGraphBuilder {
  MLOperand slice(MLOperand input,
                  sequence<[EnforceRange] unsigned long> starts,
                  sequence<[EnforceRange] unsigned long> sizes,
                  optional MLSliceOptions options = {});
};
partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits slice;
};
MLSliceOptions has the following members:
- strides, of type sequence<[EnforceRange] unsigned long>
  The stride to step over each input along each axis. The length of the strides array must equal the rank of the input tensor. The default is an array of length rank consisting of all 1’s, e.g. [1,1,1] for a 3-D tensor. Strides must be greater than zero.
Arguments:
- input: an MLOperand. The input tensor.
- starts: a sequence<unsigned long>. The starting index to slice of each input dimension, of length N where N is the rank of the input tensor. For each dimension d of input, starts[d] indicates the starting index to slice in that dimension. The starting index must be in the range [0, input size - 1] in that dimension.
- sizes: a sequence<unsigned long>. The number of elements to slice of each input dimension, of length N where N is the rank of the input tensor. For each dimension d of input, sizes[d] indicates the number of elements to slice in that dimension. The size must not be 0 and must satisfy the constraint starting index + size <= input size in that dimension.
- options: an MLSliceOptions. Specifies the optional parameters of the operation.
Returns: an MLOperand. The output tensor of the same rank as the input tensor with tensor values stripped to the specified starting and ending indices in each dimension.
operand | allowed data types | allowed ranks |
---|---|---|
input | any | N |
output | same as input | same as input |
MLOpSupportLimits has the following member for slice():
- slice, of type MLSingleInputSupportLimits
  Support limits for operator slice().
The slice(input, starts, sizes, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- If starts’s size and sizes’s size are not both equal to input’s rank, then throw a TypeError.
- Let strides be options.strides if it exists, or a new list otherwise.
- Let inputShape be input’s shape and inputRank be input’s rank.
- Let outputShape be a new list.
- For each index in the range 0 to inputRank, exclusive:
  - Let inputSize be inputShape[index].
  - Let inputSliceSize be sizes[index].
  - Let stride be strides[index] if strides is not empty, or 1 otherwise.
  - If inputSliceSize is 0, then throw a TypeError.
    If 0-size dimensions are allowed, revise these steps. [Issue #391]
  - If starts[index] is greater than or equal to inputSize, then throw a TypeError.
  - If starts[index] + inputSliceSize is greater than inputSize, then throw a TypeError.
  - Let outputSizeRoundingExcess be 1 if inputSliceSize % stride != 0, or 0 otherwise.
  - Let outputSize be floor(inputSliceSize / stride) + outputSizeRoundingExcess.
  - Append outputSize to outputShape.
- Let outputDesc be the result of creating an MLOperandDescriptor given input’s dataType and outputShape.
- Make graph connections:
  - Let output be the result of creating an MLOperand given outputDesc.
  - Let operator be an operator for the "slice" operation, given starts, sizes, and options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
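The output shape computation in the steps above can be sketched in plain JavaScript as follows. This is only an illustration of the shape arithmetic, not an implementation of the operator; the helper name `sliceOutputShape` is hypothetical, and the check that each stride is at least 1 is an assumption drawn from the statement that strides must be greater than zero.

```javascript
// Hypothetical helper (not a WebNN API): computes the output shape that the
// slice() method steps describe, or throws a TypeError on invalid arguments.
function sliceOutputShape(inputShape, starts, sizes, strides = []) {
  const rank = inputShape.length;
  if (starts.length !== rank || sizes.length !== rank) {
    throw new TypeError('starts and sizes must have one entry per dimension');
  }
  const outputShape = [];
  for (let index = 0; index < rank; ++index) {
    const inputSize = inputShape[index];
    const sliceSize = sizes[index];
    // An empty strides list means a stride of 1 in every dimension.
    const stride = strides.length > 0 ? strides[index] : 1;
    if (!(stride >= 1)) throw new TypeError('stride must be greater than zero');
    if (sliceSize === 0) throw new TypeError('size must not be 0');
    if (starts[index] + sliceSize > inputSize) {
      throw new TypeError('slice extends past the end of the dimension');
    }
    // floor(size / stride), plus one when there is a remainder.
    const roundingExcess = sliceSize % stride !== 0 ? 1 : 0;
    outputShape.push(Math.floor(sliceSize / stride) + roundingExcess);
  }
  return outputShape;
}

console.log(sliceOutputShape([4, 6], [1, 0], [3, 5], [2, 2]));  // [ 2, 3 ]
```

With sizes [3, 5] and strides [2, 2], the first dimension yields floor(3 / 2) + 1 = 2 elements and the second floor(5 / 2) + 1 = 3.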
7.9.48. softmax
Compute the softmax values of the N-D input tensor along the given axis.
partial interface MLGraphBuilder {
  MLOperand softmax(MLOperand input,
                    [EnforceRange] unsigned long axis,
                    optional MLOperatorOptions options = {});
};
partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits softmax;
};
Arguments:
- input: an MLOperand. The input N-D tensor.
- axis: an unsigned long scalar. The dimension the reduction will be performed on.
- options: an MLOperatorOptions. Specifies the optional parameters of the operation.
Returns: an MLOperand. The output N-D tensor of the same shape as input, containing the softmax results.
operand | allowed data types | allowed ranks |
---|---|---|
input | "float32", "float16" | N |
output | same as input | same as input |
MLOpSupportLimits has the following member for softmax():
- softmax, of type MLSingleInputSupportLimits
  Support limits for operator softmax().
The softmax(input, axis, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- If input’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- If axis is greater than or equal to input’s rank, then throw a TypeError.
- Make graph connections:
  - Let output be the result of copying an MLOperand given input.
  - Let operator be an operator for the "softmax" operation, given axis and options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
The behavior of this operation can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function softmax(builder, input, axis) {
  // This sample deploys a well-known implementation trick [1] to compute the
  // exponentials of the distances to the max value, instead of the exponentials
  // of the input values themselves, in order to increase the numerical
  // stability of the result.
  // [1]: https://cs231n.github.io/linear-classify/#softmax
  const maxX = builder.reduceMax(input, {axes: [axis], keepDimensions: true});
  const expX = builder.exp(builder.sub(input, maxX));
  return builder.div(
    expX,
    builder.reduceSum(expX, {axes: [axis], keepDimensions: true}));
}
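To see why the decomposition subtracts the maximum before exponentiating, the sketch below applies both variants to a 1-D array of plain JavaScript numbers. The helper names `softmax1d` and `naiveSoftmax1d` are hypothetical and exist only for this illustration; they operate on arrays rather than on MLOperand objects.

```javascript
// Numerically stable variant: exponentiate the distance to the maximum.
function softmax1d(values) {
  const max = Math.max(...values);
  const exps = values.map((v) => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Naive variant: exponentiate the raw values, which overflows for large inputs.
function naiveSoftmax1d(values) {
  const exps = values.map((v) => Math.exp(v));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

console.log(softmax1d([1000, 1001, 1002]));       // three finite values summing to about 1
console.log(naiveSoftmax1d([1000, 1001, 1002]));  // [ NaN, NaN, NaN ]
```

In the naive variant exp(1000) overflows to Infinity, so every quotient becomes Infinity / Infinity. The stable variant only ever exponentiates values that are less than or equal to zero.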
7.9.49. softplus
Compute the softplus function of the input tensor. The calculation follows the expression ln(1 + exp(x)).
partial interface MLGraphBuilder {
  MLOperand softplus(MLOperand input, optional MLOperatorOptions options = {});
};
partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits softplus;
};
Arguments:
- input: an MLOperand. The input tensor.
- options: an MLOperatorOptions. Specifies the optional parameters of the operation.
Returns: an MLOperand. The output tensor of the same shape as input.
operand | allowed data types | allowed ranks |
---|---|---|
input | "float32", "float16" | N |
output | same as input | same as input |
MLOpSupportLimits has the following member for softplus():
- softplus, of type MLSingleInputSupportLimits
  Support limits for operator softplus().
The softplus(input, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- If input’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- Make graph connections:
  - Let output be the result of copying an MLOperand given input.
  - Let operator be an operator for the "softplus" operation, given options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
The behavior of this operation can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function softplus(builder, input) {
  return builder.log(
    builder.add(
      builder.exp(input),
      builder.constant(input.dataType, 1)));
}
7.9.50. softsign
Compute the softsign function of the input tensor. The calculation follows the expression x / (1 + |x|).
partial interface MLGraphBuilder {
  MLOperand softsign(MLOperand input, optional MLOperatorOptions options = {});
};
partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits softsign;
};
The behavior of this operation can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function softsign(builder, input) {
  return builder.div(
    input,
    builder.add(
      builder.constant(input.dataType, 1),
      builder.abs(input)));
}
Arguments:
- input: an MLOperand. The input tensor.
- options: an MLOperatorOptions. Specifies the optional parameters of the operation.
Returns: an MLOperand. The output tensor of the same shape as input.
operand | allowed data types | allowed ranks |
---|---|---|
input | "float32", "float16" | N |
output | same as input | same as input |
MLOpSupportLimits has the following member for softsign():
- softsign, of type MLSingleInputSupportLimits
  Support limits for operator softsign().
The softsign(input, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- If input’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- Make graph connections:
  - Let output be the result of copying an MLOperand given input.
  - Let operator be an operator for the "softsign" operation, given options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
7.9.51. split
Split the input tensor into a number of sub tensors along the given axis.
dictionary MLSplitOptions : MLOperatorOptions {
  [EnforceRange] unsigned long axis = 0;
};
partial interface MLGraphBuilder {
  sequence<MLOperand> split(
      MLOperand input,
      ([EnforceRange] unsigned long or sequence<[EnforceRange] unsigned long>) splits,
      optional MLSplitOptions options = {});
};
dictionary MLSplitSupportLimits {
  MLTensorLimits input;
  MLDataTypeLimits outputs;
};
partial dictionary MLOpSupportLimits {
  MLSplitSupportLimits split;
};
Arguments:
- input: an MLOperand. The input tensor.
- splits: an unsigned long or sequence<unsigned long>. If an unsigned long, it specifies the number of output tensors along the axis. The number must evenly divide the dimension size of input along axis. If a sequence<unsigned long>, it specifies the sizes of each output tensor along the axis. The sum of sizes must equal the dimension size of input along axis.
- options: an optional MLSplitOptions. The optional parameters of the operation.
Returns: sequence<MLOperand>. The split output tensors. If splits is an unsigned long, the size of the output is equal to splits. The shape of each output tensor is the same as input except that the dimension size of axis equals the quotient of dividing the dimension size of input along axis by splits. If splits is a sequence<unsigned long>, the size of the output equals the size of splits. The shape of the i-th output tensor is the same as input except along axis where the dimension size is splits[i].
MLSplitOptions has the following members:
- axis, of type unsigned long, defaulting to 0
  The dimension along which to split. Its value must be in the range [0, N-1] where N is the rank of the input tensor.
operand | allowed data types | allowed ranks |
---|---|---|
input | any | N |
outputs | same as input | same as input |
MLSplitSupportLimits has the following members:
- input, of type MLTensorLimits
  MLTensorLimits for input operand.
- outputs, of type MLDataTypeLimits
  MLDataTypeLimits for all the output operands.
MLOpSupportLimits has the following member for split():
- split, of type MLSplitSupportLimits
  Support limits for operator split().
The split(input, splits, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- Let axis be options.axis.
- If axis is greater than or equal to input’s rank, then throw a TypeError.
- If splits is an unsigned long, then:
  - If input’s shape[axis] % splits is not 0, then throw a TypeError.
  - Otherwise, let splitCount be splits.
- If splits is a sequence<unsigned long>, then:
  - If the sum of its items is not equal to input’s shape[axis], then throw a TypeError.
  - Otherwise, let splitCount be splits’s size.
- Make graph connections:
  - Let operator be an operator for the "split" operation, given splits and options.
  - Let outputs be a new list.
  - For each index in the range 0 to splitCount, exclusive:
    - Let operand be the result of copying an MLOperand given input.
    - If splits is an unsigned long, then let newDimension be operand’s shape[axis] / splits.
    - Otherwise, let newDimension be splits[index].
    - Set operand’s shape[axis] to newDimension.
    - Set operand.[[operator]] to operator.
    - Append operand to outputs.
  - Set operator’s input to input.
  - Set operator’s outputs to outputs.
- Return outputs.
The behavior of this operation can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function split(builder, input, splits, options) {
  // This sample shows the case that the splits parameter is an array.
  const outputs = [];
  const inputShape = input.shape;
  const inputRank = inputShape.length;
  const starts = Array(inputRank).fill(0);
  // Copy the shape so that the operand's own shape is left untouched.
  const sizes = Array.from(inputShape);
  let start = 0;
  for (const size of splits) {
    starts[options.axis] = start;
    sizes[options.axis] = size;
    outputs.push(builder.slice(input, starts, sizes));
    start += size;
  }
  return outputs;
}
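The shapes of the tensors that split() returns follow directly from the description of the splits argument. The sketch below computes them in plain JavaScript for both forms of splits; the helper name `splitOutputShapes` is hypothetical, and the validation it performs is an assumption based on the constraints stated for splits and axis.

```javascript
// Hypothetical helper (not a WebNN API): returns the list of output shapes
// for a split along `axis`, for either a count or a list of sizes.
function splitOutputShapes(inputShape, splits, axis = 0) {
  if (axis >= inputShape.length) throw new TypeError('axis is out of range');
  const axisSize = inputShape[axis];
  let sizes;
  if (typeof splits === 'number') {
    // A count must evenly divide the dimension being split.
    if (axisSize % splits !== 0) throw new TypeError('uneven split');
    sizes = Array(splits).fill(axisSize / splits);
  } else {
    // A list of sizes must add up to the dimension being split.
    const total = splits.reduce((a, b) => a + b, 0);
    if (total !== axisSize) throw new TypeError('sizes do not add up');
    sizes = splits;
  }
  return sizes.map((size) => {
    const shape = [...inputShape];
    shape[axis] = size;
    return shape;
  });
}

console.log(splitOutputShapes([6, 4], 3));          // [ [ 2, 4 ], [ 2, 4 ], [ 2, 4 ] ]
console.log(splitOutputShapes([6, 4], [1, 3], 1));  // [ [ 6, 1 ], [ 6, 3 ] ]
```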
7.9.52. tanh
Compute the hyperbolic tangent function of the input tensor. The calculation follows the expression (exp(2 * x) - 1) / (exp(2 * x) + 1).
partial interface MLGraphBuilder {
  MLOperand tanh(MLOperand input, optional MLOperatorOptions options = {});
};
partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits tanh;
};
Arguments:
- input: an MLOperand. The input tensor.
- options: an MLOperatorOptions. Specifies the optional parameters of the operation.
Returns: an MLOperand. The output tensor of the same shape as input.
operand | allowed data types | allowed ranks |
---|---|---|
input | "float32", "float16" | N |
output | same as input | same as input |
MLOpSupportLimits has the following member for tanh():
- tanh, of type MLSingleInputSupportLimits
  Support limits for operator tanh().
The tanh(input, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- If input’s dataType is not one of its allowed data types (according to this table), then throw a TypeError.
- Make graph connections:
  - Let output be the result of copying an MLOperand given input.
  - Let operator be an operator for the "tanh" operation, given options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
The behavior of this operation can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function tanh(builder, input) {
  return builder.div(
    builder.sub(
      builder.exp(builder.mul(builder.constant(input.dataType, 2), input)),
      builder.constant(input.dataType, 1)),
    builder.add(
      builder.exp(builder.mul(builder.constant(input.dataType, 2), input)),
      builder.constant(input.dataType, 1)));
}
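The sketch below evaluates the expression (exp(2 * x) - 1) / (exp(2 * x) + 1) on plain JavaScript numbers and compares it with the built-in `Math.tanh`. The helper name `tanhFromExp` is hypothetical. It also shows one reason a dedicated implementation can be preferable to the literal formula: for large inputs exp(2 * x) overflows, and the quotient is no longer meaningful.

```javascript
// Literal scalar evaluation of (exp(2x) - 1) / (exp(2x) + 1).
// `tanhFromExp` is a hypothetical helper, not a WebNN method.
function tanhFromExp(x) {
  const e2x = Math.exp(2 * x);
  return (e2x - 1) / (e2x + 1);
}

console.log(Math.abs(tanhFromExp(0.5) - Math.tanh(0.5)) < 1e-12);  // true
console.log(tanhFromExp(400));  // NaN, because exp(800) overflows to Infinity
console.log(Math.tanh(400));    // 1
```

The overflow threshold shown here applies to 64-bit JavaScript numbers; for "float32" and "float16" tensors the same effect would be expected at smaller magnitudes.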
7.9.53. tile
Repeat a tensor the given number of times along each dimension.
partial interface MLGraphBuilder {
  MLOperand tile(MLOperand input,
                 sequence<unsigned long> repetitions,
                 optional MLOperatorOptions options = {});
};
partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits tile;
};
Arguments:
- input: an MLOperand. The input N-D tensor.
- repetitions: A count per dimension of how many times to repeat that dimension. The size must match the input’s rank, using 1’s for any axis that should retain the same size.
- options: an optional MLOperatorOptions. The optional parameters of the operation.
Returns: an MLOperand. The tiled N-D tensor.
operand | allowed data types | allowed ranks |
---|---|---|
input | any | N |
output | same as input | same as input |
MLOpSupportLimits has the following member for tile():
- tile, of type MLSingleInputSupportLimits
  Support limits for operator tile().
The tile(input, repetitions, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- If repetitions’s size is not equal to input’s rank, then throw a TypeError.
- If repetitions’s values contain 0’s, then throw a TypeError.
  If 0-size dimensions are allowed, revise these steps. [Issue #391]
- Let outputShape be a copy of input’s shape.
- For each index in the range 0 to outputShape’s size, exclusive:
  - Set outputShape[index] to outputShape[index] * repetitions[index].
- Let outputDescriptor be the result of creating an MLOperandDescriptor given input’s dataType and outputShape.
- Make graph connections:
  - Let output be the result of creating an MLOperand given outputDescriptor.
  - Let operator be an operator for the "tile" operation, given repetitions and options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
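As an illustration of the shape arithmetic in the steps above, the following plain JavaScript sketch multiplies each input dimension by its repetition count. The helper name `tileOutputShape` is hypothetical and is not part of the WebNN API.

```javascript
// Hypothetical helper (not a WebNN API): computes the output shape of tile().
function tileOutputShape(inputShape, repetitions) {
  if (repetitions.length !== inputShape.length) {
    throw new TypeError('repetitions must have one entry per dimension');
  }
  if (repetitions.includes(0)) {
    throw new TypeError('repetitions must not contain 0');
  }
  return inputShape.map((size, index) => size * repetitions[index]);
}

console.log(tileOutputShape([2, 3], [2, 1]));  // [ 4, 3 ]
```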
7.9.54. transpose
Permute the dimensions of the input tensor according to permutation.
dictionary MLTransposeOptions : MLOperatorOptions {
  sequence<[EnforceRange] unsigned long> permutation;
};
partial interface MLGraphBuilder {
  MLOperand transpose(MLOperand input, optional MLTransposeOptions options = {});
};
partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits transpose;
};
MLTransposeOptions has the following members:
- permutation, of type sequence<[EnforceRange] unsigned long>
  The values used to permute the output shape. The default is [N-1, ..., 0], where N is the rank of the input tensor, e.g. [2,1,0] for a 3-D tensor. These default values cause the output to become a transposed tensor of the input. When specified, the number of values must be the same as the rank of the input tensor, and the values must be within the range from 0 to N-1 with no duplicates.
Arguments:
- input: an MLOperand. The input N-D tensor.
- options: an optional MLTransposeOptions. The optional parameters of the operation.
Returns: an MLOperand. The permuted or transposed N-D tensor.
operand | allowed data types | allowed ranks |
---|---|---|
input | any | N |
output | same as input | same as input |
MLOpSupportLimits has the following member for transpose():
- transpose, of type MLSingleInputSupportLimits
  Support limits for operator transpose().
The transpose(input, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- If options.permutation does not exist, then let options.permutation be the reversed sequence of all indices for input’s shape.
- Otherwise, if options.permutation exists:
  - If its size is not equal to input’s rank, then throw a TypeError.
  - If any of its values is not in the range 0 to input’s rank, exclusive, then throw a TypeError.
  - If it contains duplicate values, then throw a TypeError.
- Make graph connections:
  - Let output be the result of copying an MLOperand given input.
  - Let operator be an operator for the "transpose" operation, given options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
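The effect of permutation on the tensor shape can be sketched in plain JavaScript. The helper name `transposeOutputShape` is hypothetical, and the rule that output dimension i takes the size of input dimension permutation[i] is the conventional reading of a permutation, stated here as an assumption rather than quoted from the method steps.

```javascript
// Hypothetical helper (not a WebNN API): validates a permutation and returns
// the permuted shape. Assumes outputShape[i] = inputShape[permutation[i]].
function transposeOutputShape(inputShape, permutation) {
  const rank = inputShape.length;
  // The default permutation reverses the dimensions: [N-1, ..., 0].
  const perm = permutation ?? inputShape.map((_, i) => rank - 1 - i);
  if (perm.length !== rank) {
    throw new TypeError('permutation must have one entry per dimension');
  }
  const seen = new Set();
  for (const axis of perm) {
    if (!Number.isInteger(axis) || axis < 0 || axis >= rank) {
      throw new TypeError('permutation value is out of range');
    }
    if (seen.has(axis)) throw new TypeError('permutation has duplicates');
    seen.add(axis);
  }
  return perm.map((axis) => inputShape[axis]);
}

console.log(transposeOutputShape([2, 3, 4]));             // [ 4, 3, 2 ]
console.log(transposeOutputShape([2, 3, 4], [0, 2, 1]));  // [ 2, 4, 3 ]
```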
7.9.55. triangular
Given a 2-D tensor (matrix), return a 2-D tensor containing either the upper or lower triangular part of the input tensor. If the input tensor has more than 2 dimensions, it is treated as a batch of matrices and the result has the same shape.
dictionary MLTriangularOptions : MLOperatorOptions {
  boolean upper = true;
  [EnforceRange] long diagonal = 0;
};
partial interface MLGraphBuilder {
  MLOperand triangular(MLOperand input, optional MLTriangularOptions options = {});
};
partial dictionary MLOpSupportLimits {
  MLSingleInputSupportLimits triangular;
};
MLTriangularOptions has the following members:
- upper, of type boolean, defaulting to true
  Indicates whether the upper or the lower part of the input matrix is retained in the output. True indicates that the upper part is retained.
- diagonal, of type long, defaulting to 0
  Specifies how many diagonals above or below the main diagonals of the input matrix are retained or excluded. A value of 0 means no diagonals other than the main diagonals are affected.
Arguments:
- input: an MLOperand. The input tensor which is at least 2-D.
- options: an optional MLTriangularOptions. The optional parameters of the operation.
Returns: an MLOperand. The output tensor representing a triangular matrix, or batch of matrices which is the same shape as the input.
operand | allowed data types | allowed ranks |
---|---|---|
input | any | 2 or greater |
output | same as input | same as input |
MLOpSupportLimits has the following member for triangular():
- triangular, of type MLSingleInputSupportLimits
  Support limits for operator triangular().
The triangular(input, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and input returns false, then throw a TypeError.
- If input’s rank is not one of its allowed ranks (according to this table), then throw a TypeError.
- Make graph connections:
  - Let output be the result of copying an MLOperand given input.
  - Let operator be an operator for the "triangular" operation, given options.
  - Set output.[[operator]] to operator.
  - Set operator’s input to input.
  - Set operator’s output to output.
- Return output.
Examples of how triangular works in different diagonal settings.
// input:
//   [[7, 1, 2],
//    [9, 4, 8],
//    [2, 6, 3]]
const input = builder.constant(
  {dataType: 'float32', shape: [3, 3]},
  new Float32Array([7, 1, 2, 9, 4, 8, 2, 6, 3]));

// upper triangular matrix:
//   [[7, 1, 2],
//    [0, 4, 8],
//    [0, 0, 3]]
const upper = builder.triangular(input);

// upper triangular matrix with one additional set of diagonals excluded:
//   [[0, 1, 2],
//    [0, 0, 8],
//    [0, 0, 0]]
const upperPositive = builder.triangular(input, {diagonal: 1});

// upper triangular matrix with one additional set of diagonals retained:
//   [[7, 1, 2],
//    [9, 4, 8],
//    [0, 6, 3]]
const upperNegative = builder.triangular(input, {diagonal: -1});

// lower triangular matrix:
//   [[7, 0, 0],
//    [9, 4, 0],
//    [2, 6, 3]]
const lower = builder.triangular(input, {upper: false});

// lower triangular matrix with one additional set of diagonals retained:
//   [[7, 1, 0],
//    [9, 4, 8],
//    [2, 6, 3]]
const lowerPositive = builder.triangular(input, {upper: false, diagonal: 1});

// lower triangular matrix with one additional set of diagonals excluded:
//   [[0, 0, 0],
//    [9, 0, 0],
//    [2, 6, 0]]
const lowerNegative = builder.triangular(input, {upper: false, diagonal: -1});

// lower triangular matrix with two batches:
//   [[[7, 0, 0],
//     [9, 4, 0],
//     [2, 6, 3]],
//    [[1, 0, 0],
//     [4, 5, 0],
//     [7, 8, 9]]]
// The batched input below is assumed for illustration. The elements above the
// main diagonal of its second matrix (2, 3, 6) do not affect the result.
const inputWithBatches = builder.constant(
  {dataType: 'float32', shape: [2, 3, 3]},
  new Float32Array([7, 1, 2, 9, 4, 8, 2, 6, 3, 1, 2, 3, 4, 5, 6, 7, 8, 9]));
const lowerWithBatches = builder.triangular(inputWithBatches, {upper: false});
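The examples above are consistent with a simple per-element rule, sketched here on plain nested arrays: for the upper part an element at row i, column j is kept when j - i >= diagonal, and for the lower part when j - i <= diagonal. The helper name `triangular2d` is hypothetical, and the rule is inferred from the example outputs rather than quoted from the method steps.

```javascript
// Hypothetical reference (not a WebNN API) for a single 2-D matrix given as
// an array of rows. Elements outside the retained triangle become 0.
function triangular2d(matrix, {upper = true, diagonal = 0} = {}) {
  return matrix.map((row, i) =>
    row.map((value, j) => {
      const offset = j - i;  // 0 on the main diagonal, positive above it
      const keep = upper ? offset >= diagonal : offset <= diagonal;
      return keep ? value : 0;
    }));
}

const matrix = [[7, 1, 2], [9, 4, 8], [2, 6, 3]];
console.log(triangular2d(matrix));                                // [[7,1,2],[0,4,8],[0,0,3]]
console.log(triangular2d(matrix, {diagonal: 1}));                 // [[0,1,2],[0,0,8],[0,0,0]]
console.log(triangular2d(matrix, {upper: false, diagonal: -1}));  // [[0,0,0],[9,0,0],[2,6,0]]
```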
7.9.56. where
Select the values from the trueValue or the falseValue tensor depending on the corresponding values of the condition tensor, where non-zero is true and zero is false. The condition tensor is often the output of one of the element-wise logical operations.
The operation will be broadcast according to [numpy-broadcasting-rule] . The input tensors must be bidirectionally broadcastable . The rank of the output tensor is the maximum rank of the input tensors. For each dimension of the output tensor, its size is the maximum size along that dimension of the input tensors.
partial interface MLGraphBuilder {
  MLOperand where(MLOperand condition,
                  MLOperand trueValue,
                  MLOperand falseValue,
                  optional MLOperatorOptions options = {});
};
dictionary MLWhereSupportLimits {
  MLTensorLimits condition;
  MLTensorLimits trueValue;
  MLTensorLimits falseValue;
  MLDataTypeLimits output;
};
partial dictionary MLOpSupportLimits {
  MLWhereSupportLimits where;
};
Arguments:
- condition: an MLOperand. The condition tensor.
- trueValue: an MLOperand. The tensor from which the value is selected when the condition of the corresponding element is set to true.
- falseValue: an MLOperand. The tensor from which the value is selected when the condition of the corresponding element is set to false.
- options: an MLOperatorOptions. Specifies the optional parameters of the operation.
Returns: an MLOperand. The output tensor that contains the values selected element-wise from either the trueValue or the falseValue tensor.
operand | allowed data types | allowed ranks |
---|---|---|
condition | "uint8" | N |
trueValue | any | N |
falseValue | same as trueValue | N |
output | same as trueValue | maximum of condition’s rank, trueValue’s rank and falseValue’s rank |
MLWhereSupportLimits has the following members:
- condition, of type MLTensorLimits
  MLTensorLimits for condition operand.
- trueValue, of type MLTensorLimits
  MLTensorLimits for trueValue operand.
- falseValue, of type MLTensorLimits
  MLTensorLimits for falseValue operand.
- output, of type MLDataTypeLimits
  MLDataTypeLimits for output operand.
MLOpSupportLimits has the following member for where():
- where, of type MLWhereSupportLimits
  Support limits for operator where().
The where(condition, trueValue, falseValue, options) method steps are:
- If this can not build, then throw an "InvalidStateError" DOMException.
- If validating operand with this and any of condition, trueValue, and falseValue returns false, then throw a TypeError.
- If the dataType of any of condition, trueValue, or falseValue is not one of its allowed data types (according to this table), then throw a TypeError.
- Let outputShape be the result of bidirectionally broadcasting trueValue’s shape and falseValue’s shape.
- Set outputShape to the result of bidirectionally broadcasting condition’s shape and outputShape.
- Let descriptor be the result of creating an MLOperandDescriptor given trueValue’s dataType and outputShape.
- Make graph connections:
  - Let output be the result of creating an MLOperand given this and descriptor.
  - Let operator be an operator for the "where" operation, given condition, trueValue, falseValue, and options.
  - Set output.[[operator]] to operator.
  - Set operator’s inputs to condition, trueValue and falseValue.
  - Set operator’s output to output.
- Return output.
The behavior of this operation can be generically emulated from the usage of other operations as follows, although user agents typically have a more efficient implementation. In cases where the underlying platform does not directly support an operation, this decomposition can be used as a template to guide the implementation.
function where(builder, condition, trueValue, falseValue) {
  // Normalize any non-zero condition value to 1.
  const c = builder.clamp(condition, {'minValue': 0, 'maxValue': 1});
  // Return the blend of the two inputs, weighted by the condition mask.
  return builder.add(
    builder.mul(trueValue, builder.cast(c, trueValue.dataType)),
    builder.mul(
      falseValue,
      builder.cast(builder.logicalNot(c), falseValue.dataType)));
}
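The selection rule itself can be illustrated on flat JavaScript arrays of equal length, ignoring broadcasting. The helper names `whereFlat` and `whereByArithmetic` are hypothetical; the second mirrors the multiply-and-add structure of the decomposition above on plain numbers.

```javascript
// Direct selection: non-zero condition picks trueValue, zero picks falseValue.
function whereFlat(condition, trueValue, falseValue) {
  return condition.map((c, i) => (c !== 0 ? trueValue[i] : falseValue[i]));
}

// Arithmetic form: clamp the condition to a 0/1 mask, then blend.
function whereByArithmetic(condition, trueValue, falseValue) {
  return condition.map((c, i) => {
    const mask = Math.min(Math.max(c, 0), 1);  // clamp to [0, 1]
    return trueValue[i] * mask + falseValue[i] * (1 - mask);
  });
}

const condition = [1, 0, 2, 0];
console.log(whereFlat(condition, [10, 20, 30, 40], [-1, -2, -3, -4]));
// [ 10, -2, 30, -4 ]
console.log(whereByArithmetic(condition, [10, 20, 30, 40], [-1, -2, -3, -4]));
// [ 10, -2, 30, -4 ]
```

The two forms agree for finite values. They can differ when an unselected element is infinite or NaN, since multiplying such a value by 0 does not yield 0.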
8. Algorithms
8.1. Broadcasting
Broadcasting describes how WebNN treats tensors with different shapes during graph construction and computation. It is heavily influenced by [NumPy] and follows the [numpy-broadcasting-rule] . Loosely speaking, it allows an operation on a smaller tensor to be "broadcast" across the shape of a larger tensor, so that the same data can be applied repeatedly without making copies.
The simplest example is the application of a scalar constant to an N-dimension tensor with element-wise binary operations such as add() or mul(). Rather than needing to allocate and populate a matching N-dimensional tensor containing multiple copies of the scalar constant, these element-wise operations allow the scalar constant to be used directly, and broadcast the scalar value across the N-dimensional tensor. With the following considerations, the same logic applies to tensors of other dimensions.
The shapes of the input tensors must be compatible. A tensor is unidirectionally broadcastable to another tensor if the first tensor can be "stretched" by repeating the first tensor along an axis with size 1 or repeating across new dimensions, starting from the last (rightmost) dimension. For example, a [4] tensor can be broadcast to a [5, 4] tensor by repeating it 5 times. A [1] tensor can be broadcast to a [5,4] tensor by repeating it 4 times on the last dimension and 5 times on the preceding dimension. Unidirectional broadcasting is important for operations such as expand() where the target tensor shape is explicitly given.
Two tensors are bidirectionally broadcastable if they can be mutually "stretched" (repeated) across their various dimensions, starting from the last dimension. For example, a [5,1] tensor can be bidirectionally broadcast with a [1,6] tensor by repeating the first tensor 6 times in the last dimension and the second tensor 5 times in the preceding dimension. The result of the operation will be a [5,6] tensor. Bidirectional broadcasting is convenient for element-wise operations.
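A minimal sketch of bidirectional shape broadcasting in plain JavaScript follows. The function name `bidirectionallyBroadcastShapes` is hypothetical, and the per-dimension rule used here (the sizes must be equal, or one of them must be 1) is the NumPy broadcasting rule that this section refers to, applied as an assumption.

```javascript
// Hypothetical helper: returns the broadcast shape of two shapes, or null
// when the shapes are not bidirectionally broadcastable.
function bidirectionallyBroadcastShapes(shapeA, shapeB) {
  const outputSize = Math.max(shapeA.length, shapeB.length);
  // Left-pad the shorter shape with 1's so both have the same length.
  const paddedA = [...shapeA];
  while (paddedA.length < outputSize) paddedA.unshift(1);
  const paddedB = [...shapeB];
  while (paddedB.length < outputSize) paddedB.unshift(1);
  const outputShape = [];
  for (let index = 0; index < outputSize; ++index) {
    const dimA = paddedA[index];
    const dimB = paddedB[index];
    if (dimA !== dimB && dimA !== 1 && dimB !== 1) return null;
    outputShape.push(Math.max(dimA, dimB));
  }
  return outputShape;
}

console.log(bidirectionallyBroadcastShapes([5, 1], [1, 6]));  // [ 5, 6 ]
console.log(bidirectionallyBroadcastShapes([4], [5, 4]));     // [ 5, 4 ]
console.log(bidirectionallyBroadcastShapes([5, 2], [5, 3]));  // null
```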
A tensor is blockwise broadcastable if all of its dimensions can be upsampled by integer multiples to the target tensor’s shape. For example, a [4,5] tensor can be blockwise broadcast up to a [16,10] tensor as each target dimension is an exact multiple (16 % 4 = 0, 10 % 5 = 0), by repeating every element 4 times in the first dimension and every element 2 times in the last dimension (e.g. values [1,2,3,4,5] in the last dimension would be repeated to [1,1,2,2,3,3,4,4,5,5]). However, a [4,5] tensor would be incompatible with a [9,3] tensor since both dimensions have a nonzero remainder (9 % 4 = 1, 3 % 5 = 3). Blockwise broadcasting is useful for sharing common values in larger blocks to save memory. Both tensors are expected to have the same rank, and the output shape is simply the shape of the target tensor that the smaller one is being upsampled to.
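The divisibility check and the element repetition described above can be sketched in plain JavaScript. Both helper names, `blockwiseBroadcastShape` and `repeatElements`, are hypothetical and only illustrate the examples from the text.

```javascript
// Hypothetical helper: returns the target shape when every target dimension
// is an exact multiple of the source dimension, or null otherwise.
function blockwiseBroadcastShape(shapeFrom, shapeTo) {
  if (shapeFrom.length !== shapeTo.length) return null;
  for (let index = 0; index < shapeFrom.length; ++index) {
    if (shapeTo[index] % shapeFrom[index] !== 0) return null;
  }
  return [...shapeTo];
}

// Hypothetical helper: repeats each element `times` times along one dimension.
function repeatElements(values, times) {
  return values.flatMap((value) => Array(times).fill(value));
}

console.log(blockwiseBroadcastShape([4, 5], [16, 10]));  // [ 16, 10 ]
console.log(blockwiseBroadcastShape([4, 5], [9, 3]));    // null
console.log(repeatElements([1, 2, 3, 4, 5], 2));
// [ 1, 1, 2, 2, 3, 3, 4, 4, 5, 5 ]
```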
Some operations allow broadcasting with special semantics. For example, matmul() treats the last two dimensions of the input tensors as the rows and columns of the matrices, and the number of columns in the first matrix must be equal to the number of rows in the second matrix. The matrix multiplication is bidirectionally broadcast across any additional dimensions, treating the input tensors as stacks of matrices to multiply.
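As an illustration of the matmul() case, the sketch below computes the output shape of a batched matrix multiplication. The function names matmulOutputShape and broadcastShapes are hypothetical helpers, not part of the API, and the sketch assumes both inputs have at least two dimensions; null stands in for "failure".

```javascript
// Hypothetical helper: bidirectionally broadcast two shapes; null means failure.
function broadcastShapes(shapeA, shapeB) {
  const size = Math.max(shapeA.length, shapeB.length);
  const padA = [...Array(size - shapeA.length).fill(1), ...shapeA];
  const padB = [...Array(size - shapeB.length).fill(1), ...shapeB];
  const out = [];
  for (let i = 0; i < size; i++) {
    if (padA[i] !== padB[i] && padA[i] !== 1 && padB[i] !== 1) return null;
    out.push(Math.max(padA[i], padB[i]));
  }
  return out;
}

// Hypothetical helper: output shape of a batched matrix multiplication.
function matmulOutputShape(shapeA, shapeB) {
  if (shapeA.length < 2 || shapeB.length < 2) return null;
  const [rowsA, colsA] = shapeA.slice(-2);
  const [rowsB, colsB] = shapeB.slice(-2);
  // Columns of the first matrix must match rows of the second.
  if (colsA !== rowsB) return null;
  // Any leading dimensions are treated as a stack and broadcast bidirectionally.
  const batch = broadcastShapes(shapeA.slice(0, -2), shapeB.slice(0, -2));
  if (batch === null) return null;
  return [...batch, rowsA, colsB];
}

console.log(matmulOutputShape([2, 1, 3, 4], [5, 4, 6])); // [ 2, 5, 3, 6 ]
```

Here the leading dimensions [2,1] and [5] broadcast to [2,5], and each [3,4] matrix is multiplied by a [4,6] matrix.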
To unidirectionally broadcast the shapes shapeFrom and shapeTo, perform the following steps. shapeFrom and shapeTo are lists of positive integers, representing the dimensions of tensors, and the steps return a new list of positive integers, or failure.
- Let sizeFrom be shapeFrom’s size.
- Let sizeTo be shapeTo’s size.
- If sizeFrom > sizeTo, then return failure.
- Let paddedShapeFrom be a clone of shapeFrom.
- While paddedShapeFrom’s size is less than sizeTo, prepend 1 to paddedShapeFrom.
- Let outputShape be a new list.
- For each index in the range 0 to sizeTo, exclusive:
  - Let dimFrom be paddedShapeFrom[index].
  - Let dimTo be shapeTo[index].
  - If dimTo is not equal to dimFrom and dimFrom is not equal to 1, then return failure.
  - Append dimTo to outputShape.
- Return outputShape.
shapeFrom is unidirectionally broadcastable to shapeTo if unidirectionally broadcasting shapeFrom and shapeTo does not result in failure.
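The unidirectional broadcasting steps can be sketched in JavaScript as follows. This is a minimal illustration rather than normative text: the function name is hypothetical, shapes are plain arrays of positive integers, and null stands in for "failure".

```javascript
// Hypothetical helper mirroring the unidirectional broadcasting steps.
// Returns the output shape, or null for failure.
function unidirectionallyBroadcast(shapeFrom, shapeTo) {
  if (shapeFrom.length > shapeTo.length) return null;
  // Prepend 1s until both shapes have the same size.
  const paddedShapeFrom = [
    ...Array(shapeTo.length - shapeFrom.length).fill(1),
    ...shapeFrom,
  ];
  const outputShape = [];
  for (let index = 0; index < shapeTo.length; index++) {
    const dimFrom = paddedShapeFrom[index];
    const dimTo = shapeTo[index];
    // Only a dimension of size 1 may be stretched.
    if (dimTo !== dimFrom && dimFrom !== 1) return null;
    outputShape.push(dimTo);
  }
  return outputShape;
}

console.log(unidirectionallyBroadcast([4], [5, 4])); // [ 5, 4 ]
console.log(unidirectionallyBroadcast([1], [5, 4])); // [ 5, 4 ]
console.log(unidirectionallyBroadcast([5, 4], [4])); // null
```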
To bidirectionally broadcast the shapes shapeA and shapeB, perform the following steps. shapeA and shapeB are lists of positive integers, representing the dimensions of tensors, and the steps return a new list of positive integers, or failure.
- Let sizeA be shapeA’s size.
- Let sizeB be shapeB’s size.
- Let outputSize be the maximum of sizeA and sizeB.
- Let paddedA be a clone of shapeA.
- While paddedA’s size is less than outputSize, prepend 1 to paddedA.
- Let paddedB be a clone of shapeB.
- While paddedB’s size is less than outputSize, prepend 1 to paddedB.
- Let outputShape be a new list.
- For each index in the range 0 to outputSize, exclusive:
  - Let dimA be paddedA[index].
  - Let dimB be paddedB[index].
  - If dimA is not equal to dimB, and dimA is not equal to 1, and dimB is not equal to 1, then return failure.
  - Append the maximum of dimA and dimB to outputShape.
- Return outputShape.
shapeA is bidirectionally broadcastable to shapeB if bidirectionally broadcasting shapeA and shapeB does not result in failure.
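A corresponding sketch for the bidirectional case, under the same assumptions (hypothetical function name, plain arrays for shapes, null for "failure"):

```javascript
// Hypothetical helper mirroring the bidirectional broadcasting steps.
// Returns the output shape, or null for failure.
function bidirectionallyBroadcast(shapeA, shapeB) {
  const outputSize = Math.max(shapeA.length, shapeB.length);
  // Prepend 1s so both shapes have outputSize dimensions.
  const paddedA = [...Array(outputSize - shapeA.length).fill(1), ...shapeA];
  const paddedB = [...Array(outputSize - shapeB.length).fill(1), ...shapeB];
  const outputShape = [];
  for (let index = 0; index < outputSize; index++) {
    const dimA = paddedA[index];
    const dimB = paddedB[index];
    // Dimensions must match unless one of them is 1.
    if (dimA !== dimB && dimA !== 1 && dimB !== 1) return null;
    outputShape.push(Math.max(dimA, dimB));
  }
  return outputShape;
}

console.log(bidirectionallyBroadcast([5, 1], [1, 6])); // [ 5, 6 ]
console.log(bidirectionallyBroadcast([2, 3], [3, 2])); // null
```

Unlike the unidirectional case, either operand may be stretched, so the order of the arguments does not affect the result.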
To blockwise broadcast the shapes shapeFrom and shapeTo, perform the following steps. shapeFrom and shapeTo are lists of positive integers, representing the dimensions of tensors, and the steps return true or false.
shapeFrom is blockwise broadcastable to shapeTo if blockwise broadcasting shapeFrom and shapeTo returns true.
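A minimal sketch of a blockwise check, assuming only the rule stated in the prose description: both shapes have the same rank, and every dimension of shapeTo is an exact multiple of the corresponding dimension of shapeFrom. Both function names are hypothetical, and repeatElements only illustrates how values are repeated along one dimension.

```javascript
// Hypothetical helper: true if shapeFrom can be blockwise broadcast to shapeTo.
function isBlockwiseBroadcastable(shapeFrom, shapeTo) {
  // Both shapes are expected to have the same rank.
  if (shapeFrom.length !== shapeTo.length) return false;
  // Each target dimension must be an exact multiple of the source dimension.
  return shapeFrom.every((dimFrom, index) => shapeTo[index] % dimFrom === 0);
}

// Hypothetical helper: repeat every element `factor` times along one dimension.
function repeatElements(values, factor) {
  return values.flatMap((value) => Array(factor).fill(value));
}

console.log(isBlockwiseBroadcastable([4, 5], [16, 10])); // true
console.log(isBlockwiseBroadcastable([4, 5], [9, 3])); // false
console.log(repeatElements([1, 2, 3, 4, 5], 2)); // [ 1, 1, 2, 2, 3, 3, 4, 4, 5, 5 ]
```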
8.2. Casting
Explicit numeric casting is used in algorithms where parameters passed as MLNumber or double need to be converted to match the MLOperandDataType of input or output MLOperands.
To cast a number x to a given MLOperandDataType dataType, perform the following steps. They return a number.
- Switch on dataType:
  - "float32": Return ConvertToFloat(x, 32).
  - "float16": Return ConvertToFloat(x, 16).
  - "int64": Return ConvertToInt(x, 64, "signed").
  - "uint64": Return ConvertToInt(x, 64, "unsigned").
  - "int32": Return ConvertToInt(x, 32, "signed").
  - "uint32": Return ConvertToInt(x, 32, "unsigned").
  - "int8": Return ConvertToInt(x, 8, "signed").
  - "uint8": Return ConvertToInt(x, 8, "unsigned").
NOTE: The input to cast is an abstract number with unlimited range and precision, including the special values Infinity, -Infinity and NaN. The output is also an abstract number, but exactly representable as the specified type.
To ConvertToFloat a number x given bitLength, perform the following steps. They return a number.
- If x is NaN, then return NaN.
- Switch on bitLength:
  - 32
    - Let upperBound be 2^128.
    - Let lowerBound be -2^128.
    - Let S be the set of [IEEE-754-2019] binary32 floating point values except -0, but with the special values upperBound and lowerBound added.
  - 16
    - Let upperBound be 2^16.
    - Let lowerBound be -2^16.
    - Let S be the set of [IEEE-754-2019] binary16 floating point values except -0, but with the special values upperBound and lowerBound added.
- Let y be the number in S that is closest to x, selecting the number with an even significand if there are two equally close values. The two special values lowerBound and upperBound are considered to have even significands for this purpose.
- If y is upperBound, then return +Infinity.
- If y is lowerBound, then return -Infinity.
- If y is +0 and x is negative, then return -0.
- Return y.
NOTE: This is based on a definition in [WEBIDL] , but extended to cover 16-bit floating point values.
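The floating point conversion can be approximated in JavaScript as sketched below. This is an illustration under stated assumptions, not the normative algorithm: the input is an ordinary JavaScript number (a binary64 value) rather than an abstract number of unlimited precision, the function names are hypothetical, the 32-bit case relies on the built-in Math.fround, and the 16-bit case is rounded by hand so the sketch does not depend on newer built-ins.

```javascript
// Round to the nearest integer, choosing the even one on ties.
function roundHalfToEven(q) {
  const floor = Math.floor(q);
  const diff = q - floor;
  if (diff < 0.5) return floor;
  if (diff > 0.5) return floor + 1;
  return floor % 2 === 0 ? floor : floor + 1;
}

// Hypothetical sketch of the conversion for bitLength 32 or 16.
function convertToFloat(x, bitLength) {
  if (Number.isNaN(x)) return NaN;
  // Math.fround rounds to binary32 with ties to even and overflows to Infinity.
  if (bitLength === 32) return Math.fround(x);
  // binary16: zeros and infinities pass through unchanged.
  if (x === 0 || !Number.isFinite(x)) return x;
  const magnitude = Math.abs(x);
  let exponent = Math.floor(Math.log2(magnitude));
  // Guard against rounding error in Math.log2 near powers of two.
  if (2 ** exponent > magnitude) exponent--;
  else if (2 ** (exponent + 1) <= magnitude) exponent++;
  // Below 2^-14 the values are subnormal and share a fixed spacing.
  exponent = Math.max(exponent, -14);
  const ulp = 2 ** (exponent - 10); // binary16 has 10 fraction bits
  const rounded = roundHalfToEven(magnitude / ulp) * ulp;
  // Reaching 2^16 corresponds to the upperBound special value.
  const result = rounded >= 2 ** 16 ? Infinity : rounded;
  return x < 0 ? -result : result;
}

console.log(convertToFloat(65504, 16)); // 65504
console.log(convertToFloat(65520, 16)); // Infinity
console.log(convertToFloat(0.1, 16)); // 0.0999755859375
```

The value 65520 lies exactly halfway between the largest finite binary16 value (65504) and 2^16, so the tie goes to 2^16 and the result is +Infinity.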
To ConvertToInt a number x given bitLength and signedness, perform the following steps. They return a number.
- If signedness is "unsigned", then:
  - Let lowerBound be 0.
  - Let upperBound be 2^bitLength - 1.
- Otherwise:
  - Let lowerBound be -(2^(bitLength - 1)).
  - Let upperBound be 2^(bitLength - 1) - 1.
- If x is -0, then set x to +0.
- If x is NaN, then return +0.
- Set x to min(max(x, lowerBound), upperBound).
- Round x to the nearest integer, choosing the even integer if it lies halfway between two, and choosing +0 rather than -0.
- Return x.
NOTE: This is based on a definition in [WEBIDL] with these differences: 64-bit integers are not treated specially, the input x is an abstract number, and clamping is always performed.
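The integer conversion can be sketched in JavaScript as follows. The function name is hypothetical, and the sketch assumes ordinary JavaScript numbers, so it is exact only for bit lengths up to 32; a 64-bit variant would need BigInt because 2^63 - 1 is not representable as a binary64 value.

```javascript
// Hypothetical sketch of the clamping integer conversion (bitLength <= 32).
function convertToInt(x, bitLength, signedness) {
  const unsigned = signedness === "unsigned";
  const lowerBound = unsigned ? 0 : -(2 ** (bitLength - 1));
  const upperBound = unsigned ? 2 ** bitLength - 1 : 2 ** (bitLength - 1) - 1;
  if (Number.isNaN(x)) return 0;
  // Clamping is always performed; infinities land on the bounds.
  const clamped = Math.min(Math.max(x, lowerBound), upperBound);
  // Round to the nearest integer, ties to even.
  const floor = Math.floor(clamped);
  const diff = clamped - floor;
  let rounded;
  if (diff < 0.5) rounded = floor;
  else if (diff > 0.5) rounded = floor + 1;
  else rounded = floor % 2 === 0 ? floor : floor + 1;
  // Adding +0 turns a -0 result into +0.
  return rounded + 0;
}

console.log(convertToInt(2.5, 8, "signed")); // 2
console.log(convertToInt(3.5, 8, "signed")); // 4
console.log(convertToInt(300, 8, "unsigned")); // 255
console.log(convertToInt(-129.7, 8, "signed")); // -128
```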
8.3. Miscellaneous
Remove this when a definition in [INFRA] is available. [whatwg/infra Issue #664]
9. Examples
constant1 ---+
             +--- Add ---> intermediateOutput1 ---+
input1    ---+                                    |
                                                  +--- Mul---> output
constant2 ---+                                    |
             +--- Add ---> intermediateOutput2 ---+
input2    ---+
The following code implements the graph:
// Use tensors in 4 dimensions.
const TENSOR_SHAPE = [1, 2, 2, 2];
const TENSOR_SIZE = 8;

const context = await navigator.ml.createContext();
const builder = new MLGraphBuilder(context);

// Create MLOperandDescriptor object.
const desc = {dataType: 'float32', shape: TENSOR_SHAPE};

// constant1 is a constant MLOperand with the value 0.5.
const constantBuffer1 = new Float32Array(TENSOR_SIZE).fill(0.5);
const constant1 = builder.constant(desc, constantBuffer1);

// input1 is one of the input MLOperands. Its value will be set before
// execution.
const input1 = builder.input('input1', desc);

// constant2 is another constant MLOperand with the value 0.5.
const constantBuffer2 = new Float32Array(TENSOR_SIZE).fill(0.5);
const constant2 = builder.constant(desc, constantBuffer2);

// input2 is another input MLOperand. Its value will be set before execution.
const input2 = builder.input('input2', desc);

// intermediateOutput1 is the output of the first Add operation.
const intermediateOutput1 = builder.add(constant1, input1);

// intermediateOutput2 is the output of the second Add operation.
const intermediateOutput2 = builder.add(constant2, input2);

// output is the output MLOperand of the Mul operation.
const output = builder.mul(intermediateOutput1, intermediateOutput2);
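For checking results, the values this graph is expected to produce can be computed with a plain JavaScript reference. The sketch below does not use WebNN at all; referenceGraph is a hypothetical helper that applies (0.5 + input1) * (0.5 + input2) element by element, and the input value 1 is chosen only for illustration.

```javascript
// Hypothetical CPU reference for the graph: (constant1 + input1) * (constant2 + input2).
function referenceGraph(input1, input2) {
  const output = new Float32Array(input1.length);
  for (let i = 0; i < input1.length; i++) {
    const intermediateOutput1 = 0.5 + input1[i]; // first Add
    const intermediateOutput2 = 0.5 + input2[i]; // second Add
    output[i] = intermediateOutput1 * intermediateOutput2; // Mul
  }
  return output;
}

const TENSOR_SIZE = 8;
const result = referenceGraph(
  new Float32Array(TENSOR_SIZE).fill(1),
  new Float32Array(TENSOR_SIZE).fill(1)
);
console.log(result[0]); // 2.25, i.e. (0.5 + 1) * (0.5 + 1)
```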
10. Operator Emulation
This section is non-normative.
Operations present in other neural network inference APIs can often be emulated using operations present in WebNN.
10.1. squeeze
The squeeze operation returns a tensor with all specified dimensions of input of size 1 removed. It can be generically implemented using the reshape() operation as follows:
function squeeze(builder, input, axes) {
  // With no axes given, consider every dimension.
  if (!axes || !axes.length) axes = input.shape.map((item, i) => i);
  const shape = Array.from(input.shape);
  // Visit axes in descending numeric order so that removing one dimension
  // does not shift the indices still to be visited.
  for (const axis of Array.from(axes).sort((a, b) => b - a)) {
    if (axis < shape.length && shape[axis] == 1) shape.splice(axis, 1);
  }
  return builder.reshape(input, shape);
}
10.2. unsqueeze
The unsqueeze operation returns a new tensor with a dimension of size one inserted at the specified position. It can be generically implemented using the reshape() operation as follows:
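function unsqueeze(builder, input, axes) {
  const shape = Array.from(input.shape);
  // Insert in ascending numeric order so each axis refers to the output shape.
  for (const axis of Array.from(axes).sort((a, b) => a - b)) {
    shape.splice(axis, 0, 1);
  }
  return builder.reshape(input, shape);
}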
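To see the shape arithmetic of squeeze and unsqueeze without a graph builder, the sketch below applies the same logic to plain shape arrays. The helper names squeezedShape and unsqueezedShape are hypothetical and exist only for this illustration.

```javascript
// Hypothetical helper: shape after removing size-1 dimensions at the given axes.
function squeezedShape(shape, axes) {
  const result = Array.from(shape);
  const targets = axes && axes.length ? Array.from(axes) : shape.map((item, i) => i);
  // Descending order keeps the remaining indices valid after each removal.
  for (const axis of targets.sort((a, b) => b - a)) {
    if (axis < result.length && result[axis] === 1) result.splice(axis, 1);
  }
  return result;
}

// Hypothetical helper: shape after inserting size-1 dimensions at the given axes.
function unsqueezedShape(shape, axes) {
  const result = Array.from(shape);
  for (const axis of Array.from(axes).sort((a, b) => a - b)) {
    result.splice(axis, 0, 1);
  }
  return result;
}

console.log(squeezedShape([1, 3, 1, 2], [0, 2])); // [ 3, 2 ]
console.log(unsqueezedShape([3, 2], [0, 2])); // [ 1, 3, 1, 2 ]
```

With the same axes, the two helpers undo each other in this example.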
10.3. flatten
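The flatten operation reshapes the input into a two-dimensional tensor: the dimensions before axis are collapsed into the first dimension and the remaining dimensions into the second. It can be generically implemented using the reshape() operation as follows: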
function flatten(builder, input, axis) {
  if (axis > input.shape.length) return input;
  // Product of the dimensions before axis.
  const before = input.shape.slice(0, axis).reduce((a, b) => a * b, 1);
  // Product of the dimensions from axis onward.
  const after = input.shape.slice(axis, input.shape.length).reduce((a, b) => a * b, 1);
  return builder.reshape(input, [before, after]);
}
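As a worked example of the shape computation in flatten, the sketch below applies the same products to a plain shape array. The helper name flattenedShape is hypothetical.

```javascript
// Hypothetical helper: the two-dimensional shape produced by flattening at axis.
function flattenedShape(shape, axis) {
  if (axis > shape.length) return Array.from(shape);
  const before = shape.slice(0, axis).reduce((a, b) => a * b, 1);
  const after = shape.slice(axis).reduce((a, b) => a * b, 1);
  return [before, after];
}

console.log(flattenedShape([2, 3, 4, 5], 2)); // [ 6, 20 ]
console.log(flattenedShape([2, 3, 4, 5], 0)); // [ 1, 120 ]
console.log(flattenedShape([2, 3, 4, 5], 4)); // [ 120, 1 ]
```

An empty slice reduces to the initial value 1, which is why axis 0 and axis equal to the rank still yield two dimensions.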
11. Appendices
11.1. MLOperandDataType and ArrayBufferView compatibility
MLOperandDataType | ArrayBufferView |
---|---|
float32 | Float32Array |
float16 | Float16Array |
int64 | BigInt64Array |
uint64 | BigUint64Array |
int32 | Int32Array |
uint32 | Uint32Array |
int8 | Int8Array |
uint8 | Uint8Array |
Float16Array is at ECMA Stage 3, signaling that its design is finished. Implementers wanting to enable this type ahead of native implementations can emulate the type by passing raw bits via Uint16Array. [Issue webnn#373]
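The compatibility table can be expressed as a lookup from data type to typed array constructor, as sketched below. The names typedArrayFor and allocateFor are hypothetical; the float16 entry falls back to Uint16Array (raw bits) when Float16Array is not available in the running engine, following the emulation approach described above.

```javascript
// Hypothetical lookup from MLOperandDataType to a compatible typed array constructor.
const typedArrayFor = {
  float32: Float32Array,
  // Assumption: fall back to raw 16-bit storage when Float16Array is missing.
  float16: typeof Float16Array !== "undefined" ? Float16Array : Uint16Array,
  int64: BigInt64Array,
  uint64: BigUint64Array,
  int32: Int32Array,
  uint32: Uint32Array,
  int8: Int8Array,
  uint8: Uint8Array,
};

// Hypothetical helper: allocate a zero-filled buffer for a tensor of the given shape.
function allocateFor(dataType, shape) {
  const TypedArray = typedArrayFor[dataType];
  if (!TypedArray) throw new TypeError(`Unknown data type: ${dataType}`);
  const elementCount = shape.reduce((a, b) => a * b, 1);
  return new TypedArray(elementCount);
}

const buffer = allocateFor("float32", [1, 2, 2, 2]);
console.log(buffer.length); // 8
```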
12. Acknowledgements
This specification follows the concepts of the Android Neural Networks API C API.
Thanks to Tomoyuki Shimizu, Ningxin Hu, Zhiqiang Yu and Belem Zhang for the use cases.
Thanks to Nikhil Thorat, Daniel Smilkov, Ganesan Ramalingam, Rafael Cintron and Benjamin Poulain for their contributions to the API specification.
Thanks to Sangwhan Moon and the W3C Technical Architecture Group for review of this specification for web architecture fit, design consistency and developer ergonomics.
Thanks to Zoltan Kis for adding algorithms and making navigating this specification a delightful experience. Thanks to Joshua Bell for aligning the specification with modern editorial conventions. Thanks to Ningxin Hu, Lisha Guo, Shiyi Zou, Mingming Xu, Junwei Fu, Bruce Dai and Bin Miao for careful review and comments.
Thanks to W3C Privacy Interest Group for privacy and security review and feedback.
Thanks to Alex Gough and the Chrome Security team for security review and questions.
Thanks to Michal Karzynski for sharing practical guidelines and learnings from ONNX.
Thanks to Kaustubha Govind and Chrome privacy reviewers for feedback and privacy considerations.
Thanks to Jiewei Qian for Chromium implementation review and feedback.
Thanks to Dwayne Robinson, Joshua Lochner and Wanming Lin for their work investigating and providing recommendations for transformer support. Additional thanks to Dwayne and Wanming for providing reviews of operator conformance and web-platform-tests implementation.
Thanks to Feng Dai for his continuous contributions that keep web-platform-tests evolving alongside the specification.
Thanks to Fuqiao Xue and the W3C Internationalization Activity for reviews and suggestions.