1. Introduction
The Web Neural Network API defines a web-friendly hardware-agnostic abstraction layer that makes use of Machine Learning capabilities of operating systems and underlying hardware platforms without being tied to platform-specific capabilities. The abstraction layer addresses the requirements of key Machine Learning JavaScript frameworks and also allows web developers familiar with the ML domain to write custom code without the help of libraries. A complementary Model Loader API defines a higher-level abstraction targeting primarily web developers.
For an illustrated introduction, please see the explainer.
2. Use cases
2.1. Application Use Cases
This section illustrates application-level use cases for neural network inference hardware acceleration. All applications in those use cases can be built on top of pre-trained deep neural network (DNN) [models] .
Note: Please be aware that some of the use cases described here are, by their very nature, privacy-invasive. Developers who are planning to use the API for such use cases should ensure that the API is being used to benefit users, for purposes that users understand and approve. They should apply the Ethical Principles for Web Machine Learning [webmachinelearning-ethics] and implement appropriate privacy risk mitigations such as transparency, data minimisation, and user controls.
2.1.1. Person Detection
A user opens a web-based video conferencing application, but she temporarily leaves her room. The application checks whether she is in front of her PC by using object detection (for example, approaches such as [SSD] or [YOLO] that use a single DNN) to detect regions in a camera input frame that include persons.
When she comes back, the application automatically detects her and notifies other online users that she is active now.
2.1.2. Semantic Segmentation
A user joins a teleconference via a web-based video conferencing application at her desk since no meeting room in her office is available. During the teleconference, she does not want her room and the people in the background to be visible. To protect the privacy of the other people and the surroundings, the application runs a machine learning model such as [DeepLabv3+] or [MaskR-CNN] to semantically split an image into segments and replaces segments that represent other people and background with another picture.
2.1.3. Skeleton Detection
A web-based video conferencing application tracks the pose of a user’s skeleton by running a machine learning model that allows for real-time human pose estimation, such as [PoseNet], to recognize her gestures and body language. When she raises her hand, her microphone is automatically unmuted and she can start speaking on the teleconference.
2.1.4. Face Recognition
There are multiple people in the conference room and they join an online meeting using a web-based video conferencing application. The application detects faces of participants by using object detection (for example, using object detection approaches such as [SSD] ) and checks whether each face was present at the previous meeting or not by running a machine learning model such as [FaceNet] , which verifies whether two faces would be identical or not.
2.1.5. Facial Landmark Detection
A user wants to find new glasses that fit her beautifully on an online glasses store. The online store offers a web-based try-on simulator that runs a machine learning model such as Face Alignment Network [FAN] to detect facial landmarks like eyes, nose, mouth, etc. When she chooses a pair of glasses, the simulator properly renders the selected glasses on the detected position of the eyes on her facial image.
2.1.6. Style Transfer
A user is looking for cosmetics on an online store and wondering which color may fit her face. The online store shows sample facial makeup images of cosmetics, and offers a makeup simulator that runs a machine learning model like [ContextualLoss] or [PairedCycleGAN] to transfer the makeup style of the sample makeup image to her facial image. She can use the simulator to check how the selected makeup looks on her face.
2.1.7. Super Resolution
A web-based video conferencing application is receiving a video stream from its peer, but the resolution of the video becomes lower due to network congestion. To prevent degradation of the perceived video quality, the application runs a machine learning model for super-resolution such as [SRGAN] to generate higher-resolution video frames.
2.1.8. Image Captioning
For better accessibility, a web-based presentation application provides automatic image captioning by running a machine learning model such as [im2txt] which predicts explanatory words of the presentation slides.
2.1.9. Machine Translation
Multiple people from various countries are talking via a web-based real-time text chat application. The application translates their conversation by using a machine learning model such as [GNMT] or [OpenNMT], which translates every text into a different language.
2.1.10. Emotion Analysis
A user is talking to her friend via a web-based real-time text chat application, and she is wondering how the friend feels because she cannot see the friend’s face. The application analyses the friend’s emotion by using a machine learning model such as [DeepMoji] , which infers emotion from input texts, and displays an emoji that represents the estimated emotion.
2.1.11. Video Summarization
A web-based video conferencing application records received video streams, and it needs to reduce the amount of recorded video data to be stored. The application generates a short version of the recorded video by using a machine learning model for video summarization such as [Video-Summarization-with-LSTM].
2.1.12. Noise Suppression
A web-based video conferencing application records received audio streams, but background noise is usually present. The application leverages real-time noise suppression using a Recurrent Neural Network such as [RNNoise] to suppress dynamic background noise, like a baby crying or a dog barking, and improve the audio experience in video conferences.
2.1.13. Detecting fake video
A user is exposed to realistic fake videos generated by ‘deepfake’ on the web. The fake video can swap the speaker’s face into the president’s face to incite a user politically or to manipulate a user’s opinion. Deepfake detection applications such as [FaceForensics++] analyze the videos and protect a user against the fake videos or images. When she watches a fake video on the web, the detection application alerts her to the fraudulent video in real time.
2.2. Framework Use Cases
This section collects framework-level use cases for a dedicated low-level API for neural network inference hardware acceleration. It is expected that Machine Learning frameworks will be key consumers of the Web Neural Network API (WebNN API) and the low-level details exposed through the WebNN API are abstracted out from typical web developers. However, it is also expected that web developers with specific interest and competence in Machine Learning will want to interface with the WebNN API directly instead of a higher-level ML framework.
2.2.1. Custom Layer
A web application developer wants to run a DNN model on the WebNN API. However, she has found that some activation functions, such as [LeakyReLU] and [ELU], are not included in the WebNN API. To address this issue, she constructs custom layers for the additional activation functions on top of the WebNN API. Note that the scope of custom layers may include convolution, normalization, etc. as well as activation.
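As a rough illustration of the custom-layer idea, the sketch below decomposes LeakyReLU into element-wise primitives (max, min, multiply, add). It works on plain JavaScript arrays rather than on WebNN operands, and the helper names `elementwise`, `scale`, and `leakyRelu` are hypothetical, not part of the WebNN API. A real custom layer would compose the corresponding graph-building operations instead.

```javascript
// Hypothetical stand-ins for element-wise primitives; plain arrays play the role of tensors.
const elementwise = (f) => (a, b) => a.map((v, i) => f(v, b[i]));
const max = elementwise(Math.max);
const min = elementwise(Math.min);
const add = elementwise((x, y) => x + y);
const scale = (a, s) => a.map((v) => v * s);

// LeakyReLU(x) = max(x, 0) + alpha * min(x, 0), built only from the primitives above.
function leakyRelu(x, alpha = 0.01) {
  const zeros = x.map(() => 0);
  return add(max(x, zeros), scale(min(x, zeros), alpha));
}

console.log(leakyRelu([-2, 0, 3], 0.1)); // [ -0.2, 0, 3 ]
```

The same decomposition pattern applies to other activations that can be written in terms of existing element-wise operations.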
2.2.2. Network Concatenation
A web application uses a DNN model, and its model data of upper convolutional layers and lower fully-connected layers are stored in separate files, since model data of the fully-connected layers are periodically updated due to fine tuning at the server side.
Therefore, the application downloads both partial model files at first and concatenates them into a single model. When the model is updated, the application downloads the fine-tuned part of the model and replaces only the fully-connected layers with it.
2.2.3. Performance Adaptation
A web application developer has a concern about the performance of her DNN model on mobile devices. She has confirmed that it may run too slowly on mobile devices that do not have GPU acceleration. To address this issue, her web application queries the WebNN API to confirm whether acceleration is available, so that the application can display a warning on devices without acceleration.
After several weeks, she has developed a tiny DNN model that can even run on CPU. In order to accommodate CPU execution, she modifies the application so that the application loads the tiny model in the case of CPU-only devices.
2.2.4. Operation Level Execution
A JavaScript ML framework is responsible for loading, interpreting and executing an ML model. During the model execution phase, the framework iterates through the operations of the model and executes each operation on a hardware device such as a CPU, GPU or ML accelerator. To avoid unnecessary data copying across devices, the framework selects the same device to execute the operations. For a compute-intensive operation, such as 2D convolution or matrix multiplication, the framework uses the WebNN API to execute it with the ML-specific acceleration available on the selected device.
2.2.5. Integration with real-time video processing
The user experience of WebRTC-based video conferencing is enhanced using real-time video processing. For example, background blur implemented using a § 2.1.2 Semantic Segmentation model blurs the background in the user’s live camera feed. To satisfy the performance requirements of this use case, the WebNN API integrates with primitives from other Web APIs that make up the media pipeline to allow WebNN API-based transformation of real-time video streams.
3. Security Considerations
This specification defines a low-level API for neural network inference hardware acceleration. This API is considered a powerful feature [POWERFUL-FEATURES] because it grants low-level access to a user’s computer. To meet the authentication and confidentiality expectations of a powerful feature and to prevent man-in-the-middle attacks, all interfaces defined by this specification are only available in a secure context. This API is disabled by default in all cross-origin frames using the § 7.2.1 Permissions Policy Integration. This prevents third-party content from using this API unless the embedding page explicitly sets a policy that grants permission.
This API allows creation of an MLContext from a GPUDevice defined by WebGPU specification. See WebGPU Security Considerations for more information regarding security characteristics of this context.
Once the graph is fully constructed and compiled, the input shapes into each of the operations in the graph are inferred and finalized. The bounds checking occurs when the compute method is invoked that executes the graph against the actual data. No actual data is bound to the compiled graph before this stage. It is the implementation’s responsibility to make sure proper bounds checking occurs against the shapes of the data already inferred by that time.
Document operations susceptible to out-of-bounds access as a guidance to implementers.
As a future-proofing measure, the API design allows certain operations that can be generically emulated to be deprecated for security, performance, or other reasons without breaking compatibility. This is made possible by high-level functions that are defined in terms of smaller primitive operations defined in this specification. This enables a native implementation of a high-level function to be replaced with a polyfill implementation.
Investigate side channel attack feasibility considering the current state where CPU is shared between processes running renderers.
In order to not allow an attacker to target a specific implementation that may contain a flaw, the § 6.2 Device Selection mechanism is a hint only, and the concrete device selection is left to the implementation - a user agent could for instance choose never to run a model on a device with known vulnerabilities. As a further mitigation, no device enumeration mechanism is defined.
Hinting partially mitigates the concern. Investigate additional mitigations.
The API design minimizes the attack surface for the compiled computational graph. The MLGraphBuilder interface that hosts the various operations is a data definition API and as such doesn’t execute anything, only constructs data. What follows, is that the potential for an attack is limited to when binding the data to the graph before executing it by invoking the MLContext.compute() method. This enables implementers to focus on hardening the MLContext.compute() method. For example, by making sure it honors the boundary of data and fails appropriately when the bounds are not respected.
Purpose-built Web APIs for measuring high-resolution time mitigate against timing attacks using techniques such as resolution reduction, adding jitter, detection of abuse and API call throttling [hr-time-3]. The practical deployment of WebNN implementations is likely to bring enough jitter to make timing attacks impractical (e.g. because they would use IPC) but implementers are advised to consider and test their implementations against timing attacks.
3.1. Guidelines for new operations
To ensure operations defined in this specification are shaped in a way they can be implemented securely, this section includes guidelines on how operations are expected to be defined to reduce potential for implementation problems. These guidelines are expected to evolve over time to align with industry best practices:
- Prefer simplicity of arguments
- Don’t use parsers for complex data formats
- If an operation can be decomposed to low level primitives:
  - Add an informative emulation path
  - Prefer primitives over new high level operations but consider performance consequences
- Operations should follow a consistent style for inputs and attributes
- Operation families such as pooling and reduction should share API shape and options
- Formalize failure cases into test cases whenever possible
- When in doubt, leave it out: API surface should be as small as possible required to satisfy the use cases, but no smaller
- Try to keep the API free of implementation details that might inhibit future evolution, do not overspecify
- Fail fast: the sooner the web developer is informed of an issue, the better
In general, always consider the security and privacy implications as documented in [security-privacy-questionnaire] by the Technical Architecture Group and the Privacy Interest Group when adding new features.
4. Privacy Considerations
This API enhances privacy compared to cloud-based inference, since input data such as locally sourced images or video streams stay within the browser’s sandbox.
This API exposes the minimum amount of information necessary to address the identified § 2 Use cases for the best performance and reliability of results.
No information from the underlying platform is exposed directly. An execution time analysis may reveal indirectly the performance of the underlying platform’s neural network hardware acceleration capabilities relative to another underlying platform.
Note: The group is soliciting further input on the proposed execution time analysis fingerprinting vector and will augment this section with more information and mitigations to inform the implementers of this API.
Unlike WebGPU, this API does not intrinsically support custom shader authoring; and as a result is not prone to timing attacks that rely on shader caches, or other persistent data. The API builds upon pre-existing shaders and lower level primitives of the browser or the underlying OS. Web developers who interface with GPUDevice are expected to be aware of WebGPU compilation cache considerations.
The WebGPU API identifies machine-specific artifacts as a privacy consideration. Given the WebNN API defines means to record an ML workload onto a WebGPU-compatible GPUCommandBuffer, compute unit scheduling may under certain circumstances introduce a fingerprint. However, similarly to WebGPU, such fingerprints are identical across most or all of the devices of each vendor, mitigating the concern. Furthermore, software implementations can be used to further eliminate such artifacts.
The WebNN API defines two developer-settable preferences to help inform § 6.2 Device Selection and allow the implementation to better select the most appropriate underlying execution device for the workload. Device type normatively indicates the kind of device and is either "cpu" or "gpu". If this type cannot be satisfied, an "OperationError" DOMException is thrown, thus this type can in some cases add two bits of entropy to the fingerprint. Power preference indicates preference as related to the power consumption and is considered a hint only and as such does not increase entropy of the fingerprint.
If a future version of this specification introduces support for a new device type that can only support a subset of MLOperandTypes, that may introduce a new fingerprint.
In general, implementers of this API are expected to apply WebGPU Privacy Considerations to their implementations where applicable.
5. Ethical Considerations
The Working Group has started documenting ethical issues associated with using Machine Learning on the Web, to help identify what mitigations its normative specifications should take into account. The Working Group publishes and maintains an Ethical Principles for Web Machine Learning document [webmachinelearning-ethics] open to contributions from the wider community via a dedicated GitHub repository.
6. Programming Model
6.1. Overview
At the heart of neural networks is a computational graph of mathematical operations. These operations are the building blocks of modern machine learning technologies in computer vision, natural language processing, and robotics. The WebNN API is a specification for constructing, compiling, and executing computational graphs of neural networks.
The MLGraph interface represents a compiled computational graph that is immutable (that is, a model).
The MLGraphBuilder interface serves as a builder (factory) to create an MLGraph. An MLOperand is a representation of data that flows within the computational graph, which includes input values for inference, constants (including trained weights) used for inference, intermediate values (often referred to as activations) computed during inference, as well as the output values of inference. At inference time, every MLOperand will be bound to a tensor (the actual data).
The MLGraphBuilder interface enables the creation of MLOperands. A key part of the MLGraphBuilder interface is its set of operations (such as MLGraphBuilder.gemm() and MLGraphBuilder.softmax()). The operations have a functional semantics, with no side effects. Each operation invocation conceptually returns a distinct new value, without changing the value of any other MLOperand.
The runtime values (of MLOperands) are tensors, which are essentially multidimensional arrays. The representation of the tensors is implementation dependent, but it typically includes the array data stored in some buffer (memory) and some metadata describing the array data (such as its shape).
As mentioned above, the operations have a functional semantics. This allows the implementation to potentially share the array data between multiple tensors. For example, the implementation of operations such as reshape, or slice, or squeeze may return a view of its input tensor that shares the same buffer as the input tensor. (In the case of reshape or squeeze, the entire data is shared, while in the case of slice, a part of the input data is shared.) The implementation may use views, as above, for intermediate values.
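The buffer sharing described above can be sketched with typed arrays. This is only an illustration of the idea, assuming a hypothetical tensor representation (`{ data, shape }`) and hypothetical helpers `makeTensor`, `reshape`, and `sliceRows`; it does not describe how any particular WebNN implementation stores tensors.

```javascript
// A hypothetical tensor: a typed-array buffer plus shape metadata.
function makeTensor(data, shape) {
  return { data, shape };
}

// reshape: new metadata, same underlying data (the entire buffer is shared).
function reshape(tensor, newShape) {
  const count = (shape) => shape.reduce((n, d) => n * d, 1);
  if (count(newShape) !== count(tensor.shape)) {
    throw new RangeError("reshape must preserve the element count");
  }
  return makeTensor(tensor.data, newShape);
}

// slice along the first axis: subarray() creates a view over part of the same buffer.
function sliceRows(tensor, start, end) {
  const rowSize = tensor.shape.slice(1).reduce((n, d) => n * d, 1);
  const view = tensor.data.subarray(start * rowSize, end * rowSize);
  return makeTensor(view, [end - start, ...tensor.shape.slice(1)]);
}

const input = makeTensor(new Float32Array([1, 2, 3, 4, 5, 6]), [2, 3]);
const flat = reshape(input, [6]);
const lastRow = sliceRows(input, 1, 2);

console.log(flat.data.buffer === input.data.buffer);    // true
console.log(lastRow.data.buffer === input.data.buffer); // true
console.log(Array.from(lastRow.data));                  // [ 4, 5, 6 ]
```

Sharing is safe here only because the operations never write to their inputs, which is exactly what the functional semantics guarantees.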
Before the execution, the computation graph that is used to compute one or more specified outputs needs to be compiled and optimized. The key purpose of the compilation step is to enable optimizations that span two or more operations, such as operation or loop fusion.
There are multiple ways by which the graph may be compiled. The MLGraphBuilder.build() method compiles the graph in the background without blocking the calling thread, and returns a Promise that resolves to an MLGraph. The MLGraphBuilder.buildSync() method compiles the graph immediately on the calling thread, which must be a worker thread running on CPU or GPU device, and returns an MLGraph. Both compilation methods produce an MLGraph that represents a compiled graph for optimal execution.
Once the MLGraph is constructed, there are multiple ways by which the graph may be executed. The MLContext.computeSync() method represents a way the execution of the graph is carried out immediately on the calling thread, which must also be a worker thread, either on a CPU or GPU device. The execution produces the results of the computation from all the inputs bound to the graph.
The MLContext.compute() method represents a way the execution of the graph is performed asynchronously either on a parallel timeline in a separate worker thread for the CPU execution or on a GPU timeline in a GPU command queue. This method returns immediately without blocking the calling thread while the actual execution is offloaded to a different timeline. This type of execution is appropriate when the responsiveness of the calling thread is critical to good user experience. The computation results will be placed at the bound outputs at the time the operation is successfully completed on the offloaded timeline at which time the calling thread is signaled. This type of execution supports both the CPU and GPU device.
In both the MLContext.compute() and MLContext.computeSync() execution methods, the caller supplies the input values using MLNamedArrayBufferViews, binding the input MLOperands to their values. The caller then supplies pre-allocated buffers for output MLOperands using MLNamedArrayBufferViews.
The MLCommandEncoder interface created by the MLContext.createCommandEncoder() method supports a graph execution method that provides the maximum flexibility to callers that also utilize WebGPU in their application. It does this by placing the workload required to initialize and compute the results of the operations in the graph onto a GPUCommandBuffer. The callers are responsible for the eventual submission of this workload on the GPUQueue through the WebGPU queue submission mechanism. Once the submitted workload is completely executed, the result is available in the bound output buffers.
6.2. Device Selection
An MLContext interface represents a global state of neural network execution. One of the important context states is the underlying execution device that manages the resources and facilitates the compilation and the eventual execution of the neural network graph. In addition to the default method of creation with MLContextOptions, an MLContext could also be created from a specific GPUDevice that is already in use by the application, in which case the corresponding GPUBuffer resources used as graph constants, as well as the GPUTexture as graph inputs must also be created from the same device. In a multi-adapter configuration, the device used for MLContext must be created from the same adapter as the device used to allocate the resources referenced in the graph.
In a situation when a GPU context executes a graph with a constant or an input in the system memory as an ArrayBufferView, the input content is automatically uploaded from the system memory to the GPU memory, and downloaded back to the system memory of an ArrayBufferView output buffer at the end of the graph execution. These data upload and download cycles occur only when the execution device requires the data to be copied out of and back into the system memory, such as in the case of the GPU. They do not occur when the device is a CPU device. Additionally, the result of the graph execution is in a known layout format. While the execution may be optimized for a native memory access pattern in an intermediate result within the graph, the output of the last operation of the graph must convert the content back to a known layout format at the end of the graph in order to maintain the expected behavior from the caller’s perspective.
When an MLContext is created with MLContextOptions, the user agent selects and creates the underlying execution device by taking into account the application’s power preference and device type specified in the MLPowerPreference and MLDeviceType options.
The following table summarizes the types of resources supported by a context, depending on the method used to create it:
| Creation method | ArrayBufferView | GPUBuffer | GPUTexture |
|---|---|---|---|
| MLContextOptions | Yes | No | No |
| GPUDevice | Yes | Yes | Yes |
7. API
7.1. The navigator.ml interface
An ML object is available in the Window and DedicatedWorkerGlobalScope contexts through the Navigator and WorkerNavigator interfaces respectively and is exposed via navigator.ml.
interface mixin NavigatorML {
  [SecureContext, SameObject] readonly attribute ML ml;
};
Navigator includes NavigatorML;
WorkerNavigator includes NavigatorML;
7.2. The ML interface
enum MLDeviceType {
  "cpu",
  "gpu"
};

enum MLPowerPreference {
  "default",
  "high-performance",
  "low-power"
};

dictionary MLContextOptions {
  MLDeviceType deviceType = "cpu";
  MLPowerPreference powerPreference = "default";
};

[SecureContext, Exposed=(Window, DedicatedWorker)]
interface ML {
  Promise<MLContext> createContext(optional MLContextOptions options = {});
  Promise<MLContext> createContext(GPUDevice gpuDevice);

  [Exposed=(DedicatedWorker)]
  MLContext createContextSync(optional MLContextOptions options = {});
  [Exposed=(DedicatedWorker)]
  MLContext createContextSync(GPUDevice gpuDevice);
};
7.2.1. Permissions Policy Integration
This specification defines a policy-controlled feature identified by the string "webnn". Its default allowlist is 'self'.
7.2.2. The createContext() method
The createContext() method steps are:
- If this's relevant global object's associated Document is not allowed to use the webnn feature, return a new promise rejected with a "SecurityError" DOMException and abort these steps.
- Let promise be a new promise.
- Return promise and run the following steps in parallel.
- Let options be the first argument.
- Run the create context steps given options:
  - Let context be a new MLContext object.
  - If options is a GPUDevice object,
    - Set context.[[contextType]] to "webgpu".
    - Set context.[[deviceType]] to "gpu".
    - Set context.[[powerPreference]] to "default".
  - Otherwise,
    - Set context.[[contextType]] to "default".
    - If options["deviceType"] exists, then set context.[[deviceType]] to options["deviceType"]. Otherwise, set context.[[deviceType]] to "cpu".
    - If options["powerPreference"] exists, then set context.[[powerPreference]] to options["powerPreference"]. Otherwise, set context.[[powerPreference]] to "default".
- If the validate MLContext steps given context return false, reject promise with a "NotSupportedError" DOMException and abort these steps.
- Resolve promise with context.
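The create context steps above can be sketched as a pure function over plain objects. This is an illustration only: the internal slots are modelled as ordinary properties, the function name `createContextSteps` is hypothetical, and `isGPUDevice` is an assumed predicate supplied by the caller (in a browser with WebGPU it could be `(o) => o instanceof GPUDevice`). The permission check, promise handling, and the validate MLContext steps are left out.

```javascript
// Hypothetical model of an MLContext's internal slots as a plain object.
// `isGPUDevice` is an assumed predicate, injected so the sketch runs outside a browser.
function createContextSteps(options = {}, isGPUDevice = () => false) {
  if (isGPUDevice(options)) {
    // Created from a GPUDevice: the device type and power preference are fixed.
    return { contextType: "webgpu", deviceType: "gpu", powerPreference: "default" };
  }
  // Created from MLContextOptions: absent members fall back to their defaults.
  return {
    contextType: "default",
    deviceType: options.deviceType !== undefined ? options.deviceType : "cpu",
    powerPreference:
      options.powerPreference !== undefined ? options.powerPreference : "default",
  };
}

console.log(createContextSteps({ deviceType: "gpu" }));
// { contextType: 'default', deviceType: 'gpu', powerPreference: 'default' }
```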
7.2.3. The createContextSync() method
The createContextSync() method steps are:
- If this's relevant global object's associated Document is not allowed to use the webnn feature, throw a "SecurityError" DOMException and abort these steps.
- Let options be the first argument.
- Let context be the result of running the create context steps given options.
- If the validate MLContext steps given context return false, throw a "NotSupportedError" DOMException and abort these steps.
- Return context.
7.3. The MLGraph interface
The MLGraph interface represents a compiled computational graph. A compiled graph once constructed is immutable and cannot be subsequently changed.
[SecureContext, Exposed=(Window, DedicatedWorker)]
interface MLGraph {};
MLGraph has the following internal slots:
- [[context]] of type MLContext
  The context of type MLContext associated with this MLGraph.
- [[inputDescriptors]] of type record<DOMString, MLOperandDescriptor>
  Maps the name of an input MLOperand to its MLOperandDescriptor for all input MLOperands of this MLGraph.
- [[outputDescriptors]] of type record<DOMString, MLOperandDescriptor>
  Maps the name of an output MLOperand to its MLOperandDescriptor for all output MLOperands of this MLGraph.
- [[implementation]]
  The underlying implementation provided by the User Agent.
7.3.1. The MLOperandDescriptor dictionary
enum MLInputOperandLayout {
  "nchw",
  "nhwc"
};

enum MLOperandType {
  "float32",
  "float16",
  "int32",
  "uint32",
  "int8",
  "uint8"
};

dictionary MLOperandDescriptor {
  // The operand type.
  required MLOperandType type;

  // The dimensions field is only required for tensor operands.
  sequence<unsigned long> dimensions;
};
The byte length of an MLOperandDescriptor desc is the value returned by the following steps:
- Let elementLength be 1.
- For each dimension of desc.dimensions:
  - Set elementLength to elementLength × dimension.
- Let elementSize be the element size of one of the ArrayBufferView types that matches desc.type according to this table.
- Return elementLength × elementSize.
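The byte length steps above can be sketched as follows. The `ELEMENT_SIZE` map and the `byteLength` function name are hypothetical; the sizes are the standard widths of the listed MLOperandType values. Treating an absent `dimensions` member as a scalar (one element) is an assumption of this sketch.

```javascript
// Bytes per element for each MLOperandType value.
const ELEMENT_SIZE = { float32: 4, float16: 2, int32: 4, uint32: 4, int8: 1, uint8: 1 };

function byteLength(desc) {
  // Multiply all dimensions together to get the element count.
  let elementLength = 1;
  // Assumption: a descriptor without dimensions describes a scalar.
  for (const dimension of desc.dimensions ?? []) {
    elementLength *= dimension;
  }
  const elementSize = ELEMENT_SIZE[desc.type];
  if (elementSize === undefined) {
    throw new TypeError(`unknown operand type: ${desc.type}`);
  }
  return elementLength * elementSize;
}

console.log(byteLength({ type: "float32", dimensions: [2, 3] })); // 24
console.log(byteLength({ type: "float16" }));                     // 2
```

A caller could use such a helper to size a pre-allocated output buffer before execution.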
7.3.2. The MLOperand interface
An MLOperand represents an intermediary graph being constructed as a result of compositing parts of an operation into a fully composed operation. For instance, an MLOperand may represent a constant feeding to an operation or the result from combining multiple constants together into an operation. See also § 6 Programming Model.
[SecureContext, Exposed=(Window, DedicatedWorker)]
interface MLOperand {};
MLOperand has the following internal slots:
- [[builder]] of type MLGraphBuilder
  The MLOperand's associated builder object.
- [[descriptor]] of type MLOperandDescriptor
  The MLOperand's descriptor.
- [[name]] of type string
  The MLOperand's name (only for input operands).
- [[operand]] of type object
  Reference to MLOperand's corresponding implementation-defined platform operand object.
- [[operator]] of type object
  Reference to MLOperand's corresponding implementation-defined platform operator object.
To get the rank of an MLOperand operand, run the following steps:
- Return the size of operand.[[descriptor]].dimensions.
Since the [[builder]] object is bound by the MLGraphBuilder() constructor to an MLContext object, an MLOperand is also always bound to the same MLContext object.
7.3.2.1. Creating MLOperand
The MLOperand objects are created by the methods of MLGraphBuilder, internally using the following algorithms.
To create MLOperand given builder and desc, run the following steps:
- If builder is not an instance of MLGraphBuilder, then throw a "TypeError" DOMException and stop.
- If desc is not an object that implements MLOperandDescriptor, then throw a "TypeError" DOMException and stop.
- Let operand be a new object.
- Set operand.[[builder]] to builder.
- Set operand.[[descriptor]] to desc.
- Return operand.
To copy MLOperand given operand, run the following steps:
- If operand is not an instance of MLOperand, then throw a "TypeError" and stop.
- Let result be a new object.
- Set result.[[builder]] to operand.[[builder]].
- Set result.[[descriptor]] to operand.[[descriptor]].
- If operand.[[name]] exists, then set result.[[name]] to operand.[[name]].
- Return result.
To check dimensions given dimensions and type, run the following steps:
- If dimensions is not an array of positive numbers, return false.
- If dimensions.length is 0, return false.
- If dimensions.length is too large to be supported by the implementation, return false.
- If any element of dimensions is not a positive number, or it is too large to be supported by the implementation given type, return false.
- Return true.
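The check dimensions steps can be sketched in plain JavaScript as follows. This is only an illustration: the limits MAX_RANK and MAX_BYTE_LENGTH and the BYTES_PER_ELEMENT table are hypothetical stand-ins for "too large to be supported by the implementation", and the sketch assumes dimensions must be positive integers, which is stricter than the "positive number" wording above.

```javascript
// Hypothetical implementation limits; a real user agent would choose its own.
const MAX_RANK = 8;
const MAX_BYTE_LENGTH = 2 ** 31;

// Assumed element sizes in bytes for each operand type.
const BYTES_PER_ELEMENT = {
  float32: 4, float16: 2, int32: 4, uint32: 4, int8: 1, uint8: 1,
};

function checkDimensions(dimensions, type) {
  if (!Array.isArray(dimensions)) return false;
  if (dimensions.length === 0) return false;
  if (dimensions.length > MAX_RANK) return false;
  const elementSize = BYTES_PER_ELEMENT[type];
  if (elementSize === undefined) return false; // unknown type (assumption)
  let byteLength = elementSize;
  for (const dim of dimensions) {
    // Each dimension must be a positive integer in this sketch.
    if (!Number.isInteger(dim) || dim <= 0) return false;
    byteLength *= dim;
    if (byteLength > MAX_BYTE_LENGTH) return false;
  }
  return true;
}
```

For instance, checkDimensions([3, 4], 'float32') passes, while an empty array or a shape containing 0 is rejected.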
To validate MLOperand given operand and builder, run the following steps:
- If operand.[[builder]] is not an instance of MLGraphBuilder, return false.
- If builder is not undefined and is not equal to operand.[[builder]], return false.
- Let desc be operand.[[descriptor]].
- If desc is not an object that implements MLOperandDescriptor, return false.
- If desc.dimensions exists and invoking check dimensions given desc.dimensions and desc.type returns false, then return false.
- Return true.
7.3.3. The MLActivation interface
Objects implementing the MLActivation interface represent activation function types.
[SecureContext, Exposed=(Window, DedicatedWorker)]
interface MLActivation {};
MLActivation has the following internal slots:
- [[name]] of type string: The MLActivation's name.
- [[builder]] of type MLGraphBuilder: The graph builder object this MLActivation belongs to.
- [[options]] of type object: A dictionary containing MLActivation options.
- [[operator]] of type object: Reference to MLActivation's corresponding implementation-defined platform operator object.
7.3.3.1. Creating MLActivation
MLActivation objects are created by the methods of MLGraphBuilder and are identified by their name. The options dictionary is defined by those methods. The actual creation of the activation function e.g. a
To create MLActivation given builder, name, options and init-steps, run the following steps:
- If builder is not an instance of MLGraphBuilder, throw a "TypeError" and abort these steps.
- If name is undefined or null, throw a "TypeError" and abort these steps.
- Let activation be a new object.
- Set activation.[[builder]] to builder.
- Set activation.[[name]] to name.
- If options is an object, set activation.[[options]] to options.
- If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
  - Make a request to the underlying platform to:
    - Create an implementation-defined platform operator opImpl for the given name operation.
    - Store a reference of opImpl in activation.[[operator]].
  - If init-steps are defined, run init-steps with options.
  - Otherwise, initialize activation.[[operator]] given options in an implementation-defined way for the given name operation.
- Return activation.
7.4. The MLContext interface
The MLContext interface represents a global state of neural network compute workload and execution processes. Each MLContext object has associated context type, device type and power preference.
The context type is the type of the execution context that manages the resources and facilitates the compilation and execution of the neural network graph:
- "default": Context created per user preference options.
- "webgpu": Context created from WebGPU device.
The device type indicates the kind of device used for the context. It is one of the following:
- "cpu": Provides the broadest compatibility and usability across all client devices with varying degrees of performance.
- "gpu": Provides the broadest range of achievable performance across graphics hardware platforms from consumer devices to professional workstations.
The power preference indicates preference as related to power consumption. It is one of the following:
- "default": Let the user agent select the most suitable behavior.
- "high-performance": Prioritizes execution speed over power consumption.
- "low-power": Prioritizes power consumption over other considerations such as execution speed.
typedef record<DOMString, ArrayBufferView> MLNamedArrayBufferViews;

[SecureContext, Exposed=(Window, DedicatedWorker)]
interface MLContext {};
MLContext has the following internal slots:
- [[contextType]] of type context type: The MLContext's context type.
- [[deviceType]] of type device type: The MLContext's device type.
- [[powerPreference]] of type power preference: The MLContext's power preference.
When the [[contextType]] is set to default with the MLContextOptions.deviceType set to gpu, the user agent is responsible for creating an internal GPU device that operates within the context and is capable of ML workload submission on behalf of the calling application. In this setting, however, only ArrayBufferView inputs and outputs are allowed in and out of the graph execution, since the application has no way to know what type of internal GPU device is being created on its behalf. In this case, the user agent is responsible for automatic uploads and downloads of the inputs and outputs to and from the GPU memory using this internal device.
7.4.1. The MLContext validation algorithm
To validate MLContext, given context, run these steps:
- If context.[[contextType]] is not "webgpu" or "default", return false.
- If context.[[deviceType]] is not "cpu" or "gpu", return false.
- If context.[[powerPreference]] is not "default" or "high-performance" or "low-power", return false.
- If the user agent cannot support context.[[contextType]], context.[[deviceType]] and context.[[powerPreference]], return false.
- Return true.
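As a rough illustration, the validation steps map onto a small predicate. The sketch below models the internal slots as plain object properties and takes a hypothetical isSupported callback in place of the user agent's capability check; neither is part of the API.

```javascript
const CONTEXT_TYPES = ['default', 'webgpu'];
const DEVICE_TYPES = ['cpu', 'gpu'];
const POWER_PREFERENCES = ['default', 'high-performance', 'low-power'];

// `context` is a plain object standing in for an MLContext's internal slots.
// `isSupported` is a hypothetical hook for the user agent's capability check.
function validateContext(context, isSupported = () => true) {
  const { contextType, deviceType, powerPreference } = context;
  if (!CONTEXT_TYPES.includes(contextType)) return false;
  if (!DEVICE_TYPES.includes(deviceType)) return false;
  if (!POWER_PREFERENCES.includes(powerPreference)) return false;
  if (!isSupported(contextType, deviceType, powerPreference)) return false;
  return true;
}
```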
7.4.2. Synchronous Execution
Synchronously carries out the computational workload of a compiled graph MLGraph on the calling thread, which must be a worker thread, to produce results as defined by the operations in the graph. This method of execution requires an MLContext created with MLContextOptions. Otherwise, it throws an "OperationError" DOMException.
partial interface MLContext {
  [Exposed=(DedicatedWorker)]
  undefined computeSync(MLGraph graph, MLNamedArrayBufferViews inputs, MLNamedArrayBufferViews outputs);
};
- graph: an MLGraph. The compiled graph to be executed.
- inputs: an MLNamedArrayBufferViews. The resources of inputs.
- outputs: an MLNamedArrayBufferViews. The pre-allocated resources of required outputs.
Returns: undefined.
The computeSync(graph, inputs, outputs) method steps are:
- If graph.[[context]].[[contextType]] is not "default", throw an "OperationError" DOMException and stop.
- If invoking the validate graph resources algorithm given inputs and graph.[[inputDescriptors]] returns false, then throw a "DataError" DOMException and stop.
- If invoking the validate graph resources algorithm given outputs and graph.[[outputDescriptors]] returns false, then throw a "DataError" DOMException and stop.
- Invoke execute graph given graph, inputs and outputs.
  - If that throws an error, re-throw the error and stop.
- Return undefined.
To validate graph resources, given resources and descriptors, run the following steps:
- Assert: the type of resources is MLNamedArrayBufferViews.
- For each record <key, value> of resources:
  - If descriptors[key] does not exist, return false.
  - Assert: the type of value is ArrayBufferView.
  - If running the validate buffer with descriptor steps given value and descriptors[key] returns false, return false.
- Return true.
To validate buffer with descriptor given bufferView and descriptor, run the following steps:
- If bufferView is not an MLBufferView, return false.
- If bufferView's element type does not match descriptor.type according to this table, return false.
- If bufferView.[[ByteLength]] is not equal to the byte length of descriptor, return false.
- Return true.
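The two validation algorithms above can be approximated in plain JavaScript. This is a sketch under assumptions: the VIEW_FOR_TYPE mapping is an assumed stand-in for the table the steps refer to (float16 is left out), only ArrayBufferView inputs are handled (not GPU-backed MLBufferResourceView), and descriptors are plain objects with type and dimensions.

```javascript
// Assumed mapping from operand type to typed-array constructor.
const VIEW_FOR_TYPE = {
  float32: Float32Array, int32: Int32Array, uint32: Uint32Array,
  int8: Int8Array, uint8: Uint8Array,
};

// Byte length of a descriptor: element size times the product of dimensions.
// A descriptor without dimensions is treated as a scalar (one element).
function byteLengthOf(descriptor) {
  const elementSize = VIEW_FOR_TYPE[descriptor.type].BYTES_PER_ELEMENT;
  const dims = descriptor.dimensions ?? [];
  return dims.reduce((bytes, dim) => bytes * dim, elementSize);
}

function validateBufferWithDescriptor(bufferView, descriptor) {
  if (!ArrayBuffer.isView(bufferView)) return false;
  const expected = VIEW_FOR_TYPE[descriptor.type];
  if (expected === undefined || !(bufferView instanceof expected)) return false;
  return bufferView.byteLength === byteLengthOf(descriptor);
}

function validateGraphResources(resources, descriptors) {
  for (const [key, value] of Object.entries(resources)) {
    if (!Object.hasOwn(descriptors, key)) return false;
    if (!validateBufferWithDescriptor(value, descriptors[key])) return false;
  }
  return true;
}
```

With a descriptor of type 'float32' and dimensions [2, 2], a Float32Array of 4 elements validates, while a Float32Array of 3 elements or an Int32Array of 4 elements does not.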
To execute graph, given graph, inputs and outputs, run the following steps:
- Assert: the type of inputs is MLNamedArrayBufferViews.
- Let inputResources denote the input resources of graph.[[implementation]].
- For each <key, inputValue> of inputs:
  - Let inputDescriptor be graph.[[inputDescriptors]][key].
  - Let inputTensor be a new tensor for graph.[[implementation]] as follows:
    - Set the data type of inputTensor to the one that matches the element type of inputValue.
    - Set the dimensions of inputTensor to inputDescriptor.dimensions.
    - Set the values of elements in inputTensor to the values of elements in inputValue.
  - Request the underlying implementation of graph to bind inputResources[key] to inputTensor.
- Assert: the type of outputs is MLNamedArrayBufferViews.
- For each <key, outputValue> of outputs:
  - Issue a compute request to graph.[[implementation]] given key and inputResources and wait for completion.
    - If that returns an error, then throw an "OperationError" DOMException and stop.
    - Otherwise, store the result in outputTensor.
  - Let outputDesc be graph.[[outputDescriptors]][key].
  - If the byte length of outputTensor is not equal to the byte length of outputDesc, then throw a "DataError" DOMException and stop.
  - If the element type of outputTensor doesn't match the element type of outputValue, then throw a "DataError" DOMException and stop.
  - Request the underlying implementation of graph to set the values of elements in outputValue to the values of elements in outputTensor.
- Return undefined.
7.4.2.1. Examples
The following code showcases the synchronous computation with optional outputs in a worker.
const context = navigator.ml.createContextSync();

// Build a graph with two outputs.
const builder = new MLGraphBuilder(context);
const descA = {type: 'float32', dimensions: [3, 4]};
const a = builder.input('a', descA);
const descB = {type: 'float32', dimensions: [4, 3]};
const bufferB = new Float32Array(sizeOfShape(descB.dimensions)).fill(0.5);
const b = builder.constant(descB, bufferB);
const descC = {type: 'float32', dimensions: [3, 3]};
const bufferC = new Float32Array(sizeOfShape(descC.dimensions)).fill(1);
const c = builder.constant(descC, bufferC);
const d = builder.matmul(a, b);
const e = builder.add(d, c);
const graph = builder.buildSync({'d': d, 'e': e});

const bufferA = new Float32Array(sizeOfShape(descA.dimensions)).fill(0.5);
const inputs = {'a': bufferA};

// Compute d.
const bufferD = new Float32Array(sizeOfShape([3, 3]));
context.computeSync(graph, inputs, {'d': bufferD});
console.log(`values: ${bufferD}`);

// Compute e.
const bufferE = new Float32Array(sizeOfShape([3, 3]));
context.computeSync(graph, inputs, {'e': bufferE});
console.log(`values: ${bufferE}`);
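The example above calls a sizeOfShape helper that is not defined in this section. A minimal sketch, assuming it simply returns the number of elements in a shape, is shown below together with a plain-JavaScript reference computation of the two outputs. The matmul2d function is a hypothetical helper for illustration only and is not part of the API.

```javascript
// Assumed helper: number of elements in a tensor of the given shape.
function sizeOfShape(shape) {
  return shape.reduce((count, dim) => count * dim, 1);
}

// Hypothetical reference: row-major product of an (m x k) and a (k x n) matrix.
function matmul2d(a, b, m, k, n) {
  const out = new Float32Array(m * n);
  for (let i = 0; i < m; i++) {
    for (let j = 0; j < n; j++) {
      let sum = 0;
      for (let p = 0; p < k; p++) sum += a[i * k + p] * b[p * n + j];
      out[i * n + j] = sum;
    }
  }
  return out;
}

// Same data as the example: a is 3x4 and b is 4x3, both filled with 0.5.
const refA = new Float32Array(sizeOfShape([3, 4])).fill(0.5);
const refB = new Float32Array(sizeOfShape([4, 3])).fill(0.5);
const expectedD = matmul2d(refA, refB, 3, 4, 3);  // d = a x b
const expectedE = expectedD.map((v) => v + 1);    // e = d + c, with c filled with 1
```

Under these assumptions every element of d is 4 × 0.5 × 0.5 = 1 and every element of e is 2, which is what the two computeSync calls would be expected to log.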
7.4.3. The MLNamedArrayBufferViews transfer algorithm
To transfer an MLNamedArrayBufferViews views:
- Let transferredViews be a new MLNamedArrayBufferViews.
- For each key -> value of views:
  - Let transferredBuffer be the result of transferring the underlying buffer of value.
  - Let constructor be the appropriate view constructor for the type of ArrayBufferView value.
  - Let elementsNumber be the result of the byte length of value ÷ element size of value.
  - Let transferredView be Construct(constructor, transferredBuffer, value.[[ByteOffset]], elementsNumber).
  - Set transferredViews[key] to transferredView.
- Return transferredViews.
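Script code can approximate the transfer steps with structuredClone and its transfer option, which detaches the source ArrayBuffer. The sketch below is an illustration rather than the user agent's implementation; it assumes every value is a typed array (not a DataView) and that no two views share the same underlying buffer. The name transferViews is hypothetical.

```javascript
// Transfers each typed array in `views` to a new view over the moved buffer.
function transferViews(views) {
  const transferredViews = {};
  for (const [key, value] of Object.entries(views)) {
    const Constructor = value.constructor;
    // Read offset and length first: they read as 0 once the buffer is detached.
    const byteOffset = value.byteOffset;
    const elementsNumber = value.byteLength / Constructor.BYTES_PER_ELEMENT;
    const transferredBuffer =
        structuredClone(value.buffer, { transfer: [value.buffer] });
    transferredViews[key] =
        new Constructor(transferredBuffer, byteOffset, elementsNumber);
  }
  return transferredViews;
}
```

After the call, the original views are detached (their length reads as 0) and the data is reachable only through the returned views.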
7.4.4. Asynchronous Execution
Asynchronously carries out the computational workload of a compiled graph MLGraph on a separate timeline, either on a worker thread for the CPU execution, or on a GPU timeline for the submission of GPU workload on the command queue. The asynchronous nature of this call avoids blocking the calling thread while the computation for result is ongoing. This method of execution requires an MLContext created with MLContextOptions. Otherwise, it throws an "OperationError" DOMException.
The inputs and outputs MLNamedArrayBufferViews are transferred to new views that share the same backing memory allocations. The transferred views are returned to the caller via the promise fulfillment with the computation result written into the backing memory of the output views.
dictionary MLComputeResult {
  MLNamedArrayBufferViews inputs;
  MLNamedArrayBufferViews outputs;
};

partial interface MLContext {
  Promise<MLComputeResult> compute(MLGraph graph, MLNamedArrayBufferViews inputs, MLNamedArrayBufferViews outputs);
};
- graph: an MLGraph. The compiled graph to be executed.
- inputs: an MLNamedArrayBufferViews. The resources of inputs. Will be transferred if there are no validation errors.
- outputs: an MLNamedArrayBufferViews. The pre-allocated resources of required outputs. Will be transferred if there are no validation errors.
Returns: Promise<MLComputeResult>.
The compute(graph, inputs, outputs) method steps are:
- Let promise be a new promise.
- Return promise and run the following steps in parallel:
  - If graph.[[context]].[[contextType]] is not "default", reject promise with an "OperationError" DOMException and stop.
  - If invoking the validate graph resources algorithm given inputs and graph.[[inputDescriptors]] returns false, then reject promise with a "DataError" DOMException and stop.
  - If invoking the validate graph resources algorithm given outputs and graph.[[outputDescriptors]] returns false, then reject promise with a "DataError" DOMException and stop.
  - Let transferredInputs be the result of transferring MLNamedArrayBufferViews inputs.
  - Let transferredOutputs be the result of transferring MLNamedArrayBufferViews outputs.
  - Invoke execute graph given graph, transferredInputs and transferredOutputs.
    - If that throws an error, reject promise with the error and stop.
  - Otherwise, when execute graph has completed:
    - Let result be a new MLComputeResult.
    - Set result.inputs to transferredInputs.
    - Set result.outputs to transferredOutputs.
    - Resolve promise with result and stop.
7.4.4.1. Examples
The following code showcases the asynchronous computation.
const operandType = {type: 'float32', dimensions: [2, 2]};
const context = await navigator.ml.createContext();
const builder = new MLGraphBuilder(context);

// 1. Create a computational graph 'C = 0.2 * A + B'.
const constant = builder.constant(0.2);
const A = builder.input('A', operandType);
const B = builder.input('B', operandType);
const C = builder.add(builder.mul(A, constant), B);

// 2. Compile it into an executable.
const graph = await builder.build({'C': C});

// 3. Bind inputs to the graph and execute for the result.
const bufferA = new Float32Array(4).fill(1.0);
const bufferB = new Float32Array(4).fill(0.8);
const bufferC = new Float32Array(4);
const inputs = {'A': bufferA, 'B': bufferB};
const outputs = {'C': bufferC};
const result = await context.compute(graph, inputs, outputs);

// The computed result of [[1, 1], [1, 1]] is in the buffer associated with
// the output operand.
console.log('Output value: ' + result.outputs.C);
// Note: the result.outputs.C buffer is different from the bufferC, but it
// shares the same backing memory allocation.
7.4.5. WebGPU Interoperability
Create an MLCommandEncoder interface used to record the ML workload onto a WebGPU-compatible GPUCommandBuffer to allow mixing of ML workload with other GPU workload in an application that leverages WebGPU. This method only succeeds on an MLContext created with GPUDevice. Otherwise, it throws an "OperationError" DOMException.
partial interface MLContext {
  MLCommandEncoder createCommandEncoder();
};
Returns: MLCommandEncoder. The command encoder used to record ML workload on the GPU.
7.5. The MLCommandEncoder interface
The MLCommandEncoder interface represents a method of execution that synchronously records the computational workload of a compiled MLGraph to a GPUCommandBuffer on the calling thread. Since the workload is not immediately executed, just recorded, this method allows more flexibility for the caller to determine how and when the recorded commands will be submitted for execution on the GPU relative to other GPU workload on the same or different queue.
typedef (GPUBuffer or GPUTexture) MLGPUResource;
typedef record<DOMString, MLGPUResource> MLNamedGPUResources;

[SecureContext, Exposed=(Window, DedicatedWorker)]
interface MLCommandEncoder {};
MLCommandEncoder has the following internal slots:
- [[context]] of type MLContext: The context of type MLContext associated with this MLCommandEncoder.
- [[implementation]]: The underlying implementation provided by the User Agent.
7.5.1. Graph Initialization
Record the initialization of the MLGraph. This is a necessary step for optimal performance during graph execution as it gives the platform an opportunity to prepare and optimize constant input data for the subsequent execution of the graph. This method should only be called once per graph.
partial interface MLCommandEncoder {
  undefined initializeGraph(MLGraph graph);
};
- graph: an MLGraph. The compiled graph to be initialized with graph constant inputs.
Returns: undefined.
The initializeGraph(graph) steps are:
MLGraphBuilder/constant() method as constant operands during graph construction time.
7.5.2. Dispatch Execution Commands
Record the MLGraph execution with the inputs MLNamedGPUResources and outputs MLNamedGPUResources.
partial interface MLCommandEncoder {
  undefined dispatch(MLGraph graph, MLNamedGPUResources inputs, MLNamedGPUResources outputs);
};
- graph: an MLGraph. The compiled graph to be executed.
- inputs: an MLNamedGPUResources. The resources of inputs.
- outputs: an MLNamedGPUResources. The pre-allocated resources of required outputs.
Returns: undefined.
The dispatch(graph, inputs, outputs) steps are:
- If any of the following requirements are unmet, then throw a "DataError" DOMException and stop.
  - For each key -> value of inputs:
    - graph.[[inputDescriptors]][key] must exist.
    - Let inputDesc be graph.[[inputDescriptors]][key].
    - If value is a GPUBuffer, then:
      - value.size must equal the byte length of inputDesc.
  - For each key -> value of outputs:
    - graph.[[outputDescriptors]][key] must exist.
    - Let outputDesc be graph.[[outputDescriptors]][key].
    - If value is a GPUBuffer, then:
      - value.size must equal the byte length of outputDesc.
- For each key -> value of inputs:
  - Set the input of graph.[[implementation]] that is associated with key to value.
- For each key -> value of outputs:
  - Set the output of graph.[[implementation]] that is associated with key to value.
- Issue a compute request of graph.[[implementation]].
- If there is an error returned by graph.[[implementation]], then:
  - Throw an "OperationError" DOMException and stop.
- Return undefined.
7.5.3. Generate GPU Command Buffer
Complete the recording of ML workload and return a WebGPU-compatible GPUCommandBuffer containing the recorded workload.
partial interface MLCommandEncoder {
  GPUCommandBuffer finish(optional GPUCommandBufferDescriptor descriptor = {});
};
- descriptor: an optional GPUCommandBufferDescriptor. Descriptor of the command buffer.
Returns: GPUCommandBuffer.
The finish(descriptor) method steps are:
- If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
  - Make a request to the underlying platform to complete the recording of the ML workload, given descriptor.
    See the related WebGPU steps.
- Return a GPUCommandBuffer containing the recorded workload.
7.6. The MLGraphBuilder interface
The MLGraphBuilder interface defines a set of operations as identified by the § 2 Use cases that can be composed into a computational graph. It also represents the intermediate state of a graph building session.
typedef record<DOMString, MLOperand> MLNamedOperands;

dictionary MLBufferResourceView {
  required GPUBuffer resource;
  unsigned long long offset = 0;
  unsigned long long size;
};

typedef (ArrayBufferView or MLBufferResourceView) MLBufferView;

[SecureContext, Exposed=(Window, DedicatedWorker)]
interface MLGraphBuilder {
  // Construct the graph builder from the context.
  constructor(MLContext context);

  // Create an operand for a graph input.
  MLOperand input(DOMString name, MLOperandDescriptor descriptor);

  // Create an operand for a graph constant.
  MLOperand constant(MLOperandDescriptor descriptor, MLBufferView bufferView);

  // Create a single-value operand from the specified number of the specified type.
  MLOperand constant(double value, optional MLOperandType type = "float32");

  // Compile the graph up to the specified output operands asynchronously.
  Promise<MLGraph> build(MLNamedOperands outputs);

  // Compile the graph up to the specified output operands synchronously.
  [Exposed=(DedicatedWorker)]
  MLGraph buildSync(MLNamedOperands outputs);
};
The MLGraphBuilder.build() and MLGraphBuilder.buildSync() methods compile the graph builder state up to the specified output operands into a compiled graph according to the type of MLContext that creates it. Since this operation can be costly in some machine configurations, the calling thread of the MLGraphBuilder.buildSync() method must only be a worker thread to avoid potential disruption of the user experience. When the [[contextType]] of the MLContext is set to default, the compiled graph is initialized right before the MLGraph is returned. This graph initialization stage is important for optimal performance of the subsequent graph executions. See § 7.5.1 Graph Initialization for more detail.
MLBufferResourceView has the following members:
- resource, of type GPUBuffer: A GPUBuffer object. Specifies the GPU buffer source.
- offset, of type unsigned long long, defaulting to 0: Specifies an unsigned long long offset in the buffer source.
- size, of type unsigned long long: Specifies the unsigned long long size of the buffer view.
MLGraphBuilder has the following internal slots:
- [[context]] of type MLContext: The context of type MLContext associated with this MLGraphBuilder.
7.6.1. The MLGraphBuilder constructor
The new MLGraphBuilder constructor steps are:
- If this's relevant global object's associated Document is not allowed to use the webnn feature, throw a "SecurityError" DOMException and abort these steps.
- Let context be the first argument.
- If the validate MLContext steps given context return false, throw a "TypeError" and abort these steps.
- Set [[context]] to context.
7.6.2. The input() method
Create a named MLOperand based on a descriptor, that can be used as an input.
- name: a string name of the input.
- descriptor: an MLOperandDescriptor object.
Returns: an MLOperand object.
The input(name, descriptor) steps are:
- Let name be the first argument.
  - If name is undefined or an empty string, then throw a "TypeError" DOMException and stop.
- Let descriptor be the second argument.
  - If descriptor is not an object that implements MLOperandDescriptor, then throw a "TypeError" DOMException and stop.
  - Assert: If descriptor.dimensions does not exist, then descriptor defines a scalar input.
  - If descriptor.dimensions exists:
    - If the check dimensions steps given descriptor.dimensions and descriptor.type return false, throw a "DataError" DOMException and stop.
    - If the byte length of descriptor is not supported by the underlying platform, then throw a "DataError" DOMException and stop.
- If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
  - Let operand be the result of invoking the create MLOperand steps with this and descriptor.
    - If that throws, re-throw the exception and stop.
  - Set operand.[[name]] to name.
  - Make a request to the underlying platform to:
    - Create an implementation-defined platform input operand operandImpl given descriptor.
    - Store a reference of operandImpl in operand.[[operand]].
    - Register operand as an input.
- Return operand.
7.6.3. The build() method
Build a composed graph up to a given output operand into a computational graph, asynchronously or synchronously.
7.6.3.1. The build(outputs) method
The build(outputs) steps are:
- Let promise be a new promise.
- Return promise and run the following steps in parallel.
  - Return the result of invoking buildSync(outputs) given outputs.
    - If that throws, re-throw the error and stop.
7.6.3.2. The buildSync(outputs) method
The buildSync(outputs) steps are:
- If outputs is not an instance of MLNamedOperands, then throw a "TypeError" DOMException and stop.
- For each element in outputs:
  - If element.key is not a string, then throw a "TypeError" DOMException and stop.
  - If element.value is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
- If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
  - Let graph be a new MLGraph:
    - Set graph.[[context]] to this.[[context]].
    - Set graph.[[outputDescriptors]] to outputs.
  - Make a request to the underlying platform to:
    - Connect graph to a new implementation-defined graph implementation graphImpl given graph.
    - Store a reference to graphImpl in graph.[[implementation]].
  - Make a request to the underlying platform to initialize the graph:
    - For each operand in outputs:
      - If operand was created as an input by the underlying platform:
        - Add operand to graph.[[inputDescriptors]].
        - Initialize the weights of operand.
      - If operand was created as a constant by the underlying platform:
        - Preprocess and optimize the tensor data of operand.
      - Update graphImpl with operand.[[operand]].
      - Update graphImpl with operand.[[operator]].
- Return graph.
7.6.4. The constant() method
Create a constant MLOperand that can be used in MLGraphBuilder methods.
7.6.4.1. The constant(descriptor, bufferView) method
- descriptor: an MLOperandDescriptor object.
- bufferView: an MLBufferView.
Returns: an MLOperand object.
The constant(descriptor, bufferView) steps are:
- Let descriptor be the first argument.
  - If descriptor is not an object that implements MLOperandDescriptor, then throw a "TypeError" DOMException and stop.
  - If the byte length of descriptor is not supported by the underlying platform, then throw a "DataError" DOMException and stop.
  - If the check dimensions steps given descriptor.dimensions and descriptor.type return false, throw a "DataError" DOMException and stop.
- Let bufferView be the second argument.
  - If invoking validate buffer with descriptor given bufferView and descriptor returns false, then throw a "TypeError" DOMException and stop.
- If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
  - Let operand be the result of invoking the create MLOperand steps with this and descriptor.
    - If that throws, re-throw the exception and stop.
  - Let bytes be the result of invoking get a copy of the bytes held by the buffer source given bufferView.
  - Make a request to the underlying platform to:
    - Create an implementation-defined platform operand constantImpl to represent a constant, given descriptor.
    - Store a reference of constantImpl in operand.[[operand]].
    - Register operand as a tensor constant with bytes as value.
- Return operand.
7.6.4.2. The constant(value, type) method
- value: a number.
- type: an optional MLOperandType, by default "float32".
Returns: an MLOperand object.
The constant(value, type) steps are:
- If value is not a number, then throw a "TypeError" DOMException and stop.
- If type is undefined, let type be "float32".
- Otherwise, if type is not one of MLOperandType, then throw a "TypeError" DOMException and stop.
- Let descriptor be a new MLOperandDescriptor.
  - Set descriptor.type to type.
  - Set descriptor.dimensions to undefined. In the case of a scalar constant, descriptor.dimensions is ignored.
- If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
  - Let operand be the result of invoking the create MLOperand steps with this and descriptor.
    - If that throws, re-throw the exception and stop.
  - Make a request to the underlying platform to:
    - Create an implementation-defined platform operand constantImpl to represent a constant, given descriptor.
    - Store a reference of constantImpl in operand.[[operand]].
    - Register operand as a scalar constant with value as value.
- Return operand.
7.6.5. The batchNormalization() method
Normalize the tensor values of input features across the batch dimension using [Batch-Normalization]. For each input feature, the mean and variance values of that feature supplied in this calculation as parameters are previously computed across the batch dimension of the input during the model training phase of this operation.
dictionary MLBatchNormalizationOptions {
  MLOperand scale;
  MLOperand bias;
  unsigned long axis = 1;
  float epsilon = 1e-5;
  MLActivation activation;
};

partial interface MLGraphBuilder {
  MLOperand batchNormalization(MLOperand input, MLOperand mean, MLOperand variance, optional MLBatchNormalizationOptions options = {});
};
MLBatchNormalizationOptions has the following members:
- scale, of type MLOperand: An MLOperand. Specifies the 1-D tensor of the scaling values whose length is equal to the size of the input dimension denoted by axis.
- bias, of type MLOperand: An MLOperand. Specifies the 1-D tensor of the bias values whose length is equal to the size of the input dimension denoted by axis.
- axis, of type unsigned long, defaulting to 1: An unsigned long scalar. Specifies the index to the feature count dimension of the input shape for which the mean and variance values are. Its value must be in the range [0, N-1] where N is the rank of input tensor. The default value is 1, corresponding to the channel ("c") dimension in the "nchw" data layout.
- epsilon, of type float, defaulting to 1e-5: A float scalar. Specifies a small value to prevent computational error due to divide-by-zero.
- activation, of type MLActivation: An MLActivation object. Specifies the optional activation function that immediately follows the normalization operation.
The arguments of batchNormalization() are:
- input: an MLOperand. The input N-D tensor.
- mean: an MLOperand. Specifies the 1-D tensor of the mean values of the input features across the batch whose length is equal to the size of the input dimension denoted by axis.
- variance: an MLOperand. The 1-D tensor of the variance values of the input features across the batch whose length is equal to the size of the input dimension denoted by axis.
- options: an optional MLBatchNormalizationOptions. Specifies the optional parameters of the operation.
Returns: an MLOperand. The batch-normalized N-D tensor of the same shape as the input tensor.
The batchNormalization() method steps are:
- Let input be the first argument. To validate input, run these substeps:
  - If input is not an object that implements MLOperand, then throw a "TypeError" DOMException and abort these steps.
- Let mean be the second argument, representing a vector with the moving mean values for input. To validate mean, run the following substeps:
  - If mean is not an object that implements MLOperand, then throw a "TypeError" DOMException and abort these steps.
  - If mean.[[descriptor]].dimensions is not equal with input.[[descriptor]].dimensions from which the dimension represented by options.axis is removed, then throw a "TypeError" DOMException and abort these steps.
- Let variance be the third argument, representing the moving variance values of input.
- Let options be the fourth argument. To validate options, run these substeps:
  - If options.axis does not exist, let options.axis be 1.
  - If options.axis is not a number between 0 and the rank of input, then throw a "TypeError" DOMException and abort these steps.
  - If input is a 4-D tensor of the "nchw" layout, set options.axis to 1.
  - If input is a 4-D tensor of the "nhwc" layout, set options.axis to 3.
- If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
  - Let output be the result of invoking the create MLOperand steps with this and descriptor, that may use the same underlying data as input.
  - Make a request to the underlying platform to initialize the batch normalization:
    - Create an implementation-defined platform operator batchNormImpl for this method, given input, mean, variance and options.
    - If options.activation exists, register it as activation to batchNormImpl.
    - Connect output as output to batchNormImpl.
- Return output.
The behavior of this operation when the input tensor is 4-D of the "nchw" layout and the activation is of operator type relu can be generically emulated using other operations as follows. However, user agents typically have a more efficient implementation for it, so its use is encouraged from a performance standpoint.
const shape = [1, null, 1, 1];
return builder.relu(
  builder.add(
    builder.mul(
      builder.reshape(options.scale, shape),
      builder.div(
        builder.sub(input, builder.reshape(mean, shape)),
        builder.pow(
          builder.add(builder.reshape(variance, shape), builder.constant(options.epsilon)),
          builder.constant(0.5)))),
    builder.reshape(options.bias, shape)));
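As a rough numeric illustration of the emulation above, the following sketch applies the same arithmetic (scale, normalize, add bias, then relu) to an "nchw" tensor held in plain nested JavaScript arrays. The function name batchNormReluNchw and the nested-array representation are assumptions made for this example; they are not part of the API.

```javascript
// Hypothetical helper, not part of the WebNN API: batch normalization
// followed by relu on an "nchw" tensor stored as nested arrays
// [batches][channels][height][width].
function batchNormReluNchw(input, mean, variance, options = {}) {
  const epsilon = options.epsilon === undefined ? 0.00001 : options.epsilon;
  return input.map((batch) =>
    batch.map((channel, c) => {
      // scale and bias are optional; default to the identity transform.
      const scale = options.scale ? options.scale[c] : 1;
      const bias = options.bias ? options.bias[c] : 0;
      const denominator = Math.sqrt(variance[c] + epsilon);
      return channel.map((row) =>
        row.map((v) => Math.max(0, (scale * (v - mean[c])) / denominator + bias))
      );
    })
  );
}

// One batch, two channels, each 1x2.
const normalized = batchNormReluNchw(
  [[[[1, 3]], [[2, 4]]]],
  [2, 3],
  [1, 1],
  { scale: [2, 2], bias: [1, 0], epsilon: 0 }
);
```

Each channel is normalized with its own mean and variance entry, which is what reshaping the 1-D tensors to [1, null, 1, 1] achieves through broadcasting in the emulation.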
7.6.6. The clamp() method
Clamp the input tensor element-wise within a range specified by the minimum and maximum values.
dictionary MLClampOptions {
  float minValue;
  float maxValue;
};

partial interface MLGraphBuilder {
  MLOperand clamp(MLOperand operand, optional MLClampOptions options = {});
  MLActivation clamp(optional MLClampOptions options = {});
};
The behavior of this operation can be generically emulated using other operations as follows. However, user agents typically have a more efficient implementation for it, so its use is encouraged from a performance standpoint.
if (options.minValue === undefined) {
  if (options.maxValue === undefined) {
    return operand;
  } else {
    return builder.min(operand, builder.constant(options.maxValue));
  }
} else {
  if (options.maxValue === undefined) {
    return builder.max(operand, builder.constant(options.minValue));
  } else {
    return builder.min(
      builder.max(operand, builder.constant(options.minValue)),
      builder.constant(options.maxValue));
  }
}
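To make the four branches of the emulation concrete, the sketch below applies the same logic to a flat array of numbers instead of an MLOperand. The helper name clampValues is hypothetical and is used only for illustration.

```javascript
// Hypothetical helper, not part of the WebNN API: element-wise clamp on a
// plain array, skipping whichever bound is left unspecified.
function clampValues(values, options = {}) {
  const { minValue, maxValue } = options;
  return values.map((v) => {
    let result = v;
    if (minValue !== undefined) result = Math.max(result, minValue);
    if (maxValue !== undefined) result = Math.min(result, maxValue);
    return result;
  });
}

const bothBounds = clampValues([-2, 0.5, 3], { minValue: 0, maxValue: 1 });
const upperOnly = clampValues([-2, 0.5, 3], { maxValue: 1 });
const noBounds = clampValues([-2, 0.5, 3]);
```

When neither bound is given the input passes through unchanged, matching the first branch of the emulation.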
To check clamp options given options, run the following steps:
If options is not an object that implements MLClampOptions, then return false.
If options.minValue and options.maxValue are not a numeric type, then return false.
If options.minValue is greater than options.maxValue, then return false.
Return true.
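The check above can be sketched in plain JavaScript as follows. The function name checkClampOptions is hypothetical, and the sketch assumes that an absent minValue or maxValue is acceptable because both members are optional; the steps themselves do not spell out that case.

```javascript
// Hypothetical sketch of the "check clamp options" steps for a plain object.
// Assumption: a missing minValue or maxValue is allowed.
function checkClampOptions(options) {
  if (options === null || typeof options !== "object") return false;
  for (const key of ["minValue", "maxValue"]) {
    if (options[key] !== undefined && typeof options[key] !== "number") {
      return false;
    }
  }
  if (
    options.minValue !== undefined &&
    options.maxValue !== undefined &&
    options.minValue > options.maxValue
  ) {
    return false;
  }
  return true;
}
```

A caller following the method steps would throw a "TypeError" whenever this check returns false.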
7.6.6.1. The clamp(operand, options) method
-
operand : an MLOperand. The input tensor.
-
options : an optional MLClampOptions. The optional parameters of the operation.
-
minValue : a float scalar. Specifies the minimum value of the range. When it is not specified, the clamping is not performed on the lower limit of the range.
-
maxValue : a float scalar. Specifies the maximum value of the range. When it is not specified, the clamping is not performed on the upper limit of the range.
Returns: an MLOperand. The output tensor of the same shape as operand.
The clamp(operand, options) method steps are:
Let operand be the first argument.
Let options be the second argument.
If running the check clamp options steps with options returns false, then throw a "TypeError" DOMException and abort these steps.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
  Let output be the result of invoking the copy MLOperand steps given operand.
  Make a request to the underlying platform to:
    Create an implementation-defined platform operator clampImpl for this method, given options.minValue and options.maxValue.
    Store a reference of clampImpl in output.[[operator]].
    Create an implementation-defined platform operand outputImpl to represent clamp output, given output and clampImpl.
    Store a reference to outputImpl in output.[[operand]].
  Connect operand.[[operand]] as input to clampImpl.
  Connect output.[[operand]] as output to clampImpl.
Return output.
7.6.6.2. The clamp(options) method
-
options : an optional MLClampOptions. The optional parameters of the operation.
-
minValue : a float scalar. Specifies the minimum value of the range. When it is not specified, the clamping is not performed on the lower limit of the range.
-
maxValue : a float scalar. Specifies the maximum value of the range. When it is not specified, the clamping is not performed on the upper limit of the range.
Returns: an MLActivation. The operator representing the clamp operation.
The clamp(options) method steps are:
Let options be the first argument.
If running the check clamp options steps with options returns false, then throw a "TypeError" DOMException and abort these steps.
Let op be the result of invoking the create MLActivation steps with "clamp" and options.
  If that throws an error, re-throw the error and abort these steps.
Return op.
7.6.7. The concat() method
Concatenates the input tensors along a given axis.
partial interface MLGraphBuilder {
  MLOperand concat(sequence<MLOperand> inputs, unsigned long axis);
};
-
inputs : a sequence of MLOperand. All input tensors must have the same shape, except for the size of the dimension to concatenate on.
-
axis : an unsigned long scalar. The axis that the inputs concatenate along. Its value must be in the range [0, N-1] where N is the rank of input tensors.
Returns: an MLOperand. The concatenated tensor of all the inputs along the axis. The output tensor has the same shape except on the dimension that all the inputs are concatenated along. The size of that dimension is computed as the sum of all the input sizes of the same dimension.
The concat(inputs, axis) steps are:
Let inputs be the first argument.
Assert: the type of inputs is sequence of MLOperand objects.
Assert: the type of axis is unsigned long.
Assert: the shape (i.e. dimensions) of each operand in inputs is the same, except on the dimension given by axis on which they are concatenated.
If any of the following steps fail, then throw a "DataError" DOMException and stop.
  If inputs is not an array of objects, fail.
  If axis is not a non-negative integer number, fail.
  If axis is greater than or equal to the rank of inputs, fail.
  Let desc be inputs[0].[[descriptor]].
  Let desc.dimensions[axis] be 0.
  For each index between 0 and the rank of inputs:
    If running validate MLOperand given inputs[index] and this returns false, then fail.
    For each dim between 0 and the rank of inputs[index]:
      If the shape of each corresponding dimension and type of the operands, except for those of the dimension given by axis, is not the same, fail.
      If dim is not equal to axis and if inputs[index].dimensions[dim] is not equal to inputs[0].dimensions[dim], fail.
      If dim is equal to axis, add to desc.dimensions[axis] the value of inputs[index].dimensions[dim].
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
  Let output be the result of invoking the create MLOperand steps given this and desc.
  Make a request to the underlying platform to:
    Create an implementation-defined platform operator concatImpl for this method, given inputs and axis.
    Store a reference of concatImpl in output.[[operator]].
    Create an implementation-defined platform operand outputImpl to represent output, given output and concatImpl.
    Store a reference to outputImpl in output.[[operand]].
  Connect inputs as input to concatImpl.
  Connect output.[[operand]] as output to concatImpl.
Return output.
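The shape validation and accumulation in the steps above can be sketched on plain arrays of dimensions. The helper concatOutputShape is hypothetical: it takes a list of shapes rather than MLOperand objects and throws a generic Error where the steps throw a "DataError" DOMException.

```javascript
// Hypothetical helper, not part of the WebNN API: computes the output shape
// of a concatenation from the input shapes and the axis.
function concatOutputShape(shapes, axis) {
  if (!Array.isArray(shapes) || shapes.length === 0) {
    throw new Error("inputs must be a non-empty array of shapes");
  }
  const first = shapes[0];
  if (!Number.isInteger(axis) || axis < 0 || axis >= first.length) {
    throw new Error("axis is out of range");
  }
  const output = first.slice();
  output[axis] = 0;
  for (const shape of shapes) {
    if (shape.length !== first.length) {
      throw new Error("all inputs must have the same rank");
    }
    for (let dim = 0; dim < shape.length; dim++) {
      if (dim === axis) {
        // Sizes along the concatenation axis are summed.
        output[axis] += shape[dim];
      } else if (shape[dim] !== first[dim]) {
        throw new Error("shapes differ on a dimension other than axis");
      }
    }
  }
  return output;
}

const concatShape = concatOutputShape([[2, 3], [2, 5]], 1);
```

Here two inputs of shapes [2, 3] and [2, 5] concatenated along axis 1 give an output whose size on that axis is 3 + 5.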
7.6.8. The conv2d() method
Compute a 2-D convolution given 4-D input and filter tensors.
enum MLConv2dFilterOperandLayout {
  "oihw",
  "hwio",
  "ohwi",
  "ihwo"
};

enum MLAutoPad {
  "explicit",
  "same-upper",
  "same-lower"
};

dictionary MLConv2dOptions {
  sequence<unsigned long> padding;
  sequence<unsigned long> strides;
  sequence<unsigned long> dilations;
  MLAutoPad autoPad = "explicit";
  unsigned long groups = 1;
  MLInputOperandLayout inputLayout = "nchw";
  MLConv2dFilterOperandLayout filterLayout = "oihw";
  MLOperand bias;
  MLActivation activation;
};

partial interface MLGraphBuilder {
  MLOperand conv2d(MLOperand input, MLOperand filter, optional MLConv2dOptions options = {});
};
MLConv2dOptions has the following members:
-
padding, of type sequence<unsigned long>
A sequence of unsigned long of length 4: [beginning_height, ending_height, beginning_width, ending_width]. Specifies the additional rows and columns added to the beginning and ending of each spatial dimension of the convolution input. The default value is [0, 0, 0, 0].
-
strides, of type sequence<unsigned long>
A sequence of unsigned long of length 2: [stride_height, stride_width]. Specifies the stride of the sliding window for each spatial dimension of the convolution input. The default value is [1, 1].
-
dilations, of type sequence<unsigned long>
A sequence of unsigned long of length 2: [dilation_height, dilation_width]. Specifies the dilation factor for each spatial dimension applied on the convolution filter (kernel). The default value is [1, 1].
-
autoPad, of type MLAutoPad, defaulting to "explicit"
An MLAutoPad string. Specifies the automatic input padding options. The default value is "explicit", which means that the values in the padding array should be used for input padding. When the option is set to a value other than "explicit", the values in the padding array are ignored.
With the "same-upper" option, the padding values are automatically computed such that the additional ending padding of the spatial input dimensions would allow all of the input values in the corresponding dimension to be filtered.
The "same-lower" option is similar but padding is applied to the beginning padding of the spatial input dimensions instead of the ending one.
-
groups, of type unsigned long, defaulting to 1
An unsigned long scalar. Specifies the number of groups that input channels and output channels are divided into. The default value is 1.
-
inputLayout, of type MLInputOperandLayout, defaulting to "nchw"
An MLInputOperandLayout string. The default value is "nchw". Specifies the layout format of the input and output tensor as follows:
-
"nchw":
input tensor: [batches, input_channels, height, width]
output tensor: [batches, output_channels, height, width]
-
"nhwc":
input tensor: [batches, height, width, input_channels]
output tensor: [batches, height, width, output_channels]
-
filterLayout, of type MLConv2dFilterOperandLayout, defaulting to "oihw"
An MLConv2dFilterOperandLayout string. The default value is "oihw". Specifies the layout format of the filter tensor as follows:
-
"oihw": [output_channels, input_channels/groups, height, width]
-
"hwio": [height, width, input_channels/groups, output_channels]
-
"ohwi": [output_channels, height, width, input_channels/groups]
-
"ihwo": [input_channels/groups, height, width, output_channels]
-
bias, of type MLOperand
An MLOperand object. Specifies the additional 1-D tensor with the shape of [output_channels] whose values are to be added to the convolution result.
-
activation, of type MLActivation
An MLActivation object. Specifies the optional activation function that immediately follows the convolution operation.
-
input : an MLOperand. The input 4-D tensor. The logical shape is interpreted according to the value of options.inputLayout.
-
filter : an MLOperand. The filter 4-D tensor. The logical shape is interpreted according to the value of options.filterLayout and options.groups.
-
options : an MLConv2dOptions. The optional parameters of the operation.
Returns: an MLOperand. The output 4-D tensor that contains the convolution result. The output shape is interpreted according to the options.inputLayout value. More specifically, the spatial dimensions or the sizes of the last two dimensions of the output tensor for the nchw input layout can be calculated as follows:
output_size = 1 + (input_size - (filter_size - 1) * dilation - 1 + beginning_padding + ending_padding) / stride
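The formula above can be evaluated for one spatial dimension with a small helper. This is a sketch only: the name conv2dOutputSize and its option names are hypothetical, and rounding the division down is an assumption, since the formula as written does not state how a non-integer quotient is handled.

```javascript
// Hypothetical helper, not part of the WebNN API: output size of one spatial
// dimension of conv2d. Flooring the division is an assumption.
function conv2dOutputSize(inputSize, filterSize, options = {}) {
  const {
    beginningPadding = 0,
    endingPadding = 0,
    stride = 1,
    dilation = 1,
  } = options;
  // (filter_size - 1) * dilation + 1 is the extent covered by a dilated filter.
  const effectiveFilterSize = (filterSize - 1) * dilation + 1;
  return 1 + Math.floor(
    (inputSize - effectiveFilterSize + beginningPadding + endingPadding) / stride
  );
}

// A 7x7 filter with stride 2 and padding 3 on a 224-wide input.
const convSize = conv2dOutputSize(224, 7, {
  beginningPadding: 3,
  endingPadding: 3,
  stride: 2,
});
```

The same helper is applied once for height and once for width, using the matching entries of padding, strides and dilations.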
The conv2d(input, filter, options) steps are:
If input or filter is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
Let input_size be the size of input.[[descriptor]].dimensions.
Let filter_size be the size of filter.[[descriptor]].dimensions.
If input_size is not 4, then throw a "DataError" DOMException and stop.
If filter_size is not 4, then throw a "DataError" DOMException and stop.
If options is undefined, let options be an empty object.
If options.padding is undefined, set it to [0, 0, 0, 0].
If options.strides is undefined, set it to [1, 1].
Else if options.strides.size() is not 2, then throw a "TypeError" DOMException and stop.
If any element in options.strides is equal to 0, then throw a "TypeError" DOMException and stop.
If options.dilations is undefined, set it to [1, 1].
If options.autoPad is undefined, set it to "explicit".
If options.groups is undefined, set it to 1.
If options.inputLayout is undefined, set it to "nchw".
If options.filterLayout is undefined, set it to "oihw".
If options.bias exists and it is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If options.activation exists and it is not an instance of MLActivation, then throw a "TypeError" DOMException and stop.
Let output_shape be the result of calculating output dimensions based on input, filter, dilation, padding and stride, taking into account options.inputLayout.
Let desc be a new MLOperandDescriptor.
Set desc.type to input.[[descriptor]].type.
Set desc.dimensions to output_shape.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
  Let output be the result of invoking the create MLOperand steps given this and desc.
  Make a request to the underlying platform to:
    Create an implementation-defined platform operator conv2dImpl for this method, given options and filter.
    If options.activation exists, register it as activation to conv2dImpl.
    Store a reference of conv2dImpl in output.[[operator]].
    Create an implementation-defined platform operand outputImpl to represent the output, given output and conv2dImpl.
    Store a reference to outputImpl in output.[[operand]].
  Connect input.[[operand]] as input to conv2dImpl.
  Connect output.[[operand]] as output to conv2dImpl.
Return output.
7.6.9. The convTranspose2d() method
Compute a 2-D transposed convolution given 4-D input and filter tensors.
enum MLConvTranspose2dFilterOperandLayout {
  "iohw",
  "hwoi",
  "ohwi"
};

dictionary MLConvTranspose2dOptions {
  sequence<unsigned long> padding;
  sequence<unsigned long> strides;
  sequence<unsigned long> dilations;
  sequence<unsigned long> outputPadding;
  sequence<unsigned long> outputSizes;
  MLAutoPad autoPad = "explicit";
  unsigned long groups = 1;
  MLInputOperandLayout inputLayout = "nchw";
  MLConvTranspose2dFilterOperandLayout filterLayout = "iohw";
  MLOperand bias;
  MLActivation activation;
};

partial interface MLGraphBuilder {
  MLOperand convTranspose2d(MLOperand input, MLOperand filter, optional MLConvTranspose2dOptions options = {});
};
MLConvTranspose2dOptions has the following members:
-
padding, of type sequence<unsigned long>
A sequence of unsigned long of length 4: [beginning_height, ending_height, beginning_width, ending_width]. Specifies the additional rows and columns added to the beginning and ending of each spatial dimension of the convolution input. The default value is [0, 0, 0, 0].
-
strides, of type sequence<unsigned long>
A sequence of unsigned long of length 2: [stride_height, stride_width]. Specifies the stride of the sliding window for each spatial dimension of the convolution input. The default value is [1, 1].
-
dilations, of type sequence<unsigned long>
A sequence of unsigned long of length 2: [dilation_height, dilation_width]. Specifies the dilation factor for each spatial dimension applied on the convolution filter (kernel). The default value is [1, 1].
-
outputPadding, of type sequence<unsigned long>
A sequence of unsigned long of length 2. Specifies the padding values applied to each spatial dimension of the output tensor. The explicit padding values are needed to disambiguate the output tensor shape for transposed convolution when the value of options.strides is greater than 1.
Note that these values are only used to disambiguate the output shape when needed; they do not necessarily cause any padding value to be written to the output tensor.
The default value is [0, 0].
-
outputSizes, of type sequence<unsigned long>
A sequence of unsigned long of length 2. Specifies the sizes of the last two dimensions of the output tensor. When the output sizes are explicitly specified, the output padding values in outputPadding are ignored.
If not specified, the output sizes are automatically computed.
-
autoPad, of type MLAutoPad, defaulting to "explicit"
An MLAutoPad string. Specifies the automatic input padding options. The default value is "explicit", which means that the values in the padding array should be used for input padding.
When the option is set to a value other than "explicit", the values in the padding array are ignored.
With the "same-upper" option, the padding values are automatically computed such that the additional ending padding of the spatial input dimensions would allow all of the input values in the corresponding dimension to be filtered.
The "same-lower" option is similar but padding is applied to the beginning padding of the spatial input dimensions instead of the ending one.
-
groups, of type unsigned long, defaulting to 1
An unsigned long scalar. Specifies the number of groups that input channels and output channels are divided into. The default value is 1.
-
inputLayout, of type MLInputOperandLayout, defaulting to "nchw"
An MLInputOperandLayout string. The default value is "nchw". Specifies the layout format of the input and output tensor as follows:
-
"nchw":
input tensor: [batches, input_channels, height, width]
output tensor: [batches, output_channels, height, width]
-
"nhwc":
input tensor: [batches, height, width, input_channels]
output tensor: [batches, height, width, output_channels]
-
filterLayout, of type MLConvTranspose2dFilterOperandLayout, defaulting to "iohw"
An MLConvTranspose2dFilterOperandLayout string. The default value is "iohw". Specifies the layout format of the filter tensor as follows:
-
"iohw": [input_channels, output_channels/groups, height, width]
-
"hwoi": [height, width, output_channels/groups, input_channels]
-
"ohwi": [output_channels/groups, height, width, input_channels]
-
bias, of type MLOperand
An MLOperand object. Specifies the additional 1-D tensor with the shape of [output_channels] whose values are to be added to the transposed convolution result.
-
activation, of type MLActivation
An MLActivation object. Specifies the optional activation function that immediately follows the transposed convolution operation.
-
input : an MLOperand. The input 4-D tensor. The logical shape is interpreted according to the value of options.inputLayout.
-
filter : an MLOperand. The filter 4-D tensor. The logical shape is interpreted according to the value of options.filterLayout and options.groups.
-
options : an optional MLConvTranspose2dOptions.
Returns: an MLOperand. The output 4-D tensor that contains the transposed convolution result. The output shape is interpreted according to the options.inputLayout value. More specifically, unless the options.outputSizes values are explicitly specified, the options.outputPadding may be needed to compute the spatial dimension values of the output tensor as follows:
output_size = (input_size - 1) * stride + (filter_size - 1) * dilation + 1 - beginning_padding - ending_padding + output_padding
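A direct transcription of this formula for a single spatial dimension is sketched below. The helper name convTranspose2dOutputSize and its option names are hypothetical and exist only for this illustration.

```javascript
// Hypothetical helper, not part of the WebNN API: output size of one spatial
// dimension of convTranspose2d when outputSizes is not given.
function convTranspose2dOutputSize(inputSize, filterSize, options = {}) {
  const {
    beginningPadding = 0,
    endingPadding = 0,
    stride = 1,
    dilation = 1,
    outputPadding = 0,
  } = options;
  return (
    (inputSize - 1) * stride +
    (filterSize - 1) * dilation +
    1 -
    beginningPadding -
    endingPadding +
    outputPadding
  );
}

// With stride 2, outputPadding selects between two otherwise valid sizes.
const transposedSize = convTranspose2dOutputSize(3, 3, { stride: 2 });
const transposedSizePadded = convTranspose2dOutputSize(3, 3, {
  stride: 2,
  outputPadding: 1,
});
```

The two results differ by exactly the output padding, which illustrates why the value is needed to disambiguate the output shape when the stride is greater than 1.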
The convTranspose2d(input, filter, options) steps are:
If input or filter is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
Let input_size be the size of input.[[descriptor]].dimensions.
Let filter_size be the size of filter.[[descriptor]].dimensions.
If input_size is not 4, then throw a "DataError" DOMException and stop.
If filter_size is not 4, then throw a "DataError" DOMException and stop.
If options is undefined, let options be an empty object.
If options.padding is undefined, set it to [0, 0, 0, 0].
If options.strides is undefined, set it to [1, 1].
Else if options.strides.size() is not 2, then throw a "TypeError" DOMException and stop.
If any element in options.strides is equal to 0, then throw a "TypeError" DOMException and stop.
If options.dilations is undefined, set it to [1, 1].
If options.outputPadding is undefined, set it to [0, 0].
If options.autoPad is undefined, set it to "explicit".
If options.groups is undefined, set it to 1.
If options.inputLayout is undefined, set it to "nchw".
If options.filterLayout is undefined, set it to "iohw".
If options.bias exists and it is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If options.activation exists and it is not an instance of MLActivation, then throw a "TypeError" DOMException and stop.
Let output_shape be the result of calculating output dimensions based on input, filter, options.dilations, options.padding and options.strides, taking into account options.inputLayout.
Let desc be a new MLOperandDescriptor.
Set desc.type to input.[[descriptor]].type.
Set desc.dimensions to output_shape.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
  Let output be the result of invoking the create MLOperand steps given this and desc.
  Make a request to the underlying platform to:
    Create an implementation-defined platform operator convTranspose2dImpl for this method, given options and filter.
    If options.activation exists, register it as activation to convTranspose2dImpl.
    Store a reference of convTranspose2dImpl in output.[[operator]].
    Create an implementation-defined platform operand outputImpl to represent the output, given output and convTranspose2dImpl.
    Store a reference to outputImpl in output.[[operand]].
  Connect input.[[operand]] as input to convTranspose2dImpl.
  Connect output.[[operand]] as output to convTranspose2dImpl.
Return output.
7.6.10. Element-wise binary operations
Compute the element-wise binary addition, subtraction, multiplication, division, maximum and minimum of the two input tensors.
The element-wise binary operations will be broadcasted according to [numpy-broadcasting-rule] . The rank of the output tensor is the maximum rank of the input tensors. For each dimension of the output tensor, its size is the maximum size along that dimension of the input tensors.
partial interface MLGraphBuilder {
  MLOperand add(MLOperand a, MLOperand b);
  MLOperand sub(MLOperand a, MLOperand b);
  MLOperand mul(MLOperand a, MLOperand b);
  MLOperand div(MLOperand a, MLOperand b);
  MLOperand max(MLOperand a, MLOperand b);
  MLOperand min(MLOperand a, MLOperand b);
  MLOperand pow(MLOperand a, MLOperand b);
};
Returns: an MLOperand. The output tensor that contains the result of element-wise binary operation of the two input tensors.
-
add : Add the values of the two input tensors, element-wise.
-
sub : Subtract the values of the second input tensor from the values of the first input tensor, element-wise.
-
mul : Multiply the values of the two input tensors, element-wise.
-
div : Divide the values of the first input tensor by the values of the second tensor, element-wise.
-
max : Select the greater values of the two input tensors, element-wise.
-
min : Select the lesser values of the two input tensors, element-wise.
-
pow : Compute the values of the first input tensor to the power of the values of the second input tensor, element-wise.
To create element-wise binary operation given op, a and b, run the following steps:
Assert: op is one of "add", "sub", "mul", "div", "max", "min", "pow".
If a or b is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If a.[[descriptor]].type is not equal to b.[[descriptor]].type, then throw a "DataError" DOMException and stop.
Let descriptor be a new MLOperandDescriptor.
Set descriptor.type to a.[[descriptor]].type.
Let descriptor.dimensions be the result of running the broadcast-shapes steps given a.[[descriptor]].dimensions and b.[[descriptor]].dimensions.
  If that throws an error, re-throw the error and stop.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
  Let output be the result of invoking the create MLOperand steps given this and descriptor.
  Make a request to the underlying platform to:
    Let opImpl be an implementation-defined platform operator for the binary operation op, given a and b.
    Store a reference of opImpl in output.[[operator]].
    Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
    Store a reference to outputImpl in output.[[operand]].
  Connect a.[[operand]] and b.[[operand]] as inputs to opImpl.
  Connect output.[[operand]] as output to opImpl.
Return output.
To broadcast shapes given shape1 and shape2, run the following steps:
Assert: The type of shape1 and shape2 is sequence of unsigned long.
Let output be the result of invoking the implementation-defined shape broadcast on shape1 and shape2.
  If that fails, throw a "DataError" DOMException and stop.
Return output.
The most common implementation treats two shapes as compatible when, for each pair of corresponding dimensions, the sizes are equal or one of them is 1. The output shape consists of the maximum of the corresponding dimensions.
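The common compatibility rule described above can be sketched as follows. The helper broadcastShapes is hypothetical, and it assumes that dimensions are aligned from the trailing end with missing leading dimensions treated as 1, as in the NumPy rule the section cites; the shape broadcast itself remains implementation-defined.

```javascript
// Hypothetical helper, not part of the WebNN API: NumPy-style shape
// broadcasting on plain arrays of dimension sizes.
function broadcastShapes(shape1, shape2) {
  const rank = Math.max(shape1.length, shape2.length);
  const output = new Array(rank);
  for (let i = 0; i < rank; i++) {
    // Walk both shapes from the last dimension; pad the shorter one with 1.
    const d1 = i < shape1.length ? shape1[shape1.length - 1 - i] : 1;
    const d2 = i < shape2.length ? shape2[shape2.length - 1 - i] : 1;
    if (d1 !== d2 && d1 !== 1 && d2 !== 1) {
      throw new Error("shapes are not broadcast-compatible");
    }
    output[rank - 1 - i] = Math.max(d1, d2);
  }
  return output;
}

const broadcasted = broadcastShapes([2, 1, 3], [4, 3]);
```

The output rank is the larger of the two input ranks, and each output size is the larger of the two aligned input sizes.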
The element-wise binary operation algorithms invoke the create element-wise binary operation steps as follows.
The add(a, b) steps are:
Let output be the result of running the create element-wise binary operation given "add", a and b.
  If that throws an error, then re-throw the error and stop.
Return output.
The sub(a, b) steps are:
Let output be the result of running the create element-wise binary operation given "sub", a and b.
  If that throws an error, then re-throw the error and stop.
Return output.
The mul(a, b) steps are:
Let output be the result of running the create element-wise binary operation given "mul", a and b.
  If that throws an error, then re-throw the error and stop.
Return output.
The div(a, b) steps are:
Let output be the result of running the create element-wise binary operation given "div", a and b.
  If that throws an error, then re-throw the error and stop.
Return output.
The max(a, b) steps are:
Let output be the result of running the create element-wise binary operation given "max", a and b.
  If that throws an error, then re-throw the error and stop.
Return output.
The min(a, b) steps are:
Let output be the result of running the create element-wise binary operation given "min", a and b.
  If that throws an error, then re-throw the error and stop.
Return output.
The pow(a, b) steps are:
Let output be the result of running the create element-wise binary operation given "pow", a and b.
  If that throws an error, then re-throw the error and stop.
Return output.
7.6.11. Element-wise unary operations
Compute the element-wise unary operation for the input tensor.
partial interface MLGraphBuilder {
  MLOperand abs(MLOperand input);
  MLOperand ceil(MLOperand input);
  MLOperand cos(MLOperand input);
  MLOperand exp(MLOperand input);
  MLOperand floor(MLOperand input);
  MLOperand log(MLOperand input);
  MLOperand neg(MLOperand input);
  MLOperand sin(MLOperand input);
  MLOperand tan(MLOperand input);
};
-
input : an MLOperand. The input tensor.
Returns: an MLOperand. The output tensor that contains the result of element-wise unary operation of the input tensor. The shape of the output tensor is the same as the shape of the input tensor.
-
abs : Compute the absolute value of the input tensor, element-wise.
-
ceil : Compute the ceiling of the input tensor, element-wise.
-
cos : Compute the cosine of the input tensor, element-wise.
-
exp : Compute the exponential of the input tensor, element-wise.
-
floor : Compute the floor of the input tensor, element-wise.
-
log : Compute the natural logarithm of the input tensor, element-wise.
-
neg : Compute the numerical negative value of the input tensor, element-wise.
-
sin : Compute the sine of the input tensor, element-wise.
-
tan : Compute the tangent of the input tensor, element-wise.
To create element-wise unary operation given op and input, run the following steps:
Assert: op is one of "abs", "ceil", "cos", "exp", "floor", "log", "neg", "sin", "tan".
If input is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
Let kind be "output".
Let descriptor be a new MLOperandDescriptor.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
  Let output be the result of invoking the copy MLOperand steps given input.
  Make a request to the underlying platform to:
    Let opImpl be an implementation-defined platform operator for the unary operation op.
    Store a reference of opImpl in output.[[operator]].
    Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
    Store a reference to outputImpl in output.[[operand]].
  Connect input.[[operand]] as input to opImpl.
  Connect output.[[operand]] as output to opImpl.
Return output.
The element-wise unary operation algorithms invoke the create element-wise unary operation steps as follows.
The abs(input) steps are:
Let output be the result of running the create element-wise unary operation given "abs" and input.
If that throws an error, then re-throw the error and stop.
Return output.
The ceil(input) steps are:
Let output be the result of running the create element-wise unary operation given "ceil" and input.
If that throws an error, then re-throw the error and stop.
Return output.
The cos(input) steps are:
Let output be the result of running the create element-wise unary operation given "cos" and input.
If that throws an error, then re-throw the error and stop.
Return output.
The exp(input) steps are:
Let output be the result of running the create element-wise unary operation given "exp" and input.
If that throws an error, then re-throw the error and stop.
Return output.
The floor(input) steps are:
Let output be the result of running the create element-wise unary operation given "floor" and input.
If that throws an error, then re-throw the error and stop.
Return output.
The log(input) steps are:
Let output be the result of running the create element-wise unary operation given "log" and input.
If that throws an error, then re-throw the error and stop.
Return output.
The neg(input) steps are:
Let output be the result of running the create element-wise unary operation given "neg" and input.
If that throws an error, then re-throw the error and stop.
Return output.
The sin(input) steps are:
Let output be the result of running the create element-wise unary operation given "sin" and input.
If that throws an error, then re-throw the error and stop.
Return output.
The tan(input) steps are:
Let output be the result of running the create element-wise unary operation given "tan" and input.
If that throws an error, then re-throw the error and stop.
Return output.
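The numeric meaning of the nine element-wise unary operations can be sketched in plain JavaScript. This is only an illustration on flat arrays of numbers, not the WebNN API: the names unaryOps and applyUnary are hypothetical, and the sketch ignores tensor shapes and data types.

```javascript
// Reference table of the nine element-wise unary operations on plain numbers.
const unaryOps = {
  abs: Math.abs, ceil: Math.ceil, cos: Math.cos, exp: Math.exp, floor: Math.floor,
  log: Math.log, neg: (v) => -v, sin: Math.sin, tan: Math.tan,
};

// Hypothetical helper: applies the named operation to every element of `values`.
// The result always has the same length as the input, mirroring the rule that
// the output shape equals the input shape.
function applyUnary(op, values) {
  if (!Object.prototype.hasOwnProperty.call(unaryOps, op)) {
    throw new TypeError(`Unsupported unary operation: ${op}`);
  }
  const fn = unaryOps[op];
  return Array.from(values, (v) => fn(v));
}
```

For instance, applyUnary("abs", [-1, 2, -3]) yields an array of the same length holding the absolute values.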
7.6.12. The elu() method
Calculate the exponential linear unit function (ELU) on the input tensor element-wise. The calculation follows the expression max(0, x) + alpha * (exp(min(0, x)) - 1).
dictionary MLEluOptions {
  float alpha = 1;
};

partial interface MLGraphBuilder {
  MLOperand elu(MLOperand input, optional MLEluOptions options = {});
  MLActivation elu(optional MLEluOptions options = {});
};
The behavior of this operation can be generically emulated using other operations as follows. However, user agents typically have a more efficient implementation for it, so its use is encouraged from a performance standpoint.
return builder.add(
  builder.max(builder.constant(0), x),
  builder.mul(
    builder.constant(options.alpha),
    builder.sub(
      builder.exp(builder.min(builder.constant(0), x)),
      builder.constant(1))));
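To make the ELU expression concrete, the following is a minimal numeric sketch in plain JavaScript. It works on an array of numbers rather than on MLOperand objects, and the function name eluReference is hypothetical.

```javascript
// Hypothetical reference: evaluates max(0, x) + alpha * (exp(min(0, x)) - 1)
// for every element of `values`. alpha defaults to 1, as in MLEluOptions.
function eluReference(values, alpha = 1) {
  return Array.from(values, (x) =>
    Math.max(0, x) + alpha * (Math.exp(Math.min(0, x)) - 1));
}
```

Non-negative inputs pass through unchanged, while negative inputs are mapped into the range (-alpha, 0).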
To check ELU options given options, run the following steps:
If options is not an object that implements MLEluOptions, then return false.
If options.alpha is undefined, set options.alpha to 1.
Else if options.alpha is not a numeric type, then return false.
Return true.
7.6.12.1. The elu(input, options) method
- input : an MLOperand. The input tensor.
- options : an optional MLEluOptions. The optional parameters of the operation.
    - alpha : a float scalar multiplier, default to 1.
Returns:
- an MLOperand. The output tensor of the same shape as input.
The elu(input, options) method steps are:
Let input be the first argument.
Let options be the second argument.
If running the check ELU options steps with options returns false, then throw a "TypeError" DOMException and abort these steps.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
    Let output be the result of invoking the copy MLOperand steps given input.
    Make a request to the underlying platform to:
        Let opImpl be an implementation-defined platform operator for the ELU operation, given options.
        Store a reference of opImpl in output.[[operator]].
        Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
        Store a reference to outputImpl in output.[[operand]].
    Connect input.[[operand]] as input to opImpl.
    Connect output.[[operand]] as output to opImpl.
Return output.
7.6.12.2. The elu(options) method
- options : an optional MLEluOptions. The optional parameters of the operation.
    - alpha : a float scalar multiplier, default to 1.
Returns:
- an MLActivation. The activation function representing the elu operation.
The elu(options) method steps are:
Let options be the first argument.
If options is undefined, let options be a new MLEluOptions object.
If running the check ELU options steps with options returns false, then throw a "TypeError" DOMException and abort these steps.
Let op be the result of invoking the create MLActivation steps with "elu" and options.
Return op.
7.6.13. The gemm() method
Calculate the general matrix multiplication of the Basic Linear Algebra Subprograms. The calculation follows the expression alpha * A * B + beta * C, where A is a 2-D tensor with shape [M, K] or [K, M], B is a 2-D tensor with shape [K, N] or [N, K], and C is broadcastable to the shape [M, N]. A and B may optionally be transposed prior to the calculation.
dictionary MLGemmOptions {
  MLOperand c;
  float alpha = 1.0;
  float beta = 1.0;
  boolean aTranspose = false;
  boolean bTranspose = false;
};

partial interface MLGraphBuilder {
  MLOperand gemm(MLOperand a, MLOperand b, optional MLGemmOptions options = {});
};
MLGemmOptions has the following members:
c, of type MLOperand
    An MLOperand. Specifies the third input tensor. It is either a scalar, or of the shape that is unidirectionally broadcastable to the shape [M, N] according to [numpy-broadcasting-rule]. When it is not specified, the computation is done as if c is a scalar 0.0.
alpha, of type float, defaulting to 1.0
    A float scalar multiplier for the first input.
beta, of type float, defaulting to 1.0
    A float scalar multiplier for the third input.
aTranspose, of type boolean, defaulting to false
    A boolean indicating if the first input should be transposed prior to calculating the output.
bTranspose, of type boolean, defaulting to false
    A boolean indicating if the second input should be transposed prior to calculating the output.
- a : an MLOperand. The first input 2-D tensor with shape [M, K] if aTranspose is false, or [K, M] if aTranspose is true.
- b : an MLOperand. The second input 2-D tensor with shape [K, N] if bTranspose is false, or [N, K] if bTranspose is true.
- options : an optional MLGemmOptions. The optional parameters of the operation.
Returns: an MLOperand. The output 2-D tensor of shape [M, N] that contains the calculated product of all the inputs.
The gemm(a, b, options) steps are:
If a or b is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If options is undefined, let options be an empty object.
If options.alpha is undefined, set it to 1.0.
If options.beta is undefined, set it to 1.0.
If options.aTranspose is undefined, set it to false.
If options.aTranspose is not false, set it to true.
If options.bTranspose is undefined, set it to false.
If options.bTranspose is not false, set it to true.
Let shapeA be a.[[descriptor]].dimensions and sizeA the size of shapeA.
Let shapeB be b.[[descriptor]].dimensions and sizeB the size of shapeB.
If sizeA is not 2 or sizeB is not 2, then throw a "DataError" DOMException and stop.
If options.aTranspose is true, then let shapeA be the reverse array of shapeA.
If options.bTranspose is true, then let shapeB be the reverse array of shapeB.
If shapeA[1] is not equal to shapeB[0], then throw a "DataError" DOMException and stop.
If options.c exists and is not unidirectionally broadcastable to the shape [shapeA[0], shapeB[1]] according to [numpy-broadcasting-rule], then throw a "DataError" DOMException and stop.
Type compatibility between a, b and options.c can also be checked.
Let desc be a new MLOperandDescriptor.
Set desc.dimensions to [shapeA[0], shapeB[1]].
Set desc.type to a.[[descriptor]].type.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
    Let output be the result of invoking the create MLOperand steps given this and desc.
    Make a request to the underlying platform to:
        Let opImpl be an implementation-defined platform operator for the GEMM operation, given options.
        Store a reference of opImpl in output.[[operator]].
        Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
        Store a reference to outputImpl in output.[[operand]].
    Connect a.[[operand]] and b.[[operand]] as inputs to opImpl.
    Connect output.[[operand]] as output to opImpl.
Return output.
The behavior of this operation can be generically emulated using other operations as follows. However, user agents typically have a more efficient implementation for it, so its use is encouraged from a performance standpoint.
if (options.aTranspose)
  a = builder.transpose(a);

if (options.bTranspose)
  b = builder.transpose(b);

let ab = builder.matmul(builder.mul(builder.constant(options.alpha), a), b);
// The third input is a member of options, so it is read as options.c.
return (options.c ? builder.add(ab, builder.mul(builder.constant(options.beta), options.c)) : ab);
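The arithmetic of alpha * A * B + beta * C, including the optional transposes and the broadcast of c, can be illustrated with a plain JavaScript sketch over nested arrays. The names transpose2d and gemmReference are hypothetical, and the broadcast handling covers only a scalar, a 1-D array of length 1 or N, and 2-D arrays of shape [1, N], [M, 1] or [M, N]; it is not the WebNN API.

```javascript
// Transpose a 2-D matrix given as an array of rows.
function transpose2d(m) {
  return m[0].map((_, j) => m.map((row) => row[j]));
}

// Hypothetical reference for gemm on nested arrays of numbers.
function gemmReference(a, b, options = {}) {
  const { c = 0, alpha = 1, beta = 1, aTranspose = false, bTranspose = false } = options;
  const A = aTranspose ? transpose2d(a) : a; // [M, K] after the optional transpose
  const B = bTranspose ? transpose2d(b) : b; // [K, N] after the optional transpose
  const M = A.length, K = A[0].length, N = B[0].length;
  if (B.length !== K) throw new RangeError("inner dimensions of a and b do not match");

  // Read the element of c that broadcasts to position [i, j] of the [M, N] output.
  const cAt = (i, j) => {
    if (typeof c === "number") return c;                                   // scalar
    if (typeof c[0] === "number") return c.length === 1 ? c[0] : c[j];     // 1-D
    const row = c.length === 1 ? c[0] : c[i];                              // 2-D
    return row.length === 1 ? row[0] : row[j];
  };

  const out = [];
  for (let i = 0; i < M; ++i) {
    const row = [];
    for (let j = 0; j < N; ++j) {
      let sum = 0;
      for (let k = 0; k < K; ++k) sum += A[i][k] * B[k][j];
      row.push(alpha * sum + beta * cAt(i, j));
    }
    out.push(row);
  }
  return out;
}
```

When c is omitted the sketch treats it as the scalar 0, which matches the stated default behavior.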
7.6.14. The gru() method
Gated Recurrent Unit [GRU] recurrent network uses an update, reset, and new gate to compute the output state that rolls into the output across the temporal sequence of the network.
enum MLGruWeightLayout {
  "zrn", // update-reset-new gate ordering
  "rzn"  // reset-update-new gate ordering
};

enum MLRecurrentNetworkDirection {
  "forward",
  "backward",
  "both"
};

dictionary MLGruOptions {
  MLOperand bias;
  MLOperand recurrentBias;
  MLOperand initialHiddenState;
  boolean resetAfter = true;
  boolean returnSequence = false;
  MLRecurrentNetworkDirection direction = "forward";
  MLGruWeightLayout layout = "zrn";
  sequence<MLActivation> activations;
};

partial interface MLGraphBuilder {
  sequence<MLOperand> gru(MLOperand input, MLOperand weight, MLOperand recurrentWeight,
                          unsigned long steps, unsigned long hiddenSize,
                          optional MLGruOptions options = {});
};
MLGruOptions has the following members:
bias, of type MLOperand
    An MLOperand. Specifies the 2-D input bias tensor of shape [num_directions, 3 * hidden_size]. The ordering of the bias vectors in the second dimension of the tensor shape is specified according to the layout argument.
recurrentBias, of type MLOperand
    An MLOperand. Specifies the 2-D recurrent bias tensor of shape [num_directions, 3 * hidden_size]. The ordering of the bias vectors in the second dimension of the tensor shape is specified according to the layout argument.
initialHiddenState, of type MLOperand
    An MLOperand. The 3-D initial hidden state tensor of shape [num_directions, batch_size, hidden_size]. When not specified, implementations SHOULD use a tensor filled with zero.
resetAfter, of type boolean, defaulting to true
    A boolean indicating whether to apply the reset gate after or before matrix multiplication. The default value is true.
returnSequence, of type boolean, defaulting to false
    A boolean indicating whether to also return the entire sequence with every output from each time step in it in addition to the output of the last time step. The default value is false.
direction, of type MLRecurrentNetworkDirection, defaulting to "forward"
    An MLRecurrentNetworkDirection. Specifies the processing direction of the input sequence. When set to "both", the size of the first dimension of the weight and the bias tensor shapes must be 2, and the input is processed in both directions.
layout, of type MLGruWeightLayout, defaulting to "zrn"
    An MLGruWeightLayout. The ordering of the weight and bias vectors for the internal gates of GRU, specifically the update (z), reset (r), and new (n) gate, as indicated in the second dimension of the weight and bias tensor shape. When not specified, the default layout is "zrn".
activations, of type sequence<MLActivation>
    A sequence of MLActivation. Specifies a pair of activation functions with the first function used for the update and reset gate, and the second used for the new gate. When not specified, implementations SHOULD use the pair of sigmoid ("sigmoid") and the hyperbolic tangent ("tanh") functions, respectively.
- input : an MLOperand. The input 3-D tensor of shape [steps, batch_size, input_size].
- weight : an MLOperand. The 3-D input weight tensor of shape [num_directions, 3 * hidden_size, input_size]. The ordering of the weight vectors in the second dimension of the tensor shape is specified according to the options.layout argument.
- recurrentWeight : an MLOperand. The 3-D recurrent weight tensor of shape [num_directions, 3 * hidden_size, hidden_size]. The ordering of the weight vectors in the second dimension of the tensor shape is specified according to the options.layout argument.
- steps : an unsigned long scalar. The number of time steps in the recurrent network. The value must be greater than 0.
- hiddenSize : an unsigned long scalar. The value of the third dimension of the cell output tensor shape. It indicates the number of features in the hidden state.
- options : an optional MLGruOptions. The optional parameters of the operation.
Returns: a sequence of MLOperand. The first element of the sequence is a 3-D tensor of shape [num_directions, batch_size, hidden_size], the cell output from the last time step of the network. Additionally, if options.returnSequence is set to true, the second element is the 4-D output tensor of shape [steps, num_directions, batch_size, hidden_size] containing every cell output from each time step in the temporal sequence.
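The output shapes described for gru() can be derived from the input shape and the options with a small helper. The sketch below is plain JavaScript with a hypothetical function name, gruOutputShapes; it only computes shapes and does not call the WebNN API.

```javascript
// Hypothetical helper: returns the shapes of the operands that gru() yields.
// inputShape is [steps, batch_size, input_size].
function gruOutputShapes(inputShape, hiddenSize, options = {}) {
  const [steps, batchSize] = inputShape;
  // "both" processes the sequence in two directions; otherwise there is one.
  const numDirections = options.direction === "both" ? 2 : 1;
  // First element: cell output from the last time step.
  const shapes = [[numDirections, batchSize, hiddenSize]];
  // Optional second element: every cell output from each time step.
  if (options.returnSequence) {
    shapes.push([steps, numDirections, batchSize, hiddenSize]);
  }
  return shapes;
}
```

With an input of shape [5, 2, 8], a hiddenSize of 4, and the default options, the helper reports a single output of shape [1, 2, 4].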
The gru(input, weight, recurrentWeight, steps, hiddenSize, options) steps are:
If input, weight or recurrentWeight is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If the rank of input, weight or recurrentWeight is not 3, then throw a "DataError" DOMException and stop.
If options is undefined, let options be an empty object.
If options.bias exists:
    If it is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
    If its rank is not 2, then throw a "DataError" DOMException and stop.
If options.recurrentBias exists:
    If it is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
    If its rank is not 2, then throw a "DataError" DOMException and stop.
If options.initialHiddenState exists:
    If it is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
    If its rank is not 3, then throw a "DataError" DOMException and stop.
If options.resetAfter is undefined, set it to true.
If options.returnSequence is undefined, set it to false.
If options.direction is undefined, set it to "forward".
If options.direction is not one of MLRecurrentNetworkDirection, then throw a "TypeError" DOMException and stop.
If options.layout is undefined, set it to "zrn".
If options.layout is not one of MLGruWeightLayout, then throw a "TypeError" DOMException and stop.
If options.activations exists and is not an array of size 2, or if any of its elements is not an instance of MLActivation, then throw a "TypeError" DOMException and stop.
If steps is not a number or it is 0, then throw a "TypeError" DOMException and stop.
Let output be an empty sequence of MLOperand objects.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
    Make a request to the underlying platform to:
        Let opImpl be an implementation-defined platform operator for "gru", given weight, recurrentWeight, steps, hiddenSize and options as parameters.
    Connect input.[[operand]] as input to opImpl.
    Connect output as output to opImpl.
Return output.
The behavior of this operation can be generically emulated from the usage of other operations as follows. However, user agents typically have a more efficient implementation for it, therefore its usage is encouraged from the performance standpoint.
const numDirections = (options.direction == "both" ? 2 : 1);
let hiddenState = options.initialHiddenState;

if (!hiddenState) {
  const desc = { type: 'float32', dimensions: [numDirections, 1, hiddenSize] };
  const totalSize = numDirections * hiddenSize;
  hiddenState = builder.constant(desc, new Float32Array(totalSize).fill(0));
}

let sequence = null;
let currentWeight = [];
let currentRecurrentWeight = [];
let currentBias = [];
let currentRecurrentBias = [];

for (let dir = 0; dir < numDirections; ++dir) {
  currentWeight.push(builder.squeeze(builder.slice(weight, [dir, 0, 0], [1, 3 * hidden_size, input_size]), { axes: [0] }));
  currentRecurrentWeight.push(builder.squeeze(builder.slice(recurrentWeight, [dir, 0, 0], [1, 3 * hidden_size, hidden_size]), { axes: [0] }));
  currentBias.push(options.bias ? (builder.squeeze(builder.slice(options.bias, [dir, 0], [1, 3 * hidden_size]), { axes: [0] })) : null);
  currentRecurrentBias.push(options.recurrentBias ? (builder.squeeze(builder.slice(options.recurrentBias, [dir, 0], [1, 3 * hidden_size]), { axes: [0] })) : null);
}

for (let step = 0; step < steps; ++step) {
  let currentHidden = [];
  let currentOutput = null;

  for (let dir = 0; dir < numDirections; ++dir) {
    currentHidden.push(builder.squeeze(builder.slice(hiddenState, [dir, 0, 0], [1, batch_size, hidden_size]), { axes: [0] }));
  }

  for (let dir = 0; dir < numDirections; ++dir) {
    let slice = (dir == 1 || options.direction == "backward" ? steps - step - 1 : step);
    let currentInput = builder.squeeze(builder.slice(input, [slice, 0, 0], [1, batch_size, input_size]), { axes: [0] });

    let result = builder.reshape(
      builder.gruCell(
        currentInput, currentWeight[dir], currentRecurrentWeight[dir],
        currentHidden[dir], hiddenSize,
        { bias: currentBias[dir], recurrentBias: currentRecurrentBias[dir],
          resetAfter: options.resetAfter, layout: options.layout,
          activations: options.activations }),
      [1, null, hiddenSize]);

    currentOutput = (currentOutput ? builder.concat([currentOutput, result], 0) : result);
  }

  hiddenState = currentOutput;

  if (options.returnSequence) {
    currentOutput = builder.reshape(currentOutput, [1, numDirections, null, hiddenSize]);
    sequence = (sequence ? builder.concat([sequence, currentOutput], 0) : currentOutput);
  }
}

return (sequence ? [hiddenState, sequence] : [hiddenState]);
7.6.15. The gruCell() method
A single time step of the Gated Recurrent Unit [GRU] recurrent network using an update gate and a reset gate to compute the hidden state that rolls into the output across the temporal sequence of a recurrent network.
dictionary MLGruCellOptions {
  MLOperand bias;
  MLOperand recurrentBias;
  boolean resetAfter = true;
  MLGruWeightLayout layout = "zrn";
  sequence<MLActivation> activations;
};

partial interface MLGraphBuilder {
  MLOperand gruCell(MLOperand input, MLOperand weight, MLOperand recurrentWeight,
                    MLOperand hiddenState, unsigned long hiddenSize,
                    optional MLGruCellOptions options = {});
};
MLGruCellOptions has the following members:
bias, of type MLOperand
    An MLOperand. Specifies the 1-D input bias tensor of shape [3 * hidden_size]. The ordering of the bias vectors in the first dimension of the tensor shape is specified according to the layout argument.
recurrentBias, of type MLOperand
    An MLOperand. Specifies the 1-D recurrent bias tensor of shape [3 * hidden_size]. The ordering of the bias vectors in the first dimension of the tensor shape is specified according to the layout argument.
resetAfter, of type boolean, defaulting to true
    A boolean indicating whether to apply the reset gate after or before matrix multiplication. The default value is true.
layout, of type MLGruWeightLayout, defaulting to "zrn"
    An MLGruWeightLayout. The ordering of the weight and bias vectors for the internal gates of GRU, specifically the update (z), reset (r), and new (n) gate, as indicated in the first dimension of the weight and bias tensor shapes. When not specified, the default layout is "zrn".
activations, of type sequence<MLActivation>
    A sequence of MLActivation. Specifies a pair of activation functions with the first function used for the update and reset gate, and the second used for the new gate. When not specified, implementations SHOULD use the pair of sigmoid ("sigmoid") and the hyperbolic tangent ("tanh") functions, respectively.
- input : an MLOperand. The input 2-D tensor of shape [batch_size, input_size].
- weight : an MLOperand. The 2-D input weight tensor of shape [3 * hidden_size, input_size]. The ordering of the weight vectors in the first dimension of the tensor shape is specified according to the options.layout argument.
- recurrentWeight : an MLOperand. The 2-D recurrent weight tensor of shape [3 * hidden_size, hidden_size]. The ordering of the weight vectors in the first dimension of the tensor shape is specified according to the options.layout argument.
- hiddenState : an MLOperand. The 2-D input hidden state tensor of shape [batch_size, hidden_size].
- hiddenSize : an unsigned long scalar. The value of the second dimension of the output tensor shape. It indicates the number of features in the hidden state.
- options : an optional MLGruCellOptions. The optional parameters of the operation.
Returns: an MLOperand. The 2-D tensor of shape [batch_size, hidden_size], the cell output hidden state of a single time step of the recurrent network.
The gruCell(input, weight, recurrentWeight, hiddenState, hiddenSize, options) steps are:
If input, weight or recurrentWeight is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If the rank of input, weight or recurrentWeight is not 2, then throw a "DataError" DOMException and stop.
If options is undefined, let options be an empty object.
If options.bias exists:
    If it is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
    If its rank is not 1, then throw a "DataError" DOMException and stop.
If options.recurrentBias exists:
    If it is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
    If its rank is not 1, then throw a "DataError" DOMException and stop.
If options.resetAfter is undefined, set it to true.
If options.layout is undefined, set it to "zrn".
If options.layout is not one of MLGruWeightLayout, then throw a "TypeError" DOMException and stop.
If options.activations exists and is not an array of size 2, or if any of its elements is not an instance of MLActivation, then throw a "TypeError" DOMException and stop.
Let desc be a new MLOperandDescriptor.
Set desc.dimensions to [input.dimensions[0], hiddenSize].
Set desc.type to input.[[descriptor]].type.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
    Let output be the result of invoking the create MLOperand steps given this and desc.
    Make a request to the underlying platform to:
        Let opImpl be an implementation-defined platform operator for "gruCell", given weight, recurrentWeight, hiddenState, hiddenSize and options as parameters.
        Store a reference of opImpl in output.[[operator]].
        Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
        Store a reference to outputImpl in output.[[operand]].
    Connect input.[[operand]] as input to opImpl.
    Connect output.[[operand]] as output to opImpl.
Return output.
The behavior of this operation can be generically emulated via other operations as shown below, when the weight layout is the default "zrn" layout, and the activation functions of the update/reset gate and new gate are of the operator types sigmoid and tanh respectively.
const one = builder.constant(1);
const zero = builder.constant(0);

// update gate (z)
let z = builder.sigmoid(
  builder.add(
    builder.add(
      (options.bias ? builder.slice(options.bias, [0], [hiddenSize]) : zero),
      (options.recurrentBias ? builder.slice(options.recurrentBias, [0], [hiddenSize]) : zero)
    ),
    builder.add(
      builder.matmul(
        input,
        builder.transpose(builder.slice(weight, [0, 0], [hiddenSize, input_size]))
      ),
      builder.matmul(
        hiddenState,
        builder.transpose(builder.slice(recurrentWeight, [0, 0], [hiddenSize, hidden_size]))
      )
    )
  )
);

// reset gate (r)
let r = builder.sigmoid(
  builder.add(
    builder.add(
      (options.bias ? builder.slice(options.bias, [hiddenSize], [hiddenSize]) : zero),
      (options.recurrentBias ? builder.slice(options.recurrentBias, [hiddenSize], [hiddenSize]) : zero)
    ),
    builder.add(
      builder.matmul(
        input,
        builder.transpose(builder.slice(weight, [hiddenSize, 0], [hiddenSize, input_size]))
      ),
      builder.matmul(
        hiddenState,
        builder.transpose(builder.slice(recurrentWeight, [hiddenSize, 0], [hiddenSize, hidden_size]))
      )
    )
  )
);

// new gate (n)
let n;
if (options.resetAfter) {
  n = builder.tanh(
    builder.add(
      (options.bias ? builder.slice(options.bias, [2 * hiddenSize], [hiddenSize]) : zero),
      builder.add(
        builder.matmul(
          input,
          builder.transpose(builder.slice(weight, [2 * hiddenSize, 0], [hiddenSize, input_size]))
        ),
        builder.mul(
          r,
          builder.add(
            (options.recurrentBias ? builder.slice(options.recurrentBias, [2 * hiddenSize], [hiddenSize]) : zero),
            builder.matmul(
              hiddenState,
              builder.transpose(builder.slice(recurrentWeight, [2 * hiddenSize, 0], [hiddenSize, hidden_size]))
            )
          )
        )
      )
    )
  );
} else {
  n = builder.tanh(
    builder.add(
      builder.add(
        (options.bias ? builder.slice(options.bias, [2 * hiddenSize], [hiddenSize]) : zero),
        (options.recurrentBias ? builder.slice(options.recurrentBias, [2 * hiddenSize], [hiddenSize]) : zero)
      ),
      builder.add(
        builder.matmul(
          input,
          builder.transpose(builder.slice(weight, [2 * hiddenSize, 0], [hiddenSize, input_size]))
        ),
        builder.matmul(
          builder.mul(r, hiddenState),
          builder.transpose(builder.slice(recurrentWeight, [2 * hiddenSize, 0], [hiddenSize, hidden_size]))
        )
      )
    )
  );
}

// compute the new hidden state
return builder.add(builder.mul(z, hiddenState), builder.mul(n, builder.sub(one, z)));
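The same computation can be followed on plain numbers. The sketch below is a simplified reference for a single sample (batch size 1) with the default "zrn" layout and the sigmoid/tanh activation pair. The names sigmoid, matVec and gruCellReference are hypothetical, and the code does not use the WebNN API.

```javascript
const sigmoid = (v) => 1 / (1 + Math.exp(-v));

// Multiply a matrix (array of rows) by a vector.
function matVec(matrix, vec) {
  return matrix.map((row) => row.reduce((sum, w, i) => sum + w * vec[i], 0));
}

// x: [input_size], h: [hidden_size], weight: 3 * hidden_size rows of input_size,
// recurrentWeight: 3 * hidden_size rows of hidden_size, biases: 3 * hidden_size values.
function gruCellReference(x, weight, recurrentWeight, h, hiddenSize, options = {}) {
  const resetAfter = options.resetAfter !== false; // defaults to true
  const bias = options.bias || new Array(3 * hiddenSize).fill(0);
  const rBias = options.recurrentBias || new Array(3 * hiddenSize).fill(0);
  // Slice out the block for gate g, where 0 = update (z), 1 = reset (r), 2 = new (n).
  const gate = (arr, g) => arr.slice(g * hiddenSize, (g + 1) * hiddenSize);

  const z = matVec(gate(weight, 0), x).map((v, i) =>
    sigmoid(v + matVec(gate(recurrentWeight, 0), h)[i] + gate(bias, 0)[i] + gate(rBias, 0)[i]));
  const r = matVec(gate(weight, 1), x).map((v, i) =>
    sigmoid(v + matVec(gate(recurrentWeight, 1), h)[i] + gate(bias, 1)[i] + gate(rBias, 1)[i]));

  const wxN = matVec(gate(weight, 2), x);
  let n;
  if (resetAfter) {
    // Reset gate applied after the recurrent matrix multiplication.
    const rhN = matVec(gate(recurrentWeight, 2), h);
    n = wxN.map((v, i) => Math.tanh(v + gate(bias, 2)[i] + r[i] * (rhN[i] + gate(rBias, 2)[i])));
  } else {
    // Reset gate applied to the hidden state before the multiplication.
    const rhN = matVec(gate(recurrentWeight, 2), h.map((v, i) => r[i] * v));
    n = wxN.map((v, i) => Math.tanh(v + gate(bias, 2)[i] + gate(rBias, 2)[i] + rhN[i]));
  }

  // New hidden state: z * h + n * (1 - z).
  return z.map((zi, i) => zi * h[i] + n[i] * (1 - zi));
}
```

With all weights and biases set to zero, both gates evaluate to 0.5 and the new gate to 0, so the new hidden state is half of the previous one.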
7.6.16. The hardSigmoid() method
Calculate the non-smooth hard sigmoid function on the input tensor, used instead of the sigmoid function for faster computation.
dictionary MLHardSigmoidOptions {
  float alpha = 0.2;
  float beta = 0.5;
};

partial interface MLGraphBuilder {
  MLOperand hardSigmoid(MLOperand input, optional MLHardSigmoidOptions options = {});
  MLActivation hardSigmoid(optional MLHardSigmoidOptions options = {});
};
The behavior of this operation can be generically emulated using other operations as follows. However, user agents typically have a more efficient implementation for it, so its use is encouraged from a performance standpoint.
return builder.max(
  builder.min(
    builder.add(
      builder.mul(builder.constant(options.alpha), x),
      builder.constant(options.beta)),
    builder.constant(1)),
  builder.constant(0));
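As a numeric illustration, the emulation above amounts to clamping alpha * x + beta to the range [0, 1]. The following plain JavaScript sketch uses a hypothetical function name, hardSigmoidReference, and operates on an array of numbers rather than on MLOperand objects.

```javascript
// Hypothetical reference: max(min(alpha * x + beta, 1), 0) for each element.
// The defaults mirror MLHardSigmoidOptions (alpha = 0.2, beta = 0.5).
function hardSigmoidReference(values, { alpha = 0.2, beta = 0.5 } = {}) {
  return Array.from(values, (x) => Math.max(Math.min(alpha * x + beta, 1), 0));
}
```

With the default options, inputs at or below -2.5 saturate to 0 and inputs at or above 2.5 saturate to 1.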
MLHardSigmoidOptions has the following members:
alpha, of type float, defaulting to 0.2
    A float scalar multiplier. The default value is 0.2.
beta, of type float, defaulting to 0.5
    A float scalar addition. The default value is 0.5.
To check hard-sigmoid options given options, run the following steps:
If options is not an object that implements MLHardSigmoidOptions, then return false.
If options.alpha is undefined, set options.alpha to 0.2.
Else if options.alpha is not a numeric type, then return false.
If options.beta is undefined, set options.beta to 0.5.
Else if options.beta is not a numeric type, then return false.
Return true.
7.6.16.1. The hardSigmoid(input, options) method
- input : an MLOperand. The input tensor.
- options : an optional MLHardSigmoidOptions. The optional parameters of the operation.
Returns:
- an MLOperand. The output tensor of the same shape as input.
The hardSigmoid(input, options) method steps are:
Let input be the first argument.
Let options be the second argument.
If running the check hard-sigmoid options steps with options returns false, then throw a "TypeError" DOMException and abort these steps.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
    Let output be the result of invoking the copy MLOperand steps given input.
    Make a request to the underlying platform to:
        Let opImpl be an implementation-defined platform operator for the hard sigmoid operation, given options.
        Store a reference of opImpl in output.[[operator]].
        Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
        Store a reference to outputImpl in output.[[operand]].
    Connect input.[[operand]] as input to opImpl.
    Connect output.[[operand]] as output to opImpl.
Return output.
7.6.16.2. The hardSigmoid(options) method
- options : an optional MLHardSigmoidOptions. The optional parameters of the operation.
Returns:
- an MLActivation. The activation function representing the hard sigmoid operation.
The hardSigmoid(options) method steps are:
Let options be the first argument.
If running the check hard-sigmoid options steps with options returns false, then throw a "TypeError" DOMException and abort these steps.
Let op be the result of invoking the create MLActivation steps with "hardSigmoid" and options.
    If that throws an error, re-throw the error and abort these steps.
Return op.
7.6.17. The hardSwish() method
Computes the nonlinear function y = x * max(0, min(6, (x + 3))) / 6 that is introduced by [MobileNetV3] on the input tensor element-wise.
partial interface MLGraphBuilder {
  MLOperand hardSwish(MLOperand input);
  MLActivation hardSwish();
};
The behavior of this operation can be generically emulated from the usage of other operations as follows. However, user agents typically have a more efficient implementation for it, therefore its usage is encouraged from the performance standpoint.
return builder.div(builder.mul(x, builder.max(builder.constant(0), builder.min(builder.constant(6), builder.add(x, builder.constant(3))))), builder.constant(6));
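For checking results outside of a graph, the same formula can be evaluated element-wise in plain JavaScript. This is a minimal sketch; hardSwishReference is a hypothetical helper and is not part of the API.

```javascript
// Element-wise reference for y = x * max(0, min(6, x + 3)) / 6.
// Operates on any array-like of numbers and returns a plain Array.
function hardSwishReference(values) {
  return Array.from(values, (x) => (x * Math.max(0, Math.min(6, x + 3))) / 6);
}
```

Inputs at or below -3 map to zero and inputs at or above 3 pass through unchanged, which is the clamping behavior the expression encodes.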
7.6.17.1. The hardSwish(input) method
Arguments:
input: an MLOperand. The input tensor.
Returns: an MLOperand. The output tensor of the same shape as input.
The hardSwish(input) method steps are:
Let input be the first argument.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
Let output be the result of invoking the copy MLOperand steps given input.
Make a request to the underlying platform to:
Let opImpl be an implementation-defined platform operator for the hard-swish operation.
Store a reference of opImpl in output.[[operator]].
Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
Store a reference to outputImpl in output.[[operand]].
Connect input.[[operand]] as input to opImpl.
Connect output.[[operand]] as output to opImpl.
Return output.
7.6.17.2. The hardSwish() method
Arguments: None.
Returns: an MLActivation. The activation function representing the hard-swish operation.
The hardSwish() method steps are:
Let op be the result of invoking the create MLActivation steps with "hardSwish".
If that throws an error, re-throw the error and abort these steps.
Return op.
7.6.18. The instanceNormalization() method
Normalize the input features using [Instance-Normalization].
Unlike
dictionary MLInstanceNormalizationOptions {
  MLOperand scale;
  MLOperand bias;
  float epsilon = 1e-5;
  MLInputOperandLayout layout = "nchw";
};

partial interface MLGraphBuilder {
  MLOperand instanceNormalization(MLOperand input, optional MLInstanceNormalizationOptions options = {});
};
MLInstanceNormalizationOptions members are:
scale, of type MLOperand
An MLOperand. Specifies the 1-D tensor of the scaling values whose length is equal to the number of channels, i.e. the size of the feature dimension of the input. For example, for an input tensor with nchw layout, the length is the value of input.[[descriptor]].dimensions[1].
bias, of type MLOperand
An MLOperand. Specifies the 1-D tensor of the bias values whose length is equal to the size of the feature dimension of the input. For example, for an input tensor with nchw layout, the length is the value of input.[[descriptor]].dimensions[1].
epsilon, of type float, defaulting to 1e-5
A float scalar. Specifies a small value to prevent computational error due to divide-by-zero.
layout, of type MLInputOperandLayout, defaulting to "nchw"
An MLInputOperandLayout. Specifies the layout format of the input.
Arguments:
input: an MLOperand. The input 4-D tensor.
options: an optional MLInstanceNormalizationOptions. The optional parameters of the operation.
Returns: an MLOperand. The instance-normalized 4-D tensor of the same shape as the input tensor.
The instanceNormalization(input, options) steps are:
If input is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If the rank of input is not 4, then throw a "DataError" DOMException and stop.
If options is undefined, let options be an empty object.
If options.scale is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If the rank of options.scale is not equal to the size of the channel dimension of input, then throw a "DataError" DOMException and stop.
If options.bias is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If the rank of options.bias is not equal to the size of the channel dimension of input, then throw a "DataError" DOMException and stop.
If options.epsilon is undefined, let it be 0.00001.
If options.layout is undefined, let it be "nchw". Otherwise if options.layout is not one of MLInputOperandLayout, then throw a "DataError" DOMException and stop.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
Let output be the result of invoking the copy MLOperand steps given input.
Make a request to the underlying platform to:
Let opImpl be an implementation-defined platform operator for the instance normalization operation, given options.
Store a reference of opImpl in output.[[operator]].
Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
Store a reference to outputImpl in output.[[operand]].
Connect input.[[operand]] as input to opImpl.
Connect output.[[operand]] as output to opImpl.
Return output.
The behavior of this operation when the input tensor is 4-D of the "nchw" layout can be generically emulated from the usage of other operations as follows. However, user agents typically have a more efficient implementation for it, therefore its usage is encouraged from the performance standpoint.
// The mean reductions happen over the spatial dimensions of the input
// e.g. axis 2 and 3 of the input tensor.
const reduceOptions = { axes: [2, 3], keepDimensions: true };
const mean = builder.reduceMean(input, reduceOptions);
const variance = builder.reduceMean(
  builder.pow(builder.sub(input, mean), builder.constant(2)),
  reduceOptions);
// The scale and bias values are applied per input feature
// e.g. axis 1 of the input tensor.
const shape = [1, null, 1, 1];
return builder.add(
  builder.mul(
    builder.reshape(options.scale, shape),
    builder.div(
      builder.sub(input, mean),
      builder.pow(
        builder.add(variance, builder.constant(options.epsilon)),
        builder.constant(0.5)))),
  builder.reshape(options.bias, shape));
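To make the arithmetic concrete, the same computation can be written over a flat array in plain JavaScript. This is a sketch under stated assumptions: instanceNormReference is a hypothetical helper, the tensor is stored as a flat row-major array in "nchw" order, and a missing scale or bias is treated as 1 or 0 respectively.

```javascript
// Plain-JavaScript reference for instance normalization of an "nchw" tensor.
// data: flat row-major array; shape: [n, c, h, w]; scale/bias: length-c arrays.
function instanceNormReference(data, shape, options = {}) {
  const [n, c, h, w] = shape;
  const { scale, bias, epsilon = 1e-5 } = options;
  const spatial = h * w;
  const out = new Float32Array(data.length);
  for (let b = 0; b < n; ++b) {
    for (let ch = 0; ch < c; ++ch) {
      const start = (b * c + ch) * spatial;
      // Mean and variance are computed per sample and per channel
      // over the spatial dimensions (axes 2 and 3).
      let mean = 0;
      for (let i = 0; i < spatial; ++i) mean += data[start + i];
      mean /= spatial;
      let variance = 0;
      for (let i = 0; i < spatial; ++i) variance += (data[start + i] - mean) ** 2;
      variance /= spatial;
      const denom = Math.sqrt(variance + epsilon);
      const s = scale ? scale[ch] : 1;
      const o = bias ? bias[ch] : 0;
      for (let i = 0; i < spatial; ++i) {
        out[start + i] = (s * (data[start + i] - mean)) / denom + o;
      }
    }
  }
  return out;
}
```

A channel whose values are all equal normalizes to its bias value, which shows why epsilon is needed: without it the division would be 0 / 0.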
7.6.19. The leakyRelu() method
Calculate the leaky version of rectified linear function on the input tensor element-wise. The calculation follows the expression max(0, x) + alpha ∗ min(0, x).
dictionary MLLeakyReluOptions {
  float alpha = 0.01;
};

partial interface MLGraphBuilder {
  MLOperand leakyRelu(MLOperand input, optional MLLeakyReluOptions options = {});
  MLActivation leakyRelu(optional MLLeakyReluOptions options = {});
};
The behavior of this operation can be generically emulated from the usage of other operations as follows. However, user agents typically have a more efficient implementation for it, therefore its usage is encouraged from the performance standpoint.
return builder.add(builder.max(builder.constant(0), x), builder.mul(builder.constant(options.alpha), builder.min(builder.constant(0), x)));
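The expression max(0, x) + alpha ∗ min(0, x) can also be evaluated directly on numbers, which is handy for spot-checking graph output. A minimal sketch follows; leakyReluReference is a hypothetical helper, not part of the API.

```javascript
// Element-wise reference for max(0, x) + alpha * min(0, x).
// alpha defaults to 0.01, matching the MLLeakyReluOptions default.
function leakyReluReference(values, { alpha = 0.01 } = {}) {
  return Array.from(values, (x) => Math.max(0, x) + alpha * Math.min(0, x));
}
```

Positive inputs pass through unchanged, while negative inputs are scaled by alpha.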
MLLeakyReluOptions has the following members:
alpha, of type float, defaulting to 0.01
A float scalar multiplier. The default value is 0.01.
To check leaky-relu options given options, run the following steps:
If options is not an object that implements MLLeakyReluOptions, then return false.
If options.alpha is undefined, set options.alpha to 0.01. Else if options.alpha is not a numeric type, then return false.
Return true.
7.6.19.1. The leakyRelu(input, options) method
Arguments:
input: an MLOperand. The input tensor.
options: an optional MLLeakyReluOptions. The optional parameters of the operation.
Returns: an MLOperand. The output tensor of the same shape as input.
The leakyRelu(input, options) method steps are:
Let input be the first argument.
Let options be the second argument.
If options is undefined, let options be a new MLLeakyReluOptions object.
If running the check leaky-relu options steps with options returns false, then throw a "TypeError" DOMException and abort these steps.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
Let output be the result of invoking the copy MLOperand steps given input.
Make a request to the underlying platform to:
Let opImpl be an implementation-defined platform operator for the Leaky RELU operation, given options.
Store a reference of opImpl in output.[[operator]].
Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
Store a reference to outputImpl in output.[[operand]].
Connect input.[[operand]] as input to opImpl.
Connect output.[[operand]] as output to opImpl.
Return output.
7.6.19.2. The leakyRelu(options) method
Arguments:
options: an optional MLLeakyReluOptions. The optional parameters of the operation.
Returns: an MLActivation. The activation function representing the leaky relu operation.
The leakyRelu(options) method steps are:
Let options be the first argument.
If options is undefined, let options be a new MLLeakyReluOptions object.
If running the check leaky-relu options steps with options returns false, then throw a "TypeError" DOMException and abort these steps.
Let op be the result of invoking the create MLActivation steps with "leakyRelu" and options.
If that throws an error, re-throw the error and abort these steps.
Return op.
7.6.20. The linear() method
Calculate a linear function y = alpha * x + beta on the input tensor.
dictionary MLLinearOptions {
  float alpha = 1;
  float beta = 0;
};

partial interface MLGraphBuilder {
  MLOperand linear(MLOperand input, optional MLLinearOptions options = {});
  MLActivation linear(optional MLLinearOptions options = {});
};
The behavior of this operation can be generically emulated from the usage of other operations as follows. However, user agents typically have a more efficient implementation for it, therefore its usage is encouraged from the performance standpoint.
return builder.add(builder.mul(x, builder.constant(options.alpha)), builder.constant(options.beta));
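As a quick numeric cross-check of y = alpha * x + beta, the operation can be sketched on plain numbers. The helper name linearReference is hypothetical; the defaults mirror those of MLLinearOptions.

```javascript
// Element-wise reference for y = alpha * x + beta.
// With the defaults (alpha = 1, beta = 0) the operation is the identity.
function linearReference(values, { alpha = 1, beta = 0 } = {}) {
  return Array.from(values, (x) => alpha * x + beta);
}
```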
MLLinearOptions has the following members:
alpha, of type float, defaulting to 1
A float scalar multiplier. The default value is 1.
beta, of type float, defaulting to 0
A float scalar addition. The default value is 0.
To check linear options given options, run the following steps:
If options is not an object that implements MLLinearOptions, then return false.
If options.alpha is undefined, set options.alpha to 1. Else if options.alpha is not a numeric type, then return false.
If options.beta is undefined, set options.beta to 0. Else if options.beta is not a numeric type, then return false.
Return true.
7.6.20.1. The linear(input, options) method
Arguments:
input: an MLOperand. The input tensor.
options: an optional MLLinearOptions. The optional parameters of the operation.
Returns: an MLOperand. The output tensor of the same shape as input.
The linear(input, options) method steps are:
Let input be the first argument.
Let options be the second argument.
If running the check linear options steps with options returns false, then throw a "TypeError" DOMException and abort these steps.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
Let output be the result of invoking the copy MLOperand steps given input.
Make a request to the underlying platform to:
Let opImpl be an implementation-defined platform operator for the linear operation, given options.
Store a reference of opImpl in output.[[operator]].
Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
Store a reference to outputImpl in output.[[operand]].
Connect input.[[operand]] as input to opImpl.
Connect output.[[operand]] as output to opImpl.
Return output.
7.6.20.2. The linear(options) method
Arguments:
options: an optional MLLinearOptions. The optional parameters of the operation.
Returns: an MLActivation. The activation function representing the linear operation.
The linear(options) method steps are:
Let options be the first argument.
If running the check linear options steps with options returns false, then throw a "TypeError" DOMException and abort these steps.
Let op be the result of invoking the create MLActivation steps with "linear" and options.
If that throws an error, re-throw the error and abort these steps.
Return op.
7.6.21. The lstm() method
Long Short-Term Memory [LSTM] recurrent network uses an input, output, forget, and cell gate to compute the output state that rolls into the output across the temporal sequence of the network.
enum MLLstmWeightLayout {
  "iofg", // input-output-forget-cell gate ordering
  "ifgo"  // input-forget-cell-output gate ordering
};

dictionary MLLstmOptions {
  MLOperand bias;
  MLOperand recurrentBias;
  MLOperand peepholeWeight;
  MLOperand initialHiddenState;
  MLOperand initialCellState;
  boolean returnSequence = false;
  MLRecurrentNetworkDirection direction = "forward";
  MLLstmWeightLayout layout = "iofg";
  sequence<MLActivation> activations;
};

partial interface MLGraphBuilder {
  sequence<MLOperand> lstm(MLOperand input, MLOperand weight, MLOperand recurrentWeight, unsigned long steps, unsigned long hiddenSize, optional MLLstmOptions options = {});
};
MLLstmOptions has the following members:
bias, of type MLOperand
An MLOperand. Specifies the 2-D input bias tensor of shape [num_directions, 4 * hidden_size]. The ordering of the bias vectors in the second dimension of the tensor shape is specified according to layout.
recurrentBias, of type MLOperand
An MLOperand. Specifies the 2-D recurrent bias tensor of shape [num_directions, 4 * hidden_size]. The ordering of the bias vectors in the second dimension of the tensor shape is specified according to layout.
peepholeWeight, of type MLOperand
An MLOperand. Specifies the 2-D weight tensor for peepholes of shape [num_directions, 3 * hidden_size]. The pack ordering of the weight vectors is for the input (i), output (o), and forget (f) gate, respectively.
initialHiddenState, of type MLOperand
An MLOperand. Specifies the 3-D initial hidden state tensor of shape [num_directions, batch_size, hidden_size]. When not specified, implementations SHOULD use a tensor filled with zero.
initialCellState, of type MLOperand
An MLOperand. Specifies the 3-D initial cell state tensor of shape [num_directions, batch_size, hidden_size]. When not specified, implementations SHOULD use a tensor filled with zero.
returnSequence, of type boolean, defaulting to false
A boolean indicating whether to also return the entire sequence with every output from each time step in it in addition to the output of the last time step.
direction, of type MLRecurrentNetworkDirection, defaulting to "forward"
An MLRecurrentNetworkDirection. Specifies the processing direction of the input sequence. When set to "both", the size of the first dimension of the weight and the bias tensor shapes must be 2, and the input is processed in both directions.
layout, of type MLLstmWeightLayout, defaulting to "iofg"
An MLLstmWeightLayout. The ordering of the weight and bias vectors for the internal gates of LSTM, specifically the input (i), output (o), forget (f), and cell (g) gate, as indicated in the second dimension of the weight and bias tensor shapes. When not specified, the default layout is "iofg".
activations, of type sequence<MLActivation>
A sequence of MLActivation. A sequence of three activation functions, the first one is used for the input (i), forget (f), and output (o) gate, the second one is used for the cell (g) gate, and the last used for filtering the output cell state before combining it with the result of the output gate to form the output hidden state. When not specified, implementations SHOULD use the sequence of the sigmoid function ("sigmoid") followed by two hyperbolic tangent functions ("tanh") respectively.
Arguments:
input: an MLOperand. The input 3-D tensor of shape [steps, batch_size, input_size].
weight: an MLOperand. The 3-D input weight tensor of shape [num_directions, 4 * hidden_size, input_size]. The ordering of the weight vectors in the second dimension of the tensor shape is specified according to the options.layout argument.
recurrentWeight: an MLOperand. The 3-D recurrent weight tensor of shape [num_directions, 4 * hidden_size, hidden_size]. The ordering of the weight vectors in the second dimension of the tensor shape is specified according to the options.layout argument.
steps: an unsigned long scalar. The number of time steps in the recurrent network. The value must be greater than 0.
hiddenSize: an unsigned long scalar. The value of the third dimension of the cell output tensor shape. It indicates the number of features in the hidden state.
options: an optional MLLstmOptions. The optional parameters of the operation.
Returns: a sequence of MLOperand. The first element of the sequence is a 3-D tensor of shape [num_directions, batch_size, hidden_size], the output hidden state from the last time step of the network. The second element is a 3-D tensor of shape [num_directions, batch_size, hidden_size], the output cell state from the last time step of the network. Additionally, if options.returnSequence is set to true, the third element is the 4-D output tensor of shape [steps, num_directions, batch_size, hidden_size] containing every output from each time step in the temporal sequence.
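The output shapes described above follow mechanically from the arguments. The sketch below computes them in plain JavaScript; lstmOutputShapes is a hypothetical helper introduced only for illustration, and it assumes that num_directions is 2 for "both" and 1 otherwise.

```javascript
// Hypothetical helper: expected output shapes of lstm() for given arguments.
function lstmOutputShapes(steps, batchSize, hiddenSize, options = {}) {
  const { direction = "forward", returnSequence = false } = options;
  const numDirections = direction === "both" ? 2 : 1;
  // Hidden state and cell state from the last time step share one shape.
  const shapes = [
    [numDirections, batchSize, hiddenSize],
    [numDirections, batchSize, hiddenSize],
  ];
  // The full sequence is only returned on request.
  if (returnSequence) {
    shapes.push([steps, numDirections, batchSize, hiddenSize]);
  }
  return shapes;
}
```

For example, five steps with a batch of two and a hidden size of eight give two tensors of shape [1, 2, 8] by default, plus a [5, 2, 2, 8] tensor when direction is "both" and returnSequence is true.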
The lstm(input, weight, recurrentWeight, steps, hiddenSize, options) steps are:
If options is undefined, let options be an empty object.
If options.direction is undefined, set it to "forward".
If options.direction is not one of MLRecurrentNetworkDirection, then throw a "TypeError" DOMException and stop.
Let num_directions be 2 if options.direction is "both", or otherwise let it be 1.
If input, weight or recurrentWeight is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
The shape of input, weight or recurrentWeight could be also checked here.
If input.[[descriptor]].dimensions[0] is not equal to steps, then throw a "DataError" DOMException and stop.
Let batch_size be input.[[descriptor]].dimensions[1].
If options.bias exists:
If it is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If its rank is not 2, then throw a "DataError" DOMException and stop.
If options.bias.[[descriptor]].dimensions[0] is not num_directions, then throw a "DataError" DOMException and stop.
If options.bias.[[descriptor]].dimensions[1] is not 4 * hiddenSize, then throw a "DataError" DOMException and stop.
If options.recurrentBias exists:
If it is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If its rank is not 2, then throw a "DataError" DOMException and stop.
If options.recurrentBias.[[descriptor]].dimensions[0] is not num_directions, then throw a "DataError" DOMException and stop.
If options.recurrentBias.[[descriptor]].dimensions[1] is not 4 * hiddenSize, then throw a "DataError" DOMException and stop.
If options.peepholeWeight exists:
If it is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If its rank is not 2, then throw a "DataError" DOMException and stop.
If options.peepholeWeight.[[descriptor]].dimensions[0] is not num_directions, then throw a "DataError" DOMException and stop.
If options.peepholeWeight.[[descriptor]].dimensions[1] is not 3 * hiddenSize, then throw a "DataError" DOMException and stop.
If options.initialHiddenState exists:
If it is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If its rank is not 3, then throw a "DataError" DOMException and stop.
If options.initialHiddenState.[[descriptor]].dimensions[0] is not num_directions, then throw a "DataError" DOMException and stop.
If options.initialHiddenState.[[descriptor]].dimensions[1] is not equal to batch_size, then throw a "DataError" DOMException and stop.
If options.initialHiddenState.[[descriptor]].dimensions[2] is not hiddenSize, then throw a "DataError" DOMException and stop.
If options.initialCellState exists:
If it is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If its rank is not 3, then throw a "DataError" DOMException and stop.
If options.initialCellState.[[descriptor]].dimensions[0] is not num_directions, then throw a "DataError" DOMException and stop.
If options.initialCellState.[[descriptor]].dimensions[1] is not equal to batch_size, then throw a "DataError" DOMException and stop.
If options.initialCellState.[[descriptor]].dimensions[2] is not hiddenSize, then throw a "DataError" DOMException and stop.
If options.returnSequence is undefined, set it to false.
If options.layout is undefined, set it to "iofg".
If options.layout is not one of MLLstmWeightLayout, then throw a "TypeError" DOMException and stop.
If options.activations exists:
If it is not an array of size 3, then throw a "TypeError" DOMException and stop.
If any of its elements is not an instance of MLActivation, then throw a "TypeError" DOMException and stop.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
Let desc be a new MLOperandDescriptor.
Set desc.dimensions to [num_directions, batch_size, hiddenSize].
Set desc.type to input.[[descriptor]].type.
Let output0 be the result of invoking the create MLOperand steps given this and desc.
Let output1 be the result of invoking the create MLOperand steps given this and desc.
Set desc.dimensions to [steps, num_directions, batch_size, hiddenSize].
Let output2 be the result of invoking the create MLOperand steps given this and desc.
Let output be the array [output0, output1, output2].
Make a request to the underlying platform to:
Let opImpl be an implementation-defined platform operator for the LSTM operation, given weight, recurrentWeight, steps, hiddenSize and options.
Store a reference of opImpl in output0.[[operator]], output1.[[operator]] and output2.[[operator]].
Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
Store a reference to outputImpl in output0.[[operand]], output1.[[operand]] and output2.[[operand]].
Connect input.[[operand]] as input to opImpl.
Connect output as output to opImpl.
Return output.
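The bias and initial-state shape checks in the steps above can be sketched in plain JavaScript. This is an illustration under stated assumptions, not the user agent algorithm: checkLstmOptionShapes is a hypothetical helper, operands are modelled as plain objects with a dimensions array instead of MLOperand, a generic Error named "DataError" stands in for the DOMException, and the peepholeWeight and type checks are left out.

```javascript
// Hypothetical stand-in for the "DataError" DOMException used by the steps.
function dataError(message) {
  const error = new Error(message);
  error.name = "DataError";
  return error;
}

// Checks the shapes of the optional bias and initial-state tensors.
function checkLstmOptionShapes(input, steps, hiddenSize, options = {}) {
  const numDirections = options.direction === "both" ? 2 : 1;
  if (input.dimensions[0] !== steps) throw dataError("input steps mismatch");
  const batchSize = input.dimensions[1];
  const expected = {
    bias: [numDirections, 4 * hiddenSize],
    recurrentBias: [numDirections, 4 * hiddenSize],
    initialHiddenState: [numDirections, batchSize, hiddenSize],
    initialCellState: [numDirections, batchSize, hiddenSize],
  };
  for (const [name, shape] of Object.entries(expected)) {
    const operand = options[name];
    if (operand === undefined) continue; // optional member not supplied
    const dims = operand.dimensions;
    if (dims.length !== shape.length || dims.some((d, i) => d !== shape[i])) {
      throw dataError(`${name} must have shape [${shape.join(", ")}]`);
    }
  }
  return { numDirections, batchSize };
}
```

Collecting the expected shapes in one table keeps the rank and per-dimension checks in a single loop, which is one way to avoid repeating the same comparisons for every optional member.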
The behavior of this operation can be generically emulated from the usage of other operations as follows. However, user agents typically have a more efficient implementation for it, therefore its usage is encouraged from the performance standpoint.
const numDirections = (options.direction == "both" ? 2 : 1);
let hiddenState = options.initialHiddenState;
let cellState = options.initialCellState;

if (!hiddenState) {
  const desc = { type: 'float32', dimensions: [numDirections, 1, hiddenSize] };
  const totalSize = numDirections * hiddenSize;
  hiddenState = builder.constant(desc, new Float32Array(totalSize).fill(0));
}

if (!cellState) {
  const desc = { type: 'float32', dimensions: [numDirections, 1, hiddenSize] };
  const totalSize = numDirections * hiddenSize;
  cellState = builder.constant(desc, new Float32Array(totalSize).fill(0));
}

let sequence = null;
let currentWeight = [];
let currentRecurrentWeight = [];
let currentBias = [];
let currentRecurrentBias = [];
let currentPeepholeWeight = [];

for (let dir = 0; dir < numDirections; ++dir) {
  currentWeight.push(builder.squeeze(builder.slice(weight, [dir, 0, 0], [1, 4 * hidden_size, input_size]), { axes: [0] }));
  currentRecurrentWeight.push(builder.squeeze(builder.slice(recurrentWeight, [dir, 0, 0], [1, 4 * hidden_size, hidden_size]), { axes: [0] }));
  currentBias.push(options.bias ? (builder.squeeze(builder.slice(options.bias, [dir, 0], [1, 4 * hidden_size]), { axes: [0] })) : null);
  currentRecurrentBias.push(options.recurrentBias ? (builder.squeeze(builder.slice(options.recurrentBias, [dir, 0], [1, 4 * hidden_size]), { axes: [0] })) : null);
  currentPeepholeWeight.push(options.peepholeWeight ? (builder.squeeze(builder.slice(options.peepholeWeight, [dir, 0], [1, 3 * hidden_size]), { axes: [0] })) : null);
}

for (let step = 0; step < steps; ++step) {
  let currentHidden = [];
  let currentCell = [];
  let nextHidden = null;
  let nextCell = null;

  for (let dir = 0; dir < numDirections; ++dir) {
    currentHidden.push(builder.squeeze(builder.slice(hiddenState, [dir, 0, 0], [1, batch_size, hidden_size]), { axes: [0] }));
    currentCell.push(builder.squeeze(builder.slice(cellState, [dir, 0, 0], [1, batch_size, hidden_size]), { axes: [0] }));
  }

  for (let dir = 0; dir < numDirections; ++dir) {
    let slice = (dir == 1 || options.direction == "backward" ? steps - step - 1 : step);
    let currentInput = builder.squeeze(builder.slice(input, [slice, 0, 0], [1, batch_size, input_size]), { axes: [0] });

    let results = builder.lstmCell(
      currentInput, currentWeight[dir], currentRecurrentWeight[dir],
      currentHidden[dir], currentCell[dir], hiddenSize,
      { bias: currentBias[dir], recurrentBias: currentRecurrentBias[dir],
        peepholeWeight: currentPeepholeWeight[dir],
        layout: options.layout, activations: options.activations });

    let output = builder.reshape(results[0], [1, null, hiddenSize]);
    let cell = builder.reshape(results[1], [1, null, hiddenSize]);

    nextHidden = (nextHidden ? builder.concat([nextHidden, output], 0) : output);
    nextCell = (nextCell ? builder.concat([nextCell, cell], 0) : cell);
  }

  hiddenState = nextHidden;
  cellState = nextCell;

  if (options.returnSequence) {
    nextHidden = builder.reshape(nextHidden, [1, numDirections, null, hiddenSize]);
    sequence = (sequence ? builder.concat([sequence, nextHidden], 0) : nextHidden);
  }
}

return (sequence ? [hiddenState, cellState, sequence] : [hiddenState, cellState]);
7.6.22. The lstmCell() method
A single time step of the Long Short-Term Memory [LSTM] recurrent network using a cell state, an input, output, and forget gate to compute the cell state and the hidden state of the next time step that rolls into the output across the temporal sequence of the network.
dictionary MLLstmCellOptions {
  MLOperand bias;
  MLOperand recurrentBias;
  MLOperand peepholeWeight;
  MLLstmWeightLayout layout = "iofg";
  sequence<MLActivation> activations;
};

partial interface MLGraphBuilder {
  sequence<MLOperand> lstmCell(MLOperand input, MLOperand weight, MLOperand recurrentWeight, MLOperand hiddenState, MLOperand cellState, unsigned long hiddenSize, optional MLLstmCellOptions options = {});
};
MLLstmCellOptions has the following members:
bias, of type MLOperand
An MLOperand. The 1-D input bias tensor of shape [4 * hidden_size]. The ordering of the bias vectors in the first dimension of the tensor shape is specified according to the layout argument.
recurrentBias, of type MLOperand
An MLOperand. The 1-D recurrent bias tensor of shape [4 * hidden_size]. The ordering of the bias vectors in the first dimension of the tensor shape is specified according to the layout argument.
peepholeWeight, of type MLOperand
An MLOperand. The 1-D weight tensor for peepholes of shape [3 * hidden_size]. The pack ordering of the weight vectors is for the input (i), output (o), and forget (f) gate, respectively.
layout, of type MLLstmWeightLayout, defaulting to "iofg"
An MLLstmWeightLayout. The ordering of the weight and bias vectors for the internal gates of LSTM, specifically the input (i), output (o), forget (f), and cell (g) gate, as indicated in the first dimension of the weight and bias tensor shapes. When not specified, the default layout is "iofg".
activations, of type sequence<MLActivation>
A sequence of MLActivation. A sequence of three activation functions, the first one is used for the input (i), forget (f), and output (o) gate, the second one is used for the cell (g) gate, and the last used for filtering the output cell state before combining it with the result of the output gate to form the output hidden state. When not specified, they are assumed to be of the sigmoid function ("sigmoid") followed by two hyperbolic tangent functions ("tanh") respectively.
Arguments:
input: an MLOperand. The input 2-D tensor of shape [batch_size, input_size].
weight: an MLOperand. The 2-D input weight tensor of shape [4 * hidden_size, input_size]. The ordering of the weight vectors in the first dimension of the tensor shape is specified according to the options.layout argument.
recurrentWeight: an MLOperand. The 2-D recurrent weight tensor of shape [4 * hidden_size, hidden_size]. The ordering of the weight vectors in the first dimension of the tensor shape is specified according to the options.layout argument.
hiddenState: an MLOperand. The 2-D input hidden state tensor of shape [batch_size, hidden_size].
cellState: an MLOperand. The 2-D input cell state tensor of shape [batch_size, hidden_size].
hiddenSize: an unsigned long scalar. The value of the second dimension of the output tensor shape. It indicates the number of features in the hidden state.
options: an optional MLLstmCellOptions. The optional parameters of the operation.
Returns: a sequence of MLOperand. The first element of the sequence is the output hidden state of the current time step of the recurrent network. The following element is the output cell state. Both elements are 2-D tensors of shape [batch_size, hidden_size].
The lstmCell(input, weight, recurrentWeight, hiddenState, cellState, hiddenSize, options) steps are:
If input, weight, recurrentWeight, hiddenState or cellState is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If the rank of input, weight, recurrentWeight, hiddenState or cellState is not 2, then throw a "DataError" DOMException and stop.
Let batch_size be input.[[descriptor]].dimensions[0].
If options is undefined, let options be an empty object.
If options.bias exists:
    If it is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
    If its rank is not 1, then throw a "DataError" DOMException and stop.
    If options.bias.[[descriptor]].dimensions[0] is not 4 * hiddenSize, then throw a "DataError" DOMException and stop.
If options.recurrentBias exists:
    If it is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
    If its rank is not 1, then throw a "DataError" DOMException and stop.
    If options.recurrentBias.[[descriptor]].dimensions[0] is not 4 * hiddenSize, then throw a "DataError" DOMException and stop.
If options.peepholeWeight exists:
    If it is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
    If its rank is not 1, then throw a "DataError" DOMException and stop.
    If options.peepholeWeight.[[descriptor]].dimensions[0] is not 3 * hiddenSize, then throw a "DataError" DOMException and stop.
If options.layout is undefined, set it to "iofg".
If options.layout is not one of MLLstmWeightLayout, then throw a "TypeError" DOMException and stop.
If options.activations exists:
    If it is not an array of size 3, then throw a "TypeError" DOMException and stop.
    If any of its elements is not an instance of MLActivation, then throw a "TypeError" DOMException and stop.
Let desc be a new MLOperandDescriptor.
Set desc.dimensions to [batch_size, hiddenSize].
Set desc.type to input.[[descriptor]].type.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
    Let output0 be the result of invoking the create MLOperand steps given this and desc.
    Let output1 be the result of invoking the create MLOperand steps given this and desc.
    Let output be the array [output0, output1].
    Make a request to the underlying platform to:
        Let opImpl be an implementation-defined platform operator for the LSTM cell operation, given weight, recurrentWeight, hiddenState, cellState, hiddenSize and options.
        Store a reference of opImpl in output0.[[operator]] and output1.[[operator]].
        Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
        Store a reference to outputImpl in output0.[[operand]] and output1.[[operand]].
    Connect input.[[operand]] as input to opImpl.
    Connect output as output to opImpl.
Return output.
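The shape checks in the steps above can be sketched outside the API, working on plain shape arrays instead of MLOperand objects. The helper name lstmCellOutputShape and its argument layout are hypothetical and used only for illustration; the sketch covers the rank and length checks and the resulting output shape, not the type checks or the platform request.

```javascript
// Hypothetical helper, not part of the WebNN API: mirrors the rank and
// length checks of the lstmCell steps using plain arrays of dimensions.
function lstmCellOutputShape(shapes, hiddenSize, optionShapes = {}) {
  // The five required operands must all be 2-D.
  const required = ["input", "weight", "recurrentWeight", "hiddenState", "cellState"];
  for (const name of required) {
    if (!Array.isArray(shapes[name]) || shapes[name].length !== 2) {
      throw new Error(`${name} must be a 2-D shape`);
    }
  }
  // Optional 1-D operands and the length each one must have.
  const expected = {
    bias: 4 * hiddenSize,
    recurrentBias: 4 * hiddenSize,
    peepholeWeight: 3 * hiddenSize,
  };
  for (const [name, length] of Object.entries(expected)) {
    const shape = optionShapes[name];
    if (shape === undefined) continue;
    if (shape.length !== 1 || shape[0] !== length) {
      throw new Error(`${name} must have shape [${length}]`);
    }
  }
  // Both outputs (hidden state and cell state) share this shape.
  const batchSize = shapes.input[0];
  return [batchSize, hiddenSize];
}
```

For example, with batch_size 2, input_size 3 and hiddenSize 4, the two returned operands would each have shape [2, 4].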
The behavior of this operation can be generically emulated via other operations as shown below, when the weight layout is the default "iofg" layout, and the activation functions of the input/forget/output gate and the cell gate/the cell state’s filter for the output hidden state are of the operator types sigmoid and tanh respectively.
const zero = builder.constant(0);

// input gate (i)
let i = builder.sigmoid(
  builder.add(
    builder.mul(
      cellState,
      (options.peepholeWeight ? builder.slice(options.peepholeWeight, [0], [hiddenSize]) : zero)
    ),
    builder.add(
      builder.add(
        (options.bias ? builder.slice(options.bias, [0], [hiddenSize]) : zero),
        (options.recurrentBias ? builder.slice(options.recurrentBias, [0], [hiddenSize]) : zero)
      ),
      builder.add(
        builder.matmul(
          input,
          builder.transpose(builder.slice(weight, [0, 0], [hiddenSize, input_size]))
        ),
        builder.matmul(
          hiddenState,
          builder.transpose(builder.slice(recurrentWeight, [0, 0], [hiddenSize, hidden_size]))
        )
      )
    )
  )
);

// forget gate (f)
let f = builder.sigmoid(
  builder.add(
    builder.mul(
      cellState,
      (options.peepholeWeight ? builder.slice(options.peepholeWeight, [2 * hiddenSize], [hiddenSize]) : zero)
    ),
    builder.add(
      builder.add(
        (options.bias ? builder.slice(options.bias, [2 * hiddenSize], [hiddenSize]) : zero),
        (options.recurrentBias ? builder.slice(options.recurrentBias, [2 * hiddenSize], [hiddenSize]) : zero)
      ),
      builder.add(
        builder.matmul(
          input,
          builder.transpose(builder.slice(weight, [2 * hiddenSize, 0], [hiddenSize, input_size]))
        ),
        builder.matmul(
          hiddenState,
          builder.transpose(builder.slice(recurrentWeight, [2 * hiddenSize, 0], [hiddenSize, hidden_size]))
        )
      )
    )
  )
);

// cell gate (g)
let g = builder.tanh(
  builder.add(
    builder.add(
      (options.bias ? builder.slice(options.bias, [3 * hiddenSize], [hiddenSize]) : zero),
      (options.recurrentBias ? builder.slice(options.recurrentBias, [3 * hiddenSize], [hiddenSize]) : zero)
    ),
    builder.add(
      builder.matmul(
        input,
        builder.transpose(builder.slice(weight, [3 * hiddenSize, 0], [hiddenSize, input_size]))
      ),
      builder.matmul(
        hiddenState,
        builder.transpose(builder.slice(recurrentWeight, [3 * hiddenSize, 0], [hiddenSize, hidden_size]))
      )
    )
  )
);

// output gate (o)
let o = builder.sigmoid(
  builder.add(
    builder.mul(
      cellState,
      (options.peepholeWeight ? builder.slice(options.peepholeWeight, [hiddenSize], [hiddenSize]) : zero)
    ),
    builder.add(
      builder.add(
        (options.bias ? builder.slice(options.bias, [hiddenSize], [hiddenSize]) : zero),
        (options.recurrentBias ? builder.slice(options.recurrentBias, [hiddenSize], [hiddenSize]) : zero)
      ),
      builder.add(
        builder.matmul(
          input,
          builder.transpose(builder.slice(weight, [hiddenSize, 0], [hiddenSize, input_size]))
        ),
        builder.matmul(
          hiddenState,
          builder.transpose(builder.slice(recurrentWeight, [hiddenSize, 0], [hiddenSize, hidden_size]))
        )
      )
    )
  )
);

// output cell state (ct)
let ct = builder.add(builder.mul(f, cellState), builder.mul(i, g));

// output hidden state (ht)
let ht = builder.mul(o, builder.tanh(ct));

return [ht, ct];
7.6.23. The matmul() method
Compute the matrix product of two input tensors.
partial interface MLGraphBuilder {
  MLOperand matmul(MLOperand a, MLOperand b);
};
- a: an MLOperand. The first N-dimensional input tensor.
- b: an MLOperand. The second N-dimensional input tensor.
Returns: an MLOperand. The output N-dimensional tensor that contains the matrix product of two input tensors.
- If both a and b are 2-dimensional, they are multiplied like conventional matrices and produce a 2-dimensional tensor as the output.
- If either a or b is N-dimensional where N > 2, it is treated as a stack of matrices with dimensions corresponding to the last two indices. The matrix multiplication will be broadcasted accordingly by following the [numpy-broadcasting-rule]. The output is an N-dimensional tensor whose rank is the maximum rank of the input tensors. For each dimension, except the last two, of the output tensor, its size is the maximum size along that dimension of the input tensors.
- If a is 1-dimensional, it is converted to a 2-dimensional tensor by prepending a 1 to its dimensions.
- If b is 1-dimensional, it is converted to a 2-dimensional tensor by appending a 1 to its dimensions.
- If both a and b are 1-dimensional, the operation is a vector dot-product, which produces a scalar output.
To calculate matmul output sizes, given a and b, run the following steps:
Let shapeA be a.[[descriptor]].dimensions and sizeA the size of shapeA.
Let shapeB be b.[[descriptor]].dimensions and sizeB the size of shapeB.
If sizeA and sizeB are both 1, return [ 1 ].
If sizeA is 1 and sizeB is not, then insert 1 in the front of shapeA to become [ 1 | shapeA ] and let sizeA be 2.
If sizeB is 1 and sizeA is not, then append 1 to the end of shapeB to become [ shapeB | 1 ] and let sizeB be 2.
Let shape be an array whose size size is the maximum of sizeA and sizeB.
For each index between 0 and size:
    Set shape[index] to the maximum of shapeA[index] and shapeB[index].
Return shape.
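As a rough illustration of the shape rules described for matmul(), the following sketch computes an output shape from two plain arrays of dimensions. The function name matmulOutputShape is hypothetical, and the sketch follows the prose rules (1-D promotion, broadcasting of the leading dimensions, and [rows of a, columns of b] for the last two dimensions) rather than transcribing the steps line by line. It assumes that a dimension of 1 added to a 1-dimensional operand is kept in the output, since the text does not say it is removed.

```javascript
// Hypothetical helper, not part of the WebNN API.
function matmulOutputShape(shapeA, shapeB) {
  if (shapeA.length === 0 || shapeB.length === 0) {
    throw new Error("matmul inputs must have at least one dimension");
  }
  // Two vectors: dot product, reported as [1] as in the steps above.
  if (shapeA.length === 1 && shapeB.length === 1) {
    if (shapeA[0] !== shapeB[0]) throw new Error("vector lengths differ");
    return [1];
  }
  // Promote a 1-D operand to 2-D: prepend 1 for a, append 1 for b.
  const a = shapeA.length === 1 ? [1, ...shapeA] : [...shapeA];
  const b = shapeB.length === 1 ? [...shapeB, 1] : [...shapeB];

  const [m, kA] = a.slice(-2);
  const [kB, n] = b.slice(-2);
  if (kA !== kB) throw new Error("inner dimensions do not match");

  // Broadcast the leading (stack) dimensions, aligned from the right.
  const batchA = a.slice(0, -2);
  const batchB = b.slice(0, -2);
  const rank = Math.max(batchA.length, batchB.length);
  const batch = [];
  for (let i = 0; i < rank; i++) {
    const dA = batchA[batchA.length - rank + i] ?? 1;
    const dB = batchB[batchB.length - rank + i] ?? 1;
    if (dA !== dB && dA !== 1 && dB !== 1) {
      throw new Error("leading dimensions are not broadcastable");
    }
    batch.push(Math.max(dA, dB));
  }
  return [...batch, m, n];
}
```

Under these assumptions, a [5, 2, 3] tensor multiplied by a [3, 4] matrix is treated as a stack of five 2x3 matrices and yields shape [5, 2, 4].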
The matmul(a, b) steps are:
If a or b is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
Let desc be a new MLOperandDescriptor.
Set desc.dimensions to the result of invoking the calculate matmul output sizes given a and b.
Set desc.type to a.[[descriptor]].type.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
    Let output be the result of invoking the create MLOperand steps given this and desc.
    Make a request to the underlying platform to:
        Let opImpl be an implementation-defined platform operator for the matrix multiplication operation.
        Store a reference of opImpl in output.[[operator]].
        Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
        Store a reference to outputImpl in output.[[operand]].
    Connect a.[[operand]] and b.[[operand]] as inputs to opImpl.
    Connect output.[[operand]] as output to opImpl.
Return output.
7.6.24. The pad() method
Inflate the tensor with constant or mirrored values on the edges.
enum MLPaddingMode {
  "constant",
  "edge",
  "reflection",
  "symmetric"
};

dictionary MLPadOptions {
  MLPaddingMode mode = "constant";
  float value = 0;
};

partial interface MLGraphBuilder {
  MLOperand pad(MLOperand input,
                sequence<unsigned long> beginningPadding,
                sequence<unsigned long> endingPadding,
                optional MLPadOptions options = {});
};
MLPadOptions has the following members:
mode, of type MLPaddingMode, defaulting to "constant": An MLPaddingMode string. Specifies the different ways to pad the tensor. The default value is "constant".
value, of type float, defaulting to 0: A float. Specifies the padding value when mode is set to "constant". The default value is 0.
- input: an MLOperand. The input tensor.
- beginningPadding: a sequence of unsigned long. The sequence of unsigned integer values indicating the number of padding values to add at the beginning of each input dimension, of length N where N is the rank of the input tensor. For each dimension d of input, beginningPadding[d] indicates how many values to add before the content in that dimension.
- endingPadding: a sequence of unsigned long. The sequence of unsigned integer values indicating the number of padding values to add at the ending of each input dimension, of length N where N is the rank of the input tensor. For each dimension d of input, endingPadding[d] indicates how many values to add after the content in that dimension.
- options: an optional MLPadOptions. The optional parameters of the operation.
Returns: an MLOperand. The padded output tensor. Each dimension of the output tensor can be calculated as follows:
output size = beginning padding + input size + ending padding
To calculate padding output sizes, given input, beginningPadding and endingPadding, run the following steps:
Let shape be a copy of input.[[descriptor]].dimensions.
For index between 0 and the rank of shape:
    Add to shape[index] the value of beginningPadding[index].
    Add to shape[index] the value of endingPadding[index].
Return shape.
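The steps above amount to a per-dimension sum. A minimal sketch on plain arrays follows; the name paddingOutputSizes is hypothetical, and the length check is an added assumption based on the requirement that both padding sequences have length N.

```javascript
// Hypothetical helper, not part of the WebNN API.
function paddingOutputSizes(inputShape, beginningPadding, endingPadding) {
  const rank = inputShape.length;
  if (beginningPadding.length !== rank || endingPadding.length !== rank) {
    throw new RangeError("padding sequences must match the input rank");
  }
  // output size = beginning padding + input size + ending padding
  return inputShape.map(
    (size, d) => beginningPadding[d] + size + endingPadding[d]
  );
}
```

For a [2, 3] input with beginningPadding [1, 2] and endingPadding [1, 2], this gives [4, 7].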
The pad(input, beginningPadding, endingPadding, options) steps are:
If input is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If beginningPadding or endingPadding is not a sequence of unsigned long, then throw a "TypeError" DOMException and stop.
If options is undefined, let options be an empty object.
If options.mode is undefined, set it to "constant".
Otherwise, if options.mode is not one of MLPaddingMode, then throw a "TypeError" DOMException and stop.
If options.value is undefined, set it to 0.
Let desc be a copy of input.[[descriptor]].
Set desc.dimensions to the result of invoking the calculate padding output sizes given input, beginningPadding and endingPadding.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
    Let output be the result of invoking the create MLOperand steps given this and desc.
    Make a request to the underlying platform to:
        Let opImpl be an implementation-defined platform operator for the padding operation, given beginningPadding, endingPadding and options.
        Store a reference of opImpl in output.[[operator]].
        Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
        Store a reference to outputImpl in output.[[operand]].
    Connect input.[[operand]] as input to opImpl.
    Connect output.[[operand]] as output to opImpl.
Return output.
Examples for constant, edge, reflection and symmetric padding:
// input: [[1,2,3], [4,5,6]]
const input = builder.constant(
  { type: 'float32', dimensions: [2, 3] },
  new Float32Array([1, 2, 3, 4, 5, 6]));

const beginningPadding = [1, 2];
const endingPadding = [1, 2];

// "constant" padded:
//    [[0,0,0,0,0,0,0],
//     [0,0,1,2,3,0,0],
//     [0,0,4,5,6,0,0],
//     [0,0,0,0,0,0,0]]
builder.pad(input, beginningPadding, endingPadding);

// "edge" padded:
//    [[1,1,1,2,3,3,3],
//     [1,1,1,2,3,3,3],
//     [4,4,4,5,6,6,6],
//     [4,4,4,5,6,6,6]]
builder.pad(input, beginningPadding, endingPadding, { mode: "edge" });

// "reflection" padded:
//    [[6,5,4,5,6,5,4],
//     [3,2,1,2,3,2,1],
//     [6,5,4,5,6,5,4],
//     [3,2,1,2,3,2,1]]
builder.pad(input, beginningPadding, endingPadding, { mode: "reflection" });

// "symmetric" padded:
//    [[2,1,1,2,3,3,2],
//     [2,1,1,2,3,3,2],
//     [5,4,4,5,6,6,5],
//     [5,4,4,5,6,6,5]]
builder.pad(input, beginningPadding, endingPadding, { mode: "symmetric" });
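To make the difference between the four modes concrete, the following sketch reproduces the padded results of the examples on nested JavaScript arrays, without using the API. The names sourceIndex and pad2d are hypothetical, and the index mappings are inferred from the example outputs: "edge" clamps to the border, "reflection" mirrors without repeating the border value, and "symmetric" mirrors including it.

```javascript
// Hypothetical helpers, not part of the WebNN API.
// Maps a possibly out-of-range index i onto [0, n) for the given mode.
// Returns -1 when the position should take the constant padding value.
function sourceIndex(i, n, mode) {
  if (i >= 0 && i < n) return i;
  switch (mode) {
    case "edge":
      return Math.min(Math.max(i, 0), n - 1);
    case "reflection": {
      if (n === 1) return 0;
      const period = 2 * (n - 1);
      const m = ((i % period) + period) % period;
      return m < n ? m : period - m;
    }
    case "symmetric": {
      const period = 2 * n;
      const m = ((i % period) + period) % period;
      return m < n ? m : period - 1 - m;
    }
    default: // "constant"
      return -1;
  }
}

// Pads a 2-D nested array; beginningPadding and endingPadding are
// [rows, columns] pairs as in the examples.
function pad2d(input, beginningPadding, endingPadding,
               { mode = "constant", value = 0 } = {}) {
  const rows = input.length;
  const cols = input[0].length;
  const outRows = beginningPadding[0] + rows + endingPadding[0];
  const outCols = beginningPadding[1] + cols + endingPadding[1];
  const output = [];
  for (let r = 0; r < outRows; r++) {
    const sr = sourceIndex(r - beginningPadding[0], rows, mode);
    const row = [];
    for (let c = 0; c < outCols; c++) {
      const sc = sourceIndex(c - beginningPadding[1], cols, mode);
      row.push(sr < 0 || sc < 0 ? value : input[sr][sc]);
    }
    output.push(row);
  }
  return output;
}
```

Calling pad2d([[1, 2, 3], [4, 5, 6]], [1, 2], [1, 2], { mode: "reflection" }) returns the same 4x7 matrix as the "reflection" example.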
7.6.25. Pooling operations
Compute a mean, L2 norm, or max reduction operation across all the elements within the moving window over the input tensor. See the description of each type of reduction in the Reduction operations section.
enum MLRoundingType {
  "floor",
  "ceil"
};

dictionary MLPool2dOptions {
  sequence<unsigned long> windowDimensions;
  sequence<unsigned long> padding;
  sequence<unsigned long> strides;
  sequence<unsigned long> dilations;
  MLAutoPad autoPad = "explicit";
  MLInputOperandLayout layout = "nchw";
  MLRoundingType roundingType = "floor";
  sequence<unsigned long> outputSizes;
};

partial interface MLGraphBuilder {
  MLOperand averagePool2d(MLOperand input, optional MLPool2dOptions options = {});
  MLOperand l2Pool2d(MLOperand input, optional MLPool2dOptions options = {});
  MLOperand maxPool2d(MLOperand input, optional MLPool2dOptions options = {});
};
- input: an MLOperand. The input 4-D tensor. The logical shape is interpreted according to the value of options.layout.
- options: an optional MLPool2dOptions. The optional parameters of the operation.
Returns: an MLOperand. The output 4-D tensor that contains the result of the reduction. The logical shape is interpreted according to the value of layout. More specifically, if the options.roundingType is "floor", the spatial dimensions of the output tensor can be calculated as follows:
output size = floor(1 + (input size - filter size + beginning padding + ending padding) / stride)
or if options.roundingType is "ceil":
output size = ceil(1 + (input size - filter size + beginning padding + ending padding) / stride)
A global pooling operation, such as one for the max pooling operation, is a variant of pooling where the window dimensions are the spatial dimensions (last two dimensions) of the input shape, as follows:
// 'global' max pooling
builder.maxPool2d(input);
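The two output size formulas can be checked with a small numeric sketch. The function name pool2dOutputSize is hypothetical; it evaluates the formula for one spatial dimension exactly as written above, so dilations are not taken into account.

```javascript
// Hypothetical helper, not part of the WebNN API: evaluates the output
// size formula for a single spatial dimension.
function pool2dOutputSize(inputSize, windowSize, beginningPadding,
                          endingPadding, stride, roundingType = "floor") {
  const round = roundingType === "ceil" ? Math.ceil : Math.floor;
  return round(
    1 + (inputSize - windowSize + beginningPadding + endingPadding) / stride
  );
}
```

For an input size of 5, a window of 2, no padding and a stride of 2, the "floor" rounding yields 2 and the "ceil" rounding yields 3. With the window equal to the input size (the global pooling case), the result is 1.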
MLPool2dOptions has the following members:
windowDimensions, of type sequence<unsigned long>: A sequence of unsigned long of length 2: [window_height, window_width]. Specifies the dimensions of the sliding window. The default value for the window dimensions is the height and width dimensions of the input shape.
padding, of type sequence<unsigned long>: A sequence of unsigned long of length 4: [beginning_height, ending_height, beginning_width, ending_width]. Specifies the additional rows and columns added to the beginning and ending of each spatial dimension of the input. The default value is [0,0,0,0].
strides, of type sequence<unsigned long>: A sequence of unsigned long of length 2: [stride_height, stride_width]. Specifies the stride of the sliding window for each spatial dimension of the input. The default value is [1,1].
dilations, of type sequence<unsigned long>: A sequence of unsigned long of length 2: [dilation_height, dilation_width]. Specifies the dilation factor for each spatial dimension of the input. The default value is [1,1].
autoPad, of type MLAutoPad, defaulting to "explicit": An MLAutoPad string. Specifies the automatic input padding options. The default value is "explicit", which means that the values in the padding array should be used for input padding. When the option is set other than "explicit", the values in the padding array are ignored.
With the "same-upper" option, the padding values are automatically computed such that the additional ending padding of the spatial input dimensions would allow all of the input values in the corresponding dimension to be filtered.
The "same-lower" option is similar but padding is applied to the beginning padding of the spatial input dimensions instead of the ending one.
layout, of type MLInputOperandLayout, defaulting to "nchw": An MLInputOperandLayout string. Specifies the layout format of the input and output tensor as follows. The default value is "nchw".
- "nchw":
    - input tensor: [batches, input_channels, height, width]
    - output tensor: [batches, output_channels, height, width]
- "nhwc":
    - input tensor: [batches, height, width, input_channels]
    - output tensor: [batches, height, width, output_channels]
roundingType, of type MLRoundingType, defaulting to "floor": An MLRoundingType string. Specifies the rounding function used to compute the output shape.
outputSizes, of type sequence<unsigned long>: A sequence of unsigned long of length 2. Specifies the sizes of the two spatial dimensions of the output tensor. When the output sizes are explicitly specified, the roundingType is ignored. If not specified, the output sizes are automatically computed.
To create pooling operation given op, input and options, run the following steps:
Assert: op is one of "averagePool2d", "l2Pool2d", "maxPool2d".
If input is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If options is undefined, let options be a new MLPool2dOptions object.
If options.outputSizes exists, or if options.padding is undefined, set options.padding to [0, 0, 0, 0].
If options.strides is undefined, set options.strides to [1, 1].
If options.dilations is undefined, set options.dilations to [1, 1].
If options.autoPad is undefined, set options.autoPad to "explicit".
If options.autoPad is not "explicit", set options.padding to [0, 0, 0, 0].
If options.layout is undefined, set options.layout to "nchw".
If options.roundingType is undefined, set options.roundingType to "floor".
Let desc be a copy of input.[[descriptor]].
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
    Make a request to the underlying platform to:
        Calculate the output dimensions given input and options. Let desc.dimensions be the result of that.
        Let output be the result of invoking the create MLOperand steps given this and desc.
        Let opImpl be an implementation-defined platform operator for the op pooling operation, given options.
        Store a reference of opImpl in output.[[operator]].
        Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
        Store a reference to outputImpl in output.[[operand]].
    Connect input.[[operand]] as input to opImpl.
    Connect output.[[operand]] as output to opImpl.
Return output.
The following pooling algorithms are supported.
The averagePool2d(input, options) steps are:
Let output be the result of running the create pooling operation given "averagePool2d", input and options.
If that throws an error, then re-throw the error and stop.
Return output.
The l2Pool2d(input, options) steps are:
Let output be the result of running the create pooling operation given "l2Pool2d", input and options.
If that throws an error, then re-throw the error and stop.
Return output.
The maxPool2d(input, options) steps are:
Let output be the result of running the create pooling operation given "maxPool2d", input and options.
If that throws an error, then re-throw the error and stop.
Return output.
7.6.26. The prelu() method
Calculate the parametric version of the rectified linear function (Parametric ReLU) on the input tensor, following the expression max(0, x) + slope ∗ min(0, x).
partial interface MLGraphBuilder {
  MLOperand prelu(MLOperand input, MLOperand slope);
};
- input: an MLOperand. The input tensor.
- slope: an MLOperand. The slope tensor. Its shape is either the same as, or unidirectionally broadcastable to, the shape of the input tensor input according to [numpy-broadcasting-rule].
Returns:
- an MLOperand. The output tensor of the same shape as input.
The prelu(input, slope) steps are:
If input or slope is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
Let descriptor be a new MLOperandDescriptor.
Set descriptor.type to input.[[descriptor]].type.
Let descriptor.dimensions be the result of running the broadcast-shapes steps given input.[[descriptor]].dimensions and slope.[[descriptor]].dimensions.
If that throws an error, re-throw the error and stop.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
    Let output be the result of invoking the create MLOperand steps given this and descriptor.
    Make a request to the underlying platform to:
        Let opImpl be an implementation-defined platform operator for the PreLU operation, given slope.
        Store a reference of opImpl in output.[[operator]].
        Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
        Store a reference to outputImpl in output.[[operand]].
    Connect input.[[operand]] as input to opImpl.
    Connect output.[[operand]] as output to opImpl.
Return output.
The behavior of this operation can be generically emulated using other operations as follows. However, user agents typically have a more efficient implementation for it, so its use is encouraged for performance reasons.
return builder.add(
  builder.max(builder.constant(0), input),
  builder.mul(slope, builder.min(builder.constant(0), input)));
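The expression max(0, x) + slope ∗ min(0, x) can also be tried on plain numbers, which may help when checking expected values by hand. The names preluScalar and prelu1d are hypothetical, and the sketch only handles a 1-D input with a slope that either has the same length or a single element (a simple case of unidirectional broadcasting).

```javascript
// Hypothetical helpers, not part of the WebNN API.
function preluScalar(x, slope) {
  return Math.max(0, x) + slope * Math.min(0, x);
}

// slope must have the same length as input, or exactly one element
// that is broadcast across all input values.
function prelu1d(input, slope) {
  if (slope.length !== 1 && slope.length !== input.length) {
    throw new Error("slope is not broadcastable to input");
  }
  return input.map((x, i) =>
    preluScalar(x, slope.length === 1 ? slope[0] : slope[i]));
}
```

With input [-2, 0, 3] and slope [0.25], positive values pass through unchanged and the negative value is scaled to -0.5.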
7.6.27. Reduction operations
Reduce the input tensor along all dimensions, or along the axes specified in the axes array parameter. For each specified axis, the dimension with that index is reduced, i.e. the resulting tensor will not contain it, unless the keepDimensions option is specified. The values of the resulting tensor are calculated using the specified reduction function that takes as parameters all the values across the reduced dimension.
dictionary MLReduceOptions {
  sequence<unsigned long> axes = null;
  boolean keepDimensions = false;
};

partial interface MLGraphBuilder {
  MLOperand reduceL1(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceL2(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceLogSum(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceLogSumExp(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceMax(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceMean(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceMin(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceProduct(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceSum(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceSumSquare(MLOperand input, optional MLReduceOptions options = {});
};
- input: an MLOperand. The input tensor.
- options: an optional MLReduceOptions. The optional parameters of the operation.
    - axes: a sequence of unsigned long. The dimensions to reduce. The values in the sequence must be in the range [0, N-1] where N is the rank of the input tensor. If not present, all dimensions are reduced.
    - keepDimensions: a boolean. If true, retains reduced dimensions with size of 1. The default value is false.
Returns: an MLOperand. The reduced output tensor.
- L1: Compute the L1 norm of all the input values along the axes.
- L2: Compute the L2 norm of all the input values along the axes.
- LogSum: Compute the log value of the sum of all the input values along the axes.
- LogSumExp: Compute the log value of the sum of the exponent of all the input values along the axes.
- Max: Compute the maximum value of all the input values along the axes.
- Mean: Compute the average value of all the input values along the axes.
- Min: Compute the minimum value of all the input values along the axes.
- Product: Compute the product of all the input values along the axes.
- Sum: Compute the sum of all the input values along the axes.
- SumSquare: Compute the sum of the square of all the input values along the axes.
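The following sketch illustrates, on plain arrays, how the axes and keepDimensions options shape the result and what each reduction function computes over the values of a reduced dimension. The names reducedShape and reducers are hypothetical; the reducer bodies follow the one-line descriptions in the list above, reading "exponent" in LogSumExp as the exponential function.

```javascript
// Hypothetical helpers, not part of the WebNN API.
function reducedShape(inputShape, { axes = null, keepDimensions = false } = {}) {
  const rank = inputShape.length;
  // With no axes given, every dimension is reduced.
  const reduced = axes === null ? inputShape.map((_, d) => d) : axes;
  for (const axis of reduced) {
    if (!Number.isInteger(axis) || axis < 0 || axis >= rank) {
      throw new RangeError(`axis ${axis} is outside [0, ${rank - 1}]`);
    }
  }
  const out = [];
  inputShape.forEach((size, d) => {
    if (!reduced.includes(d)) out.push(size); // untouched dimension
    else if (keepDimensions) out.push(1);     // reduced but retained
  });
  return out;
}

// Each reducer takes the list of values across the reduced dimension(s).
const sum = (v) => v.reduce((s, x) => s + x, 0);
const reducers = {
  L1: (v) => sum(v.map(Math.abs)),
  L2: (v) => Math.sqrt(sum(v.map((x) => x * x))),
  LogSum: (v) => Math.log(sum(v)),
  LogSumExp: (v) => Math.log(sum(v.map(Math.exp))),
  Max: (v) => Math.max(...v),
  Mean: (v) => sum(v) / v.length,
  Min: (v) => Math.min(...v),
  Product: (v) => v.reduce((p, x) => p * x, 1),
  Sum: sum,
  SumSquare: (v) => sum(v.map((x) => x * x)),
};
```

For a [2, 3, 4] input reduced along axes [1], the result has shape [2, 4], or [2, 1, 4] when keepDimensions is true.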
To create reduce operation given op, input and options, run the following steps:
Assert: op is one of "reduceL1", "reduceL2", "reduceLogSum", "reduceLogSumExp", "reduceMax", "reduceMean", "reduceMin", "reduceProduct", "reduceSum", "reduceSumSquare".
If input is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If options is undefined, let options be a new MLReduceOptions object with options.keepDimensions set to false and options.axes set to null.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
    Let output be the result of invoking the copy MLOperand steps given input.
    Make a request to the underlying platform to:
        Let opImpl be an implementation-defined platform operator for the op reduce operation, given options.
        Store a reference of opImpl in output.[[operator]].
        Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
        Store a reference to outputImpl in output.[[operand]].
    Connect input.[[operand]] as input to opImpl.
    Connect output.[[operand]] as output to opImpl.
Return output.
The following reduce algorithms are supported.
The reduceL1(input, options) steps are:
Let output be the result of running the create reduce operation given "reduceL1", input and options.
If that throws an error, then re-throw the error and stop.
Return output.
The reduceL2(input, options) steps are:
Let output be the result of running the create reduce operation given "reduceL2", input and options.
If that throws an error, then re-throw the error and stop.
Return output.
The reduceLogSum(input, options) steps are:
Let output be the result of running the create reduce operation given "reduceLogSum", input and options.
If that throws an error, then re-throw the error and stop.
Return output.
The reduceLogSumExp(input, options) steps are:
Let output be the result of running the create reduce operation given "reduceLogSumExp", input and options.
If that throws an error, then re-throw the error and stop.
Return output.
The reduceMax(input, options) steps are:
Let output be the result of running the create reduce operation given "reduceMax", input and options.
If that throws an error, then re-throw the error and stop.
Return output.
The reduceMean(input, options) steps are:
Let output be the result of running the create reduce operation given "reduceMean", input and options.
If that throws an error, then re-throw the error and stop.
Return output.
The reduceMin(input, options) steps are:
Let output be the result of running the create reduce operation given "reduceMin", input and options.
If that throws an error, then re-throw the error and stop.
Return output.
The reduceProduct(input, options) steps are:
Let output be the result of running the create reduce operation given "reduceProduct", input and options.
If that throws an error, then re-throw the error and stop.
Return output.
The reduceSum(input, options) steps are:
Let output be the result of running the create reduce operation given "reduceSum", input and options.
If that throws an error, then re-throw the error and stop.
Return output.
The reduceSumSquare(input, options) steps are:
Let output be the result of running the create reduce operation given "reduceSumSquare", input and options.
If that throws an error, then re-throw the error and stop.
Return output.
7.6.28. The relu() method
Compute the rectified linear function of the input tensor.
partial interface MLGraphBuilder {
  MLOperand relu(MLOperand input);
  MLActivation relu();
};
The behavior of this operation can be generically emulated using other operations as follows. However, user agents typically have a more efficient implementation for it, so its use is encouraged for performance reasons.
return builder.max(builder.constant(0), input);
7.6.28.1. The relu(input) method
- input: an MLOperand. The input tensor.
Returns:
- an MLOperand. The output tensor of the same shape as input.
The relu(input) steps are:
If input is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
    Let output be the result of invoking the copy MLOperand steps given input.
    Make a request to the underlying platform to:
        Let opImpl be an implementation-defined platform operator for the ReLU operation.
        Store a reference of opImpl in output.[[operator]].
        Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
        Store a reference to outputImpl in output.[[operand]].
    Connect input.[[operand]] as input to opImpl.
    Connect output.[[operand]] as output to opImpl.
Return output.
7.6.28.2. The relu() method
Arguments: None.
Returns: an MLActivation. The activation function representing the relu operation.
The relu() method steps are:
Let op be the result of invoking the create MLActivation steps with "relu".
If that throws an error, re-throw the error and abort these steps.
Return op.
7.6.29. The resample2d() method
Resample the tensor values from the source to the destination spatial dimensions according to the scaling factors.
enum MLInterpolationMode { "nearest-neighbor", "linear" };
dictionary MLResample2dOptions { MLInterpolationMode mode = "nearest-neighbor"; sequence<float> scales; sequence<unsigned long> sizes; sequence<unsigned long> axes; };
partial interface MLGraphBuilder { MLOperand resample2d(MLOperand input, optional MLResample2dOptions options = {}); };
Arguments:
- input: an MLOperand. The input 4-D tensor.
- options: an optional MLResample2dOptions. The optional parameters of the operation.
Returns:
- an MLOperand. The output 4-D tensor.
MLResample2dOptions has the following members:
- mode, of type MLInterpolationMode, defaulting to "nearest-neighbor": An MLInterpolationMode string. Specifies the interpolation algorithm used to fill the output tensor values. The default value is "nearest-neighbor", standing for Nearest Neighbor interpolation.
- scales, of type sequence<float>: A sequence of float of length 2. Specifies the scaling factor in each spatial dimension of the input: [scale_height, scale_width]. The default value is [1.0, 1.0].
- sizes, of type sequence<unsigned long>: A sequence of unsigned long of length 2. Specifies the target sizes for each spatial dimension of the input: [size_height, size_width]. When the target sizes are specified, the scales argument is ignored, since the scaling factor values are derived from the target sizes of each spatial dimension of the input.
- axes, of type sequence<unsigned long>: A sequence of unsigned long of length 2. Specifies the two consecutive dimensions of the input tensor to which the interpolation algorithm applies. The valid values in the sequence are [0, 1], [1, 2] or [2, 3]. The default value is [2, 3].
To check resample options given options, run the following steps:
If options is undefined, let options be a new MLResample2dOptions object.
If options.mode exists: if its value is not one of "nearest-neighbor" or "linear", return null.
Otherwise, set options.mode to "nearest-neighbor".
If options.scales exists: if its size is not 2, or if any of its values is not greater than 0, return null.
Otherwise, set options.scales to [1.0, 1.0].
If options.sizes exists: if its size is not 2, or if any of its values is not greater than 0, return null.
If options.axes exists: if its value is not one of [0, 1], [1, 2], [2, 3], return null.
Otherwise, set options.axes to [2, 3].
Return options.
To resample output sizes given input and options, run the following steps:
Let desc be an MLOperandDescriptor initialized to input.[[descriptor]].
If options.sizes exists, then set desc.[[descriptor]].dimensions to options.sizes and return desc.
For index between 0 and the rank of desc.[[descriptor]].dimensions:
Let inputSize be the size of input.[[descriptor]].dimensions[index].
Let outputSize be inputSize multiplied by options.scales.
If that fails or outputSize is not a positive number, then throw a "DataError" DOMException and stop.
Set desc.dimensions[index] to outputSize.
Return desc.
The resample2d(input, options) steps are:
Check if the input is a 4-dimensional tensor: if the size of input.[[descriptor]].dimensions is not 4, throw a "DataError" DOMException and stop.
Let options be the result of running the check resample options steps given options.
If that returns null, then throw a "DataError" DOMException and stop.
Let desc be the result of running the resample output sizes steps given options.
If that throws an error, re-throw the error and stop.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
Let output be the result of invoking the create MLOperand steps given this and desc.
Make a request to the underlying platform to:
Let opImpl be an implementation-defined platform operator for the resample 2D operation, given options.
Store a reference of opImpl in output.[[operator]].
Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
Store a reference to outputImpl in output.[[operand]].
Connect input.[[operand]] as input to opImpl.
Connect output.[[operand]] as output to opImpl.
Return output.
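The option checks and the output-size computation above can be sketched in plain JavaScript, operating on shape arrays rather than on MLOperand objects. This is only an illustration: the helper name resampleOutputShape is hypothetical, the sketch assumes that scales and sizes apply only to the two dimensions named by axes, and it assumes scaled sizes are rounded down, which the steps above do not state.

```javascript
// Hypothetical helper, not part of the WebNN API: computes the output shape
// that resample2d() would be expected to produce for a 4-D input shape.
function resampleOutputShape(inputShape, options = {}) {
  if (inputShape.length !== 4) {
    throw new RangeError('resample2d expects a 4-D input');
  }
  const mode = options.mode ?? 'nearest-neighbor';
  if (mode !== 'nearest-neighbor' && mode !== 'linear') {
    throw new RangeError('unsupported interpolation mode: ' + mode);
  }
  const axes = options.axes ?? [2, 3];
  const validAxes = [[0, 1], [1, 2], [2, 3]];
  const axesOk = axes.length === 2 &&
      validAxes.some(([a, b]) => axes[0] === a && axes[1] === b);
  if (!axesOk) {
    throw new RangeError('axes must be [0, 1], [1, 2] or [2, 3]');
  }
  const scales = options.scales ?? [1.0, 1.0];
  if (scales.length !== 2 || scales.some((s) => !(s > 0))) {
    throw new RangeError('scales must be two values greater than 0');
  }
  const outputShape = [...inputShape];
  if (options.sizes !== undefined) {
    const sizes = options.sizes;
    if (sizes.length !== 2 || sizes.some((s) => !(s > 0))) {
      throw new RangeError('sizes must be two values greater than 0');
    }
    // When sizes is given, scales is ignored.
    axes.forEach((axis, i) => { outputShape[axis] = sizes[i]; });
    return outputShape;
  }
  axes.forEach((axis, i) => {
    // Assumption: fractional results are rounded down.
    const size = Math.floor(inputShape[axis] * scales[i]);
    if (!(size > 0)) {
      throw new RangeError('scaled size must be a positive number');
    }
    outputShape[axis] = size;
  });
  return outputShape;
}
```

For example, resampleOutputShape([1, 3, 4, 4], { scales: [2, 2] }) yields [1, 3, 8, 8] under these assumptions.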
7.6.30. The reshape() method
Alter the shape of a tensor to a new shape. Reshape does not copy or change the content of the tensor. It just changes the tensor’s logical dimensions for the subsequent operations.
partial interface MLGraphBuilder { MLOperand reshape(MLOperand input, sequence<unsigned long?> newShape); };
Arguments:
- input: an MLOperand. The input tensor.
- newShape: a sequence of nullable unsigned long. The shape of the output tensor. The number of elements implied by newShape must be the same as the number of elements in the input tensor. Only one component of newShape can be the special value of null. The size of the dimension with the value null is computed so that the total size remains constant.
Returns:
- an MLOperand. The output tensor. The values of the output tensor are the same as the values of the input tensor. The shape of the output tensor is specified by the newShape argument.
The reshape(input, newShape) steps are:
If input is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
Let outputShape be an empty array of unsigned long.
If newShape is a scalar number, set outputShape to [ 1 ].
Otherwise, if newShape is an array of unsigned long:
If the size of newShape is 0, set outputShape to [ 1 ] (reshaping to scalar).
If newShape contains more than one null value, then throw a "DataError" DOMException and stop.
If any value in newShape is 0, then throw a "DataError" DOMException and stop.
Let inputElementCount be the product of all elements in input.[[descriptor]].dimensions.
If newShape contains a null value, set that value to inputElementCount divided by the product of all other values in newShape.
If that value is too large for unsigned long, then throw a "DataError" DOMException and stop.
If the product of all values in newShape is not equal to inputElementCount, then throw a "DataError" DOMException and stop.
Let desc be a copy of input.[[descriptor]].
Set desc.dimensions to newShape.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
Let output be the result of invoking the create MLOperand steps given this and desc.
Make a request to the underlying platform to:
Let opImpl be an implementation-defined platform operator for the reshape operation.
Store a reference of opImpl in output.[[operator]].
Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
Store a reference to outputImpl in output.[[operand]].
Connect input.[[operand]] as input to opImpl.
Connect output.[[operand]] as output to opImpl.
Return output.
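The handling of a null component in newShape can be illustrated with a small plain-JavaScript sketch that works on shape arrays only. The helper name resolveNewShape is hypothetical, and rejecting a non-integer inferred dimension is an assumption that follows from the requirement that element counts match.

```javascript
// Hypothetical helper, not part of the WebNN API: resolves a newShape that
// may contain a single null entry against the shape of the input tensor.
function resolveNewShape(inputShape, newShape) {
  const product = (values) => values.reduce((a, b) => a * b, 1);
  const inputElementCount = product(inputShape);
  // An empty newShape is treated as reshaping to a scalar, i.e. [1].
  const target = newShape.length === 0 ? [1] : newShape;
  if (target.filter((d) => d === null).length > 1) {
    throw new RangeError('newShape may contain at most one null');
  }
  if (target.some((d) => d === 0)) {
    throw new RangeError('newShape must not contain 0');
  }
  const knownProduct = product(target.filter((d) => d !== null));
  const resolved = target.map(
      (d) => (d === null ? inputElementCount / knownProduct : d));
  // Assumption: an inferred dimension that is not an integer is an error.
  if (!resolved.every(Number.isInteger)) {
    throw new RangeError('cannot infer an integer size for the null dimension');
  }
  if (product(resolved) !== inputElementCount) {
    throw new RangeError('newShape does not match the input element count');
  }
  return resolved;
}
```

For instance, a [2, 3, 4] input reshaped with [4, null] resolves to [4, 6], since 24 / 4 = 6.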
7.6.31. The sigmoid() method
Compute the sigmoid function of the input tensor. The calculation follows the expression 1 / (exp(-x) + 1).
partial interface MLGraphBuilder { MLOperand sigmoid(MLOperand input); MLActivation sigmoid(); };
The behavior of this operation can be generically emulated using other operations, as follows. However, user agents typically have a more efficient implementation for it, so its use is encouraged for performance reasons.
return builder.div(builder.constant(1), builder.add(builder.exp(builder.neg(x)), builder.constant(1)));
7.6.31.1. The sigmoid(input) method
Arguments:
- input: an MLOperand. The input tensor.
Returns:
- an MLOperand. The output tensor of the same shape as input.
The sigmoid(input) steps are:
If input is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
Let output be the result of invoking the copy MLOperand steps given input.
Make a request to the underlying platform to:
Let opImpl be an implementation-defined platform operator for the sigmoid operation.
Store a reference of opImpl in output.[[operator]].
Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
Store a reference to outputImpl in output.[[operand]].
Connect input.[[operand]] as input to opImpl.
Connect output.[[operand]] as output to opImpl.
Return output.
7.6.31.2. The sigmoid() method
Arguments:
- None.
Returns:
- an MLActivation. The activation function representing the sigmoid operation.
The sigmoid() method steps are:
Let op be the result of invoking the create MLActivation steps with "sigmoid".
If that throws an error, re-throw the error and abort these steps.
Return op.
7.6.32. The slice() method
Produce a slice of the input tensor.
partial interface MLGraphBuilder { MLOperand slice(MLOperand input, sequence<unsigned long> starts, sequence<unsigned long> sizes); };
Arguments:
- input: an MLOperand. The input tensor.
- starts: a sequence of unsigned long. The sequence of unsigned integer values indicating the starting index to slice of each input dimension, of length N where N is the rank of the input tensor. For each dimension d of input, starts[d] indicates the starting index to slice in that dimension. The starting index must be in the range [0, input size - 1] in that dimension.
- sizes: a sequence of unsigned long. The sequence of unsigned integer values indicating the number of elements to slice of each input dimension, of length N where N is the rank of the input tensor. For each dimension d of input, sizes[d] indicates the number of elements to slice in that dimension. The size must not be 0 and must satisfy the constraint starting index + size <= input size in that dimension.
Returns:
- an MLOperand. The output tensor of the same rank as the input tensor with tensor values stripped to the specified starting and ending indices in each dimension.
The slice(input, starts, sizes) steps are:
If input is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If starts or sizes is not a sequence of unsigned long, then throw a "TypeError" DOMException and stop.
If sizes.size is 0, then throw a "TypeError" DOMException and stop.
Further validation of starts and sizes given input is left implementation-defined.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
Let output be the result of invoking the copy MLOperand steps given input.
Make a request to the underlying platform to:
Let opImpl be an implementation-defined platform operator for the slice operation, given starts and sizes.
Store a reference of opImpl in output.[[operator]].
Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
Store a reference to outputImpl in output.[[operand]].
Connect input.[[operand]] as input to opImpl.
Connect output.[[operand]] as output to opImpl.
Return output.
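The constraints on starts and sizes described for this method can be checked before calling slice(). The following plain-JavaScript sketch works on shape arrays only; the helper name sliceOutputShape is hypothetical, and since the steps above leave this validation implementation-defined, the checks shown are one possible reading of the argument descriptions rather than required behavior.

```javascript
// Hypothetical helper, not part of the WebNN API: validates starts and sizes
// against an input shape and returns the shape of the resulting slice.
function sliceOutputShape(inputShape, starts, sizes) {
  const rank = inputShape.length;
  if (starts.length !== rank || sizes.length !== rank) {
    throw new RangeError('starts and sizes must have one entry per dimension');
  }
  for (let d = 0; d < rank; d++) {
    // The starting index must be in the range [0, input size - 1].
    if (starts[d] < 0 || starts[d] > inputShape[d] - 1) {
      throw new RangeError('starts[' + d + '] is out of range');
    }
    // The size must not be 0, and start + size must not exceed the input size.
    if (sizes[d] === 0 || starts[d] + sizes[d] > inputShape[d]) {
      throw new RangeError('sizes[' + d + '] is out of range');
    }
  }
  // The output keeps the rank of the input; each dimension has sizes[d] elements.
  return [...sizes];
}
```

For example, slicing a [4, 6] tensor with starts [1, 2] and sizes [2, 3] gives a [2, 3] tensor.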
7.6.33. The softmax() method
Compute the softmax values of the 2-D input tensor along axis 1.
partial interface MLGraphBuilder { MLOperand softmax(MLOperand input); MLActivation softmax(); };
The behavior of this operation can be generically emulated using other operations, as follows. However, user agents typically have a more efficient implementation for it, so its use is encouraged for performance reasons.
// This sample deploys a well-known implementation trick [1] to compute the
// exponentials of the distances to the max value, instead of the exponentials
// of the input values themselves, in order to increase the numerical stability
// of the result.
// [1]: https://cs231n.github.io/linear-classify/#softmax
const max_x = builder.reduceMax(x, { axes: [1], keepDimensions: true });
const exp_x = builder.exp(builder.sub(x, max_x));
return builder.div(exp_x, builder.reduceSum(exp_x, { axes: [1], keepDimensions: true }));
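To see why subtracting the row maximum helps, the same computation can be written against nested JavaScript arrays instead of MLOperand objects. This is only an illustrative sketch; the function name softmax2d is hypothetical and is not part of the API.

```javascript
// Hypothetical reference implementation on plain arrays: softmax along axis 1
// of a 2-D input, using the max-subtraction trick for numerical stability.
function softmax2d(rows) {
  return rows.map((row) => {
    const maxValue = Math.max(...row);
    // exp(v - max) is at most 1, so it cannot overflow to Infinity.
    const exps = row.map((v) => Math.exp(v - maxValue));
    const sum = exps.reduce((a, b) => a + b, 0);
    return exps.map((e) => e / sum);
  });
}
```

Without the subtraction, an input row such as [1000, 1000] would evaluate Math.exp(1000), which overflows to Infinity and produces NaN after the division. With it, the row evaluates to [0.5, 0.5].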
7.6.33.1. The softmax(input) method
Arguments:
- input: an MLOperand. The input 2-D tensor.
Returns:
- an MLOperand. The output 2-D tensor that contains the softmax results, of the same shape as the input tensor.
The softmax(input) steps are:
If input is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
Let output be the result of invoking the copy MLOperand steps given input.
Make a request to the underlying platform to:
Let opImpl be an implementation-defined platform operator for the softmax operation.
Store a reference of opImpl in output.[[operator]].
Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
Store a reference to outputImpl in output.[[operand]].
Connect input.[[operand]] as input to opImpl.
Connect output.[[operand]] as output to opImpl.
Return output.
7.6.33.2. The softmax() method
Arguments:
- None.
Returns:
- an MLActivation. The activation function representing the softmax operation.
The softmax() method steps are:
Let op be the result of invoking the create MLActivation steps with "softmax".
If that throws an error, re-throw the error and abort these steps.
Return op.
7.6.34. The softplus() method
Compute the softplus function of the input tensor. The calculation follows the expression ln(1 + exp(steepness * x)) / steepness.
dictionary MLSoftplusOptions { float steepness = 1; };
partial interface MLGraphBuilder { MLOperand softplus(MLOperand input, optional MLSoftplusOptions options = {}); MLActivation softplus(optional MLSoftplusOptions options = {}); };
The behavior of this operation can be generically emulated using other operations, as follows. However, user agents typically have a more efficient implementation for it, so its use is encouraged for performance reasons.
return builder.div(builder.log(builder.add(builder.exp(builder.mul(x, builder.constant(options.steepness))), builder.constant(1))), builder.constant(options.steepness));
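The effect of the steepness parameter is easier to see on scalar values. The sketch below evaluates the expression above directly with the JavaScript Math functions; the function name softplusScalar is hypothetical and the code is an illustration only, not how a user agent would implement the operation.

```javascript
// Hypothetical scalar reference: ln(1 + exp(steepness * x)) / steepness.
function softplusScalar(x, steepness = 1) {
  return Math.log(1 + Math.exp(steepness * x)) / steepness;
}
```

At x = 0 the result is ln(2) / steepness, and for large positive x the result approaches x. A direct evaluation like this overflows once steepness * x exceeds roughly 709, because Math.exp then returns Infinity.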
MLSoftplusOptions has the following members:
- steepness, of type float, defaulting to 1: A float scalar parameter. The default value is 1.
To check softplus options given options, run the following steps:
If options is not an object, then return false.
If options.steepness is undefined, set options.steepness to 1.
Else if options.steepness is not a numeric type, then return false.
Return true.
7.6.34.1. The softplus(input, options) method
Arguments:
- input: an MLOperand. The input tensor.
- options: an optional MLSoftplusOptions. The optional parameters of the operation.
Returns:
- an MLOperand. The output tensor of the same shape as input.
The softplus(input, options) method steps are:
Let input be the first argument.
Let options be the second argument.
If running the check softplus options steps with options returns false, then throw a "TypeError" DOMException and abort these steps.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
Let output be the result of invoking the copy MLOperand steps given input.
Make a request to the underlying platform to:
Let opImpl be an implementation-defined platform operator for the softplus operation, given options.
Store a reference of opImpl in output.[[operator]].
Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
Store a reference to outputImpl in output.[[operand]].
Connect input.[[operand]] as input to opImpl.
Connect output.[[operand]] as output to opImpl.
Return output.
7.6.34.2. The softplus(options) method
Arguments:
- options: an optional MLSoftplusOptions. The optional parameters of the operation.
Returns:
- an MLActivation. The activation function representing the softplus operation.
The softplus(options) method steps are:
Let options be the first argument.
If running the check softplus options steps with options returns false, then throw a "TypeError" DOMException and abort these steps.
Let op be the result of invoking the create MLActivation steps with "softplus" and options.
If that throws an error, re-throw the error and abort these steps.
Return op.
7.6.35. The softsign() method
Compute the softsign function of the input tensor. The calculation follows the expression x / (1 + |x|).
partial interface MLGraphBuilder { MLOperand softsign(MLOperand input); MLActivation softsign(); };
The behavior of this operation can be generically emulated using other operations, as follows. However, user agents typically have a more efficient implementation for it, so its use is encouraged for performance reasons.
return builder.div(x, builder.add(builder.constant(1), builder.abs(x)));
7.6.35.1. The softsign(input) method
Arguments:
- input: an MLOperand. The input tensor.
Returns:
- an MLOperand. The output tensor of the same shape as input.
The softsign(input) steps are:
If input is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
Let output be the result of invoking the copy MLOperand steps given input.
Make a request to the underlying platform to:
Let opImpl be an implementation-defined platform operator for the softsign operation.
Store a reference of opImpl in output.[[operator]].
Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
Store a reference to outputImpl in output.[[operand]].
Connect input.[[operand]] as input to opImpl.
Connect output.[[operand]] as output to opImpl.
Return output.
7.6.35.2. The softsign() method
Arguments:
- None.
Returns:
- an MLActivation. The activation function representing the softsign operation.
The softsign() method steps are:
Let op be the result of invoking the create MLActivation steps with "softsign".
If that throws an error, re-throw the error and abort these steps.
Return op.
7.6.36. The split() method
Split the input tensor into a number of sub tensors along the given axis.
dictionary MLSplitOptions { unsigned long axis = 0; };
partial interface MLGraphBuilder { sequence<MLOperand> split(MLOperand input, (unsigned long or sequence<unsigned long>) splits, optional MLSplitOptions options = {}); };
Arguments:
- input: an MLOperand. The input tensor.
- splits: an unsigned long or a sequence of unsigned long. If an unsigned long, it specifies the number of output tensors along the axis. The number must evenly divide the dimension size of input along options.axis. If a sequence of unsigned long, it specifies the sizes of each output tensor along options.axis. The sum of sizes must equal the dimension size of input along options.axis.
- options: an optional MLSplitOptions. The optional parameters of the operation.
Returns:
- a sequence of MLOperand. The split output tensors. If splits is an unsigned long, the length of the output sequence equals splits. The shape of each output tensor is the same as input, except that the dimension size along axis equals the quotient of dividing the dimension size of input along axis by splits. If splits is a sequence of unsigned long, the length of the output sequence equals the length of splits. The shape of the i-th output tensor is the same as input except along axis, where the dimension size is splits[i].
MLSplitOptions has the following members:
- axis, of type unsigned long, defaulting to 0: An unsigned long scalar. The dimension along which to split. Its value must be in the range [0, N-1] where N is the rank of the input tensor. The default value is 0.
The split(input, splits, options) steps are:
If input is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If options is undefined, let options be an empty object.
If splits is not unsigned long or a sequence of unsigned long, then throw a "TypeError" DOMException and stop.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
Let output be the result of invoking the copy MLOperand steps given input.
Make a request to the underlying platform to:
Let opImpl be an implementation-defined platform operator for the split operation, given splits and options.
Store a reference of opImpl in output.[[operator]].
Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
Store a reference to outputImpl in output.[[operand]].
Connect input.[[operand]] as input to opImpl.
Connect output.[[operand]] as output to opImpl.
Return output.
The behavior of this operation can be generically emulated using other operations, as follows. However, user agents typically have a more efficient implementation for it, so its use is encouraged for performance reasons.
// This sample shows the case that the splits parameter is an array.
const outputs = [];
const starts = Array(input_rank).fill(0);
// Copy the shape so that input_shape itself is not modified by the loop.
const sizes = [...input_shape];
let start = 0;
for (const size of splits) {
  starts[options.axis] = start;
  sizes[options.axis] = size;
  outputs.push(builder.slice(input, starts, sizes));
  start += size;
}
return outputs;
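The two forms of the splits argument lead to different output shapes, which the following plain-JavaScript sketch computes from shape arrays alone. The helper name splitOutputShapes is hypothetical; the checks mirror the constraints stated in the argument descriptions and are not taken from the method steps.

```javascript
// Hypothetical helper, not part of the WebNN API: returns the shape of each
// output tensor of split() for a given input shape, splits value and axis.
function splitOutputShapes(inputShape, splits, axis = 0) {
  if (axis < 0 || axis > inputShape.length - 1) {
    throw new RangeError('axis must be in the range [0, N-1]');
  }
  const dimension = inputShape[axis];
  let sizes;
  if (Array.isArray(splits)) {
    // A sequence gives the size of each output along the axis.
    if (splits.reduce((a, b) => a + b, 0) !== dimension) {
      throw new RangeError('the sum of splits must equal the dimension size');
    }
    sizes = splits;
  } else {
    // A number gives the count of equally sized outputs.
    if (splits === 0 || dimension % splits !== 0) {
      throw new RangeError('splits must evenly divide the dimension size');
    }
    sizes = Array(splits).fill(dimension / splits);
  }
  return sizes.map((size) => {
    const shape = [...inputShape];
    shape[axis] = size;
    return shape;
  });
}
```

For a [6, 4] input, splitOutputShapes([6, 4], 3) gives three [2, 4] shapes, while splitOutputShapes([6, 4], [1, 3], 1) gives [6, 1] and [6, 3].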
7.6.37. The squeeze() method
Reduce the rank of a tensor by eliminating dimensions with size 1 of the tensor shape. Squeeze only affects the tensor’s logical dimensions. It does not copy or change the content in the tensor.
dictionary MLSqueezeOptions { sequence<unsigned long> axes; };
partial interface MLGraphBuilder { MLOperand squeeze(MLOperand input, optional MLSqueezeOptions options = {}); };
Arguments:
- input: an MLOperand. The input tensor.
- options: an optional MLSqueezeOptions. The optional parameters of the operation.
Returns:
- an MLOperand. The output tensor of the same or reduced rank with the shape dimensions of size 1 eliminated.
MLSqueezeOptions has the following members:
- axes, of type sequence<unsigned long>: A sequence of unsigned long. Specifies the indices to the shape dimensions of size 1 to eliminate. The values in the sequence must be in the range [0, N-1] where N is the rank of the input tensor. When not specified, every shape dimension of size 1 in the tensor is eliminated.
The squeeze(input, options) steps are:
If input is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If options is undefined, let options be an empty object.
If options.axes exists, then:
Let dimensions be input.[[descriptor]].dimensions.
For index between 0 and the size of options.axes:
Let oneDimIndex be options.axes[index].
If dimensions[oneDimIndex] is not 1, then throw a "TypeError" DOMException and stop.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
Let output be the result of invoking the copy MLOperand steps given input.
Make a request to the underlying platform to:
Let opImpl be an implementation-defined platform operator for the squeeze operation, given options.
Store a reference of opImpl in output.[[operator]].
Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
Store a reference to outputImpl in output.[[operand]].
Connect input.[[operand]] as input to opImpl.
Connect output.[[operand]] as output to opImpl.
Return output.
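The effect of the axes option on the resulting shape can be sketched in plain JavaScript on shape arrays. The helper name squeezeShape is hypothetical; the size-1 check follows the steps above, while the range check is an assumption drawn from the description of axes.

```javascript
// Hypothetical helper, not part of the WebNN API: returns the shape that
// squeeze() would be expected to produce for a given input shape and axes.
function squeezeShape(inputShape, axes) {
  if (axes === undefined) {
    // Without axes, every dimension of size 1 is eliminated.
    return inputShape.filter((size) => size !== 1);
  }
  for (const axis of axes) {
    if (axis < 0 || axis > inputShape.length - 1) {
      throw new RangeError('axes values must be in the range [0, N-1]');
    }
    if (inputShape[axis] !== 1) {
      throw new TypeError('only dimensions of size 1 can be eliminated');
    }
  }
  return inputShape.filter((_, index) => !axes.includes(index));
}
```

For a [1, 3, 1, 2] input, omitting axes gives [3, 2], while axes [0] gives [3, 1, 2].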
7.6.38. The tanh() method
Compute the hyperbolic tangent function of the input tensor. The calculation follows the expression (exp(2 * x) - 1) / (exp(2 * x) + 1).
partial interface MLGraphBuilder { MLOperand tanh(MLOperand input); MLActivation tanh(); };
The behavior of this operation can be generically emulated using other operations, as follows. However, user agents typically have a more efficient implementation for it, so its use is encouraged for performance reasons.
return builder.div(builder.sub(builder.exp(builder.mul(builder.constant(2), x)), builder.constant(1)), builder.add(builder.exp(builder.mul(builder.constant(2), x)), builder.constant(1)));
7.6.38.1. The tanh(input) method
Arguments:
- input: an MLOperand. The input tensor.
Returns:
- an MLOperand. The output tensor of the same shape as input.
The tanh(input) steps are:
If input is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
Let output be the result of invoking the copy MLOperand steps given input.
Make a request to the underlying platform to:
Let opImpl be an implementation-defined platform operator for the hyperbolic tangent operation.
Store a reference of opImpl in output.[[operator]].
Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
Store a reference to outputImpl in output.[[operand]].
Connect input.[[operand]] as input to opImpl.
Connect output.[[operand]] as output to opImpl.
Return output.
7.6.38.2. The tanh() method
Arguments:
- None.
Returns:
- an MLActivation. The activation function representing the tanh operation.
The tanh() method steps are:
Let op be the result of invoking the create MLActivation steps with "tanh".
If that throws an error, re-throw the error and abort these steps.
Return op.
7.6.39. The transpose() method
Permute the dimensions of the input tensor according to the permutation argument.
dictionary MLTransposeOptions { sequence<unsigned long> permutation; };
partial interface MLGraphBuilder { MLOperand transpose(MLOperand input, optional MLTransposeOptions options = {}); };
Arguments:
- input: an MLOperand. The input N-D tensor.
- options: an optional MLTransposeOptions. The optional parameters of the operation.
Returns:
- an MLOperand. The permuted or transposed N-D tensor.
MLTransposeOptions has the following members:
- permutation, of type sequence<unsigned long>: A sequence of unsigned long values. Specifies the values used to permute the output shape. The default value is [N-1, ..., 0], where N is the rank of the input tensor, e.g. [2,1,0] for a 3-D tensor. These default values cause the output to become a transposed tensor of the input. When specified, the number of values in the sequence must be the same as the rank of the input tensor, and the values in the sequence must be within the range from 0 to N-1 with no duplicate values.
The transpose(input, options) steps are:
If input is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If options is undefined, let options be an empty object.
If options.permutation is undefined, let options.permutation be the reversed sequence of all indices for input.[[descriptor]].dimensions.
Otherwise if options.permutation exists:
If options.permutation is not a sequence of unsigned long, then throw a "TypeError" DOMException and stop.
If the rank of options.permutation is not the same as the rank of input.[[descriptor]].dimensions, then throw a "TypeError" DOMException and stop.
If the values in options.permutation are not between 0 and the rank of input.[[descriptor]].dimensions minus 1, then throw a "TypeError" DOMException and stop.
If the values in options.permutation contain duplicate values, then throw a "TypeError" DOMException and stop.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
Let output be the result of invoking the copy MLOperand steps given input.
Make a request to the underlying platform to:
Let opImpl be an implementation-defined platform operator for the transpose operation, given options.
Store a reference of opImpl in output.[[operator]].
Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
Store a reference to outputImpl in output.[[operand]].
Connect input.[[operand]] as input to opImpl.
Connect output.[[operand]] as output to opImpl.
Return output.
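The permutation checks above, together with the default reversed permutation, can be sketched in plain JavaScript on shape arrays. The helper name transposeShape is hypothetical, and the sketch assumes the common convention that output dimension i takes the size of input dimension permutation[i].

```javascript
// Hypothetical helper, not part of the WebNN API: validates a permutation and
// returns the shape of the transposed tensor.
function transposeShape(inputShape, permutation) {
  const rank = inputShape.length;
  // Default: the reversed sequence of all indices, [N-1, ..., 0].
  const perm = permutation ?? inputShape.map((_, i) => rank - 1 - i);
  if (perm.length !== rank) {
    throw new TypeError('permutation must have one value per dimension');
  }
  if (perm.some((p) => !Number.isInteger(p) || p < 0 || p > rank - 1)) {
    throw new TypeError('permutation values must be in the range [0, N-1]');
  }
  if (new Set(perm).size !== rank) {
    throw new TypeError('permutation must not contain duplicate values');
  }
  // Assumption: output dimension i has the size of input dimension perm[i].
  return perm.map((p) => inputShape[p]);
}
```

For a [2, 3, 4] input, the default permutation gives [4, 3, 2], and the permutation [0, 2, 1] gives [2, 4, 3].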
8. Examples
const context = await navigator.ml.createContext({ powerPreference: 'low-power' });
constant1 ---+
             +--- Add ---> intermediateOutput1 ---+
input1    ---+                                    |
                                                  +--- Mul---> output
constant2 ---+                                    |
             +--- Add ---> intermediateOutput2 ---+
input2    ---+
The following code implements the graph:
// Use tensors in 4 dimensions.
const TENSOR_DIMS = [1, 2, 2, 2];
const TENSOR_SIZE = 8;
const builder = new MLGraphBuilder(context);
// Create MLOperandDescriptor object.
const desc = { type: 'float32', dimensions: TENSOR_DIMS };
// constant1 is a constant MLOperand with the value 0.5.
const constantBuffer1 = new Float32Array(TENSOR_SIZE).fill(0.5);
const constant1 = builder.constant(desc, constantBuffer1);
// input1 is one of the input MLOperands. Its value will be set before execution.
const input1 = builder.input('input1', desc);
// constant2 is another constant MLOperand with the value 0.5.
const constantBuffer2 = new Float32Array(TENSOR_SIZE).fill(0.5);
const constant2 = builder.constant(desc, constantBuffer2);
// input2 is another input MLOperand. Its value will be set before execution.
const input2 = builder.input('input2', desc);
// intermediateOutput1 is the output of the first Add operation.
const intermediateOutput1 = builder.add(constant1, input1);
// intermediateOutput2 is the output of the second Add operation.
const intermediateOutput2 = builder.add(constant2, input2);
// output is the output MLOperand of the Mul operation.
const output = builder.mul(intermediateOutput1, intermediateOutput2);
// Compile the constructed graph.
const graph = await builder.build({ 'output': output });
The following code executes the compiled graph.
// Setup the input buffers with value 1.
const inputBuffer1 = new Float32Array(TENSOR_SIZE).fill(1);
const inputBuffer2 = new Float32Array(TENSOR_SIZE).fill(1);
const outputBuffer = new Float32Array(TENSOR_SIZE);
// Execute the compiled graph with the specified inputs.
const inputs = {
  'input1': inputBuffer1,
  'input2': inputBuffer2,
};
const outputs = { 'output': outputBuffer };
const result = await context.compute(graph, inputs, outputs);
console.log('Output value: ' + result.outputs.output);
// Output value: 2.25,2.25,2.25,2.25,2.25,2.25,2.25,2.25
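The expected output value follows from the graph arithmetic: each element is (0.5 + 1) * (0.5 + 1) = 2.25. As a cross-check that does not need a WebNN implementation, the same element-wise computation can be sketched with typed arrays alone. The function name expectedExampleOutput is hypothetical.

```javascript
// Hypothetical cross-check of the example graph using plain typed arrays:
// output = (constant1 + input1) * (constant2 + input2), element by element.
function expectedExampleOutput(tensorSize) {
  const constant1 = new Float32Array(tensorSize).fill(0.5);
  const constant2 = new Float32Array(tensorSize).fill(0.5);
  const input1 = new Float32Array(tensorSize).fill(1);
  const input2 = new Float32Array(tensorSize).fill(1);
  const output = new Float32Array(tensorSize);
  for (let i = 0; i < tensorSize; i++) {
    output[i] = (constant1[i] + input1[i]) * (constant2[i] + input2[i]);
  }
  return output;
}

console.log('Expected value: ' + expectedExampleOutput(8));
// Expected value: 2.25,2.25,2.25,2.25,2.25,2.25,2.25,2.25
```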
9. Appendices
9.1. MLOperandType and ArrayBufferView compatibility
| MLOperandType | ArrayBufferView |
|---|---|
| float32 | Float32Array |
| float16 | Float16Array |
| int32 | Int32Array |
| uint32 | Uint32Array |
| int8 | Int8Array |
| uint8 | Uint8Array |
Float16Array is at ECMA Stage 3, signaling that its design is finished. Implementers wanting to enable this type ahead of native implementations can emulate the type by passing raw bits via Uint16Array. [Issue webnn#373]
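One way to produce the raw bits mentioned above is to convert each number to its IEEE 754 half-precision bit pattern and store it in a Uint16Array. The following is a minimal sketch, assuming round-to-nearest-even conversion from single precision; the function name float32ToFloat16Bits is hypothetical and this specification does not define the conversion.

```javascript
// Hypothetical helper: converts a number to the 16-bit pattern of the nearest
// IEEE 754 half-precision value (round to nearest, ties to even).
function float32ToFloat16Bits(value) {
  const f32 = new Float32Array(1);
  const u32 = new Uint32Array(f32.buffer);
  f32[0] = value;
  const bits = u32[0];
  const sign = (bits >>> 16) & 0x8000;
  const exponent = (bits >>> 23) & 0xff;
  let mantissa = bits & 0x7fffff;
  if (exponent === 0xff) {
    // Infinity or NaN.
    return sign | 0x7c00 | (mantissa ? 0x200 : 0);
  }
  // Re-bias the exponent from single (127) to half (15) precision.
  const halfExponent = exponent - 127 + 15;
  if (halfExponent >= 0x1f) {
    return sign | 0x7c00; // Too large: overflow to infinity.
  }
  if (halfExponent <= 0) {
    if (halfExponent < -10) {
      return sign; // Too small: underflow to signed zero.
    }
    // Subnormal half value: shift the mantissa with its implicit leading 1.
    mantissa |= 0x800000;
    const shift = 14 - halfExponent;
    let half = mantissa >>> shift;
    const remainder = mantissa & ((1 << shift) - 1);
    const halfway = 1 << (shift - 1);
    if (remainder > halfway || (remainder === halfway && (half & 1))) {
      half++;
    }
    return sign | half;
  }
  let half = (halfExponent << 10) | (mantissa >>> 13);
  const remainder = mantissa & 0x1fff;
  // A carry out of the mantissa correctly increments the exponent.
  if (remainder > 0x1000 || (remainder === 0x1000 && (half & 1))) {
    half++;
  }
  return sign | half;
}

// Pack values into a Uint16Array that carries the float16 bit patterns.
const float16Bits = Uint16Array.from([1, -2, 0.5], (v) => float32ToFloat16Bits(v));
```

With this sketch, 1 maps to 0x3C00, -2 to 0xC000 and 0.5 to 0x3800. Whether a given implementation accepts such a buffer for a float16 operand is outside what this appendix states.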
10. Acknowledgements
This specification follows the concepts of the Android Neural Networks API C API.
Thanks to Tomoyuki Shimizu, Ningxin Hu, Zhiqiang Yu and Belem Zhang for the use cases.
Thanks to Nikhil Thorat, Daniel Smilkov, Ganesan Ramalingam, Rafael Cintron and Benjamin Poulain for their contributions to the API specification.
Thanks to Sangwhan Moon and the W3C Technical Architecture Group for review of this specification for web architecture fit, design consistency and developer ergonomics.
Thanks to W3C Privacy Interest Group for privacy and security review and feedback.
Thanks to Alex Gough and the Chrome Security team for security review and questions.
Thanks to Michal Karzynski for sharing practical guidelines and learnings from ONNX.
Thanks to Kaustubha Govind and Chrome privacy reviewers for feedback and privacy considerations.
Thanks to Jiewei Qian for Chromium implementation review and feedback.