Web Neural Network API

1. Introduction

The Web Neural Network API defines a web-friendly hardware-agnostic abstraction layer that makes use of Machine Learning capabilities of operating systems and underlying hardware platforms without being tied to platform-specific capabilities. The abstraction layer addresses the requirements of key Machine Learning JavaScript frameworks and also allows web developers familiar with the ML domain to write custom code without the help of libraries. A complementary Model Loader API defines a higher-level abstraction targeting primarily web developers.

For an illustrated introduction, please see the explainer.

2. Use cases

2.1. Application Use Cases

This section illustrates application-level use cases for neural network inference hardware acceleration. All applications in these use cases can be built on top of pre-trained deep neural network (DNN) [models].

Note: Please be aware that some of the use cases described here are, by their very nature, privacy-invasive. Developers who plan to use the API for such use cases should ensure that the API is used to benefit users, for purposes that users understand and approve. They should apply the Ethical Principles for Web Machine Learning [webmachinelearning-ethics] and implement appropriate privacy risk mitigations such as transparency, data minimisation, and user controls.

2.1.1. Person Detection

A user opens a web-based video conferencing application, but she temporarily leaves her room. The application checks whether she is in front of her PC by using object detection (for example, approaches such as [SSD] or [YOLO] that use a single DNN) to detect regions in a camera input frame that include persons.

When she comes back, the application automatically detects her and notifies other online users that she is active now.

2.1.2. Semantic Segmentation

A user joins a teleconference via a web-based video conferencing application at her desk since no meeting room in her office is available. During the teleconference, she does not want her room and the people in the background to be visible. To protect the privacy of the other people and the surroundings, the application runs a machine learning model such as [DeepLabv3+] or [MaskR-CNN] to semantically split an image into segments and replaces the segments that represent other people and the background with another picture.

2.1.3. Skeleton Detection

A web-based video conferencing application tracks the pose of a user’s skeleton by running a machine learning model that allows for real-time human pose estimation, such as [PoseNet], to recognize her gestures and body language. When she raises her hand, her microphone is automatically unmuted and she can start speaking on the teleconference.

2.1.4. Face Recognition

There are multiple people in the conference room and they join an online meeting using a web-based video conferencing application. The application detects the faces of participants by using object detection (for example, approaches such as [SSD]) and checks whether each face was present at the previous meeting by running a machine learning model such as [FaceNet], which verifies whether two faces are identical.

2.1.5. Facial Landmark Detection

A user wants to find new glasses that fit her well in an online glasses store. The online store offers a web-based try-on simulator that runs a machine learning model such as Face Alignment Network [FAN] to detect facial landmarks like eyes, nose, mouth, etc. When she chooses a pair of glasses, the simulator renders the selected glasses at the detected position of the eyes on her facial image.

2.1.6. Style Transfer

A user is looking for cosmetics in an online store and wondering which color may suit her face. The online store shows sample facial makeup images of cosmetics, and offers a makeup simulator that runs a machine learning model like [ContextualLoss] or [PairedCycleGAN] to transfer the makeup style of the sample makeup image to her facial image. She can use the simulator to check how the selected makeup looks on her face.

2.1.7. Super Resolution

A web-based video conferencing application is receiving a video stream from its peer, but the resolution of the video drops due to network congestion. To prevent degradation of the perceived video quality, the application runs a machine learning model for super-resolution such as [SRGAN] to generate higher-resolution video frames.

2.1.8. Image Captioning

For better accessibility, a web-based presentation application provides automatic image captioning by running a machine learning model such as [im2txt], which predicts explanatory words for the presentation slides.

2.1.9. Machine Translation

Multiple people from various countries are talking via a web-based real-time text chat application. The application translates their conversation by using a machine learning model such as [GNMT] or [OpenNMT], which translates each text into a different language.

2.1.10. Emotion Analysis

A user is talking to her friend via a web-based real-time text chat application, and she is wondering how the friend feels because she cannot see the friend’s face. The application analyses the friend’s emotion by using a machine learning model such as [DeepMoji], which infers emotion from input texts, and displays an emoji that represents the estimated emotion.

2.1.11. Video Summarization

A web-based video conferencing application records received video streams, and it needs to reduce the amount of recorded video data to be stored. The application generates a short version of the recorded video by using a machine learning model for video summarization such as [Video-Summarization-with-LSTM].

2.1.12. Noise Suppression

A web-based video conferencing application records received audio streams, but background noise is usually present. The application applies real-time noise suppression using a recurrent neural network such as [RNNoise] to suppress dynamic background noise, like a baby crying or a dog barking, and improve the audio experience in video conferences.

2.1.13. Detecting fake video

A user is exposed to realistic fake videos generated by ‘deepfake’ techniques on the web. A fake video can swap the speaker’s face with the president’s face to incite a user politically or to manipulate the user’s opinion. Deepfake detection applications such as [FaceForensics++] analyze the videos and protect the user against fake videos or images. When she watches a fake video on the web, the detection application alerts her to the fraudulent video in real time.

2.2. Framework Use Cases

This section collects framework-level use cases for a dedicated low-level API for neural network inference hardware acceleration. It is expected that Machine Learning frameworks will be key consumers of the Web Neural Network API (WebNN API) and the low-level details exposed through the WebNN API are abstracted out from typical web developers. However, it is also expected that web developers with specific interest and competence in Machine Learning will want to interface with the WebNN API directly instead of a higher-level ML framework.

2.2.1. Custom Layer

A web application developer wants to run a DNN model on the WebNN API. However, she has found that some activation functions like [LeakyReLU], [ELU], etc. are not included in the WebNN API. To address this issue, she constructs custom layers for the additional activation functions on top of the WebNN API. Note that the scope of custom layers may include convolution, normalization, etc. as well as activation.

2.2.2. Network Concatenation

A web application uses a DNN model whose upper convolutional layers and lower fully-connected layers are stored in separate files, since the model data of the fully-connected layers is periodically updated due to fine tuning at the server side.

Therefore, the application first downloads both partial model files and concatenates them into a single model. When the model is updated, the application downloads the fine-tuned part of the model and replaces only the fully-connected layers with it.

2.2.3. Performance Adaptation

A web application developer is concerned about the performance of her DNN model on mobile devices. She has confirmed that it may run too slowly on mobile devices which do not have GPU acceleration. To address this issue, her web application queries the WebNN API to confirm whether acceleration is available, so that the application can display a warning on devices without acceleration.

After several weeks, she has developed a tiny DNN model that can run even on a CPU. To accommodate CPU execution, she modifies the application so that it loads the tiny model on CPU-only devices.
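The adaptation described above could be sketched as follows. This is a minimal illustration, not a prescribed pattern: the model file names and the `selectModel()` helper are hypothetical, and the sketch assumes that a failed request for a "gpu" context surfaces as a rejected promise from `createContext()`.

```javascript
// Hypothetical model file names; the use case does not name any.
const FULL_MODEL_URL = 'model-full.bin';
const TINY_MODEL_URL = 'model-tiny.bin';

// Picks a model depending on whether a "gpu" context can be created.
// `ml` is expected to look like navigator.ml (it is passed in so the
// function can also be exercised outside a browser).
async function selectModel(ml) {
  if (!ml || typeof ml.createContext !== 'function') {
    // WebNN is not available at all in this environment.
    return { modelUrl: TINY_MODEL_URL, accelerated: false, context: null };
  }
  try {
    const context = await ml.createContext({ deviceType: 'gpu' });
    return { modelUrl: FULL_MODEL_URL, accelerated: true, context };
  } catch (err) {
    // Assumption: the request for a GPU context was rejected, so fall
    // back to a CPU context and the tiny model.
    const context = await ml.createContext({ deviceType: 'cpu' });
    return { modelUrl: TINY_MODEL_URL, accelerated: false, context };
  }
}
```

In a page, the application would call `selectModel(navigator.ml)` and show its warning when `accelerated` is false.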

2.2.4. Operation Level Execution

A JavaScript ML framework is responsible for loading, interpreting and executing an ML model. During the model execution phase, the framework iterates through the operations of the model and executes each operation on a hardware device, such as a CPU, GPU or ML accelerator. To avoid unnecessary data copying across devices, the framework selects the same device to execute the operations. For a compute-intensive operation, such as 2D convolution or matrix multiplication, the framework uses the WebNN API to execute it with the ML-specific acceleration available on the selected device.

2.2.5. Integration with real-time video processing

The user experience of WebRTC-based video conferencing is enhanced using real-time video processing. For example, background blur implemented using a § 2.1.2 Semantic Segmentation model blurs the background in the user’s live camera feed. To satisfy the performance requirements of this use case, the WebNN API integrates with primitives from other Web APIs that make up the media pipeline to allow WebNN API-based transformation of real-time video streams.

3. Security Considerations

This specification defines a low-level API for neural network inference hardware acceleration. This API is considered a powerful feature [POWERFUL-FEATURES] because it grants low-level access to a user’s computer. To meet the authentication and confidentiality expectations of a powerful feature and to prevent man-in-the-middle attacks, all interfaces defined by this specification are only available in a secure context.

This API is disabled by default in all cross-origin frames using the § 7.2.1 Permissions Policy Integration. This prevents third-party content from using this API unless the embedding page explicitly sets a policy that grants permission.

This API allows creation of an MLContext from a GPUDevice defined by the WebGPU specification. See WebGPU Security Considerations for more information regarding the security characteristics of this context.

Once the graph is fully constructed and compiled, the input shapes into each of the operations in the graph are inferred and finalized. The bounds checking occurs when the compute method is invoked that executes the graph against the actual data. No actual data is bound to the compiled graph before this stage. It is the implementation’s responsibility to make sure proper bounds checking occurs against the shapes of the data already inferred by that time.

Document operations susceptible to out-of-bounds access as a guidance to implementers.

As a future-proofing measure, the API design allows certain operations that can be generically emulated to be deprecated for security, performance, or other reasons without breaking compatibility. This is made possible by high-level functions that are defined in terms of smaller primitive operations defined in this specification. This enables a native implementation of a high-level function to be replaced with a polyfill implementation.

Investigate side channel attack feasibility considering the current state where CPU is shared between processes running renderers.

In order to not allow an attacker to target a specific implementation that may contain a flaw, the § 6.2 Device Selection mechanism is a hint only, and the concrete device selection is left to the implementation - a user agent could for instance choose never to run a model on a device with known vulnerabilities. As a further mitigation, no device enumeration mechanism is defined.

Hinting partially mitigates the concern. Investigate additional mitigations.

The API design minimizes the attack surface for the compiled computational graph. The MLGraphBuilder interface that hosts the various operations is a data definition API and as such doesn’t execute anything; it only constructs data. It follows that the potential for an attack is limited to the point at which data is bound to the graph before executing it by invoking the MLContext.compute() method. This enables implementers to focus on hardening the MLContext.compute() method, for example by making sure it honors the boundary of data and fails appropriately when the bounds are not respected.

Purpose-built Web APIs for measuring high-resolution time mitigate timing attacks using techniques such as resolution reduction, adding jitter, detection of abuse and API call throttling [hr-time-3]. The practical deployment of WebNN implementations is likely to bring enough jitter to make timing attacks impractical (e.g. because they would use IPC), but implementers are advised to consider and test their implementations against timing attacks.

3.1. Guidelines for new operations

To ensure operations defined in this specification are shaped in a way they can be implemented securely, this section includes guidelines on how operations are expected to be defined to reduce potential for implementation problems. These guidelines are expected to evolve over time to align with industry best practices:

Prefer simplicity of arguments
Don’t use parsers for complex data formats
If an operation can be decomposed to low level primitives:
- Add an informative emulation path
- Prefer primitives over new high level operations but consider performance consequences
Operations should follow a consistent style for inputs and attributes
Operation families such as pooling and reduction should share API shape and options
Formalize failure cases into test cases whenever possible
When in doubt, leave it out: API surface should be as small as possible required to satisfy the use cases, but no smaller
Try to keep the API free of implementation details that might inhibit future evolution, do not overspecify
Fail fast: the sooner the web developer is informed of an issue, the better

In general, always consider the security and privacy implications as documented in [security-privacy-questionnaire] by the Technical Architecture Group and the Privacy Interest Group when adding new features.

4. Privacy Considerations

This API enhances privacy compared to cloud-based inference, since input data such as locally sourced images or video streams stay within the browser’s sandbox.

This API exposes the minimum amount of information necessary to address the identified § 2 Use cases for the best performance and reliability of results.

No information from the underlying platform is exposed directly. An execution time analysis may reveal indirectly the performance of the underlying platform’s neural network hardware acceleration capabilities relative to another underlying platform.

Note: The group is soliciting further input on the proposed execution time analysis fingerprinting vector and will augment this section with more information and mitigations to inform the implementers of this API.

Unlike WebGPU, this API does not intrinsically support custom shader authoring and, as a result, is not prone to timing attacks that rely on shader caches or other persistent data. The API builds upon pre-existing shaders and lower level primitives of the browser or the underlying OS. Web developers who interface with GPUDevice are expected to be aware of WebGPU compilation cache considerations.

The WebGPU API identifies machine-specific artifacts as a privacy consideration. Given the WebNN API defines means to record an ML workload onto a WebGPU-compatible GPUCommandBuffer, compute unit scheduling may under certain circumstances introduce a fingerprint. However, similarly to WebGPU, such fingerprints are identical across most or all of the devices of each vendor, mitigating the concern. Furthermore, software implementations can be used to further eliminate such artifacts.

The WebNN API defines two developer-settable preferences to help inform § 6.2 Device Selection and allow the implementation to better select the most appropriate underlying execution device for the workload. Device type normatively indicates the kind of device and is either "cpu" or "gpu". If this type cannot be satisfied, an "OperationError" DOMException is thrown, thus this type can in some cases add two bits of entropy to the fingerprint. Power preference indicates a preference related to power consumption and is considered a hint only, and as such does not increase the entropy of the fingerprint.

If a future version of this specification introduces support for a new device type that can only support a subset of MLOperandTypes, that may introduce a new fingerprint.

In general, implementers of this API are expected to apply WebGPU Privacy Considerations to their implementations where applicable.

5. Ethical Considerations

The Working Group has started documenting ethical issues associated with using Machine Learning on the Web, to help identify what mitigations its normative specifications should take into account. The Working Group publishes and maintains an Ethical Principles for Web Machine Learning document [webmachinelearning-ethics] open to contributions from the wider community via a dedicated GitHub repository.

6. Programming Model

6.1. Overview

At the heart of neural networks is a computational graph of mathematical operations. These operations are the building blocks of modern machine learning technologies in computer vision, natural language processing, and robotics. The WebNN API is a specification for constructing, compiling, and executing computational graphs of neural networks.

The MLGraph interface represents a compiled computational graph that is immutable (that is, a model).

The MLGraphBuilder interface serves as a builder (factory) to create an MLGraph. An MLOperand is a representation of data that flows within the computational graph, which includes input values for inference, constants (including trained weights) used for inference, intermediate values (often referred to as activations) computed during inference, as well as the output values of inference. At inference time, every MLOperand will be bound to a tensor (the actual data).

The MLGraphBuilder interface enables the creation of MLOperands. A key part of the MLGraphBuilder interface is its operations (such as MLGraphBuilder.gemm() and MLGraphBuilder.softmax()). The operations have functional semantics, with no side effects. Each operation invocation conceptually returns a distinct new value, without changing the value of any other MLOperand.

The runtime values (of MLOperands) are tensors, which are essentially multidimensional arrays. The representation of the tensors is implementation dependent, but it typically includes the array data stored in some buffer (memory) and some metadata describing the array data (such as its shape).

As mentioned above, the operations have functional semantics. This allows the implementation to potentially share the array data between multiple tensors. For example, the implementation of operations such as reshape, slice, or squeeze may return a view of its input tensor that shares the same buffer as the input tensor. (In the case of reshape or squeeze, the entire data is shared, while in the case of slice, a part of the input data is shared.) The implementation may use views, as above, for intermediate values.
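The idea of views that share a buffer can be illustrated with ordinary typed arrays. This is only an analogy: the `{ shape, data }` records below are hypothetical stand-ins for implementation tensors, which WebNN does not expose to script in this form.

```javascript
// A 2x3 "tensor": six float32 values stored row-major in one buffer.
const input = new Float32Array([1, 2, 3, 4, 5, 6]);

// "reshape" to 3x2: only the shape metadata changes, so the whole
// buffer can be shared.
const reshaped = { shape: [3, 2], data: input };

// "slice" of the second row: a view over part of the same buffer.
// subarray() creates a new typed array on the same ArrayBuffer.
const secondRow = { shape: [1, 3], data: input.subarray(3, 6) };

// Both results alias the input's storage; no element was copied.
const sharesBuffer =
  reshaped.data.buffer === input.buffer &&
  secondRow.data.buffer === input.buffer;
```

Because operations never modify their inputs, such aliasing is not observable through the graph, which is what makes the sharing safe.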

Before the execution, the computation graph that is used to compute one or more specified outputs needs to be compiled and optimized. The key purpose of the compilation step is to enable optimizations that span two or more operations, such as operation or loop fusion.

There are multiple ways by which the graph may be compiled. The MLGraphBuilder.build() method compiles the graph in the background without blocking the calling thread, and returns a Promise that resolves to an MLGraph. The MLGraphBuilder.buildSync() method compiles the graph immediately on the calling thread, which must be a worker thread running on a CPU or GPU device, and returns an MLGraph. Both compilation methods produce an MLGraph that represents a compiled graph for optimal execution.

Once the MLGraph is constructed, there are multiple ways by which the graph may be executed. The MLContext.computeSync() method executes the graph immediately on the calling thread, which must also be a worker thread, either on a CPU or GPU device. The execution produces the results of the computation from all the inputs bound to the graph.

The MLContext.compute() method executes the graph asynchronously, either on a parallel timeline in a separate worker thread for CPU execution or on a GPU timeline in a GPU command queue. This method returns immediately without blocking the calling thread while the actual execution is offloaded to a different timeline. This type of execution is appropriate when the responsiveness of the calling thread is critical to good user experience. The computation results are placed in the bound outputs when the operation completes successfully on the offloaded timeline, at which time the calling thread is signaled. This type of execution supports both CPU and GPU devices.

In both the MLContext.compute() and MLContext.computeSync() execution methods, the caller supplies the input values using MLNamedArrayBufferViews, binding the input MLOperands to their values. The caller then supplies pre-allocated buffers for output MLOperands using MLNamedArrayBufferViews.

The MLCommandEncoder interface created by the MLContext.createCommandEncoder() method supports a graph execution method that provides the maximum flexibility to callers that also utilize WebGPU in their application. It does this by placing the workload required to initialize and compute the results of the operations in the graph onto a GPUCommandBuffer. The callers are responsible for the eventual submission of this workload on the GPUQueue through the WebGPU queue submission mechanism. Once the submitted workload is completely executed, the result is available in the bound output buffers.
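The build-then-compute flow described above might look roughly like the sketch below. It is an illustration under stated assumptions rather than a normative example: the builder methods `input()` and `add()` are not defined in this part of the document and are assumed here, the helper name `addWithWebNN()` is hypothetical, and the way results are read back is hedged in the code. The function resolves to null where WebNN is not exposed.

```javascript
// Adds two 2x2 float32 tensors with WebNN when it is available.
// `env` defaults to the global object so that the feature check can be
// exercised in environments without WebNN.
async function addWithWebNN(a, b, env = globalThis) {
  const ml = env.navigator && env.navigator.ml;
  if (!ml || typeof env.MLGraphBuilder !== 'function') {
    return null; // WebNN is not exposed here.
  }

  // Default options: deviceType "cpu", powerPreference "default".
  const context = await ml.createContext();
  const builder = new env.MLGraphBuilder(context);

  // Describe the two named inputs (assumed builder.input() signature).
  const desc = { type: 'float32', dimensions: [2, 2] };
  const A = builder.input('A', desc);
  const B = builder.input('B', desc);

  // Operations only describe the graph; nothing executes yet.
  const C = builder.add(A, B);

  // Compile in the background; resolves to an immutable MLGraph.
  const graph = await builder.build({ C });

  // Bind input values and a pre-allocated output buffer, then execute.
  const outputs = { C: new Float32Array(4) };
  const result = await context.compute(graph, { A: a, B: b }, outputs);

  // Assumption: the results are either written to the bound buffer or
  // returned on the resolved value, depending on the implementation.
  return result && result.outputs ? result.outputs.C : outputs.C;
}
```

A caller would pass two `Float32Array` instances of length 4 and await the returned promise.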

6.2. Device Selection

An MLContext interface represents a global state of neural network execution. One of the important context states is the underlying execution device that manages the resources and facilitates the compilation and the eventual execution of the neural network graph. In addition to the default method of creation with MLContextOptions, an MLContext could also be created from a specific GPUDevice that is already in use by the application, in which case the corresponding GPUBuffer resources used as graph constants, as well as the GPUTexture as graph inputs must also be created from the same device. In a multi-adapter configuration, the device used for MLContext must be created from the same adapter as the device used to allocate the resources referenced in the graph.

When a GPU context executes a graph with a constant or an input in system memory as an ArrayBufferView, the input content is automatically uploaded from system memory to GPU memory, and downloaded back to the system memory of an ArrayBufferView output buffer at the end of the graph execution. These data upload and download cycles occur only when the execution device requires the data to be copied out of and back into system memory, as in the case of the GPU. They do not occur when the device is a CPU device. Additionally, the result of the graph execution is in a known layout format. While the execution may be optimized for a native memory access pattern in an intermediate result within the graph, the output of the last operation of the graph must convert the content back to a known layout format at the end of the graph in order to maintain the expected behavior from the caller’s perspective.

When an MLContext is created with MLContextOptions, the user agent selects and creates the underlying execution device by taking into account the application’s power preference and device type specified in the MLPowerPreference and MLDeviceType options.

The following table summarizes the types of resources supported by a context, depending on the method used to create it:

Creation method	ArrayBufferView	GPUBuffer	GPUTexture
MLContextOptions	Yes	No	No
GPUDevice	Yes	Yes	Yes

7. API

7.1. The navigator.ml interface

An ML object is available in the Window and DedicatedWorkerGlobalScope contexts through the Navigator and WorkerNavigator interfaces respectively and is exposed via navigator.ml.

interface mixin NavigatorML {
  [SecureContext, SameObject] readonly attribute ML ml;
};
Navigator includes NavigatorML;
WorkerNavigator includes NavigatorML;

7.2. The ML interface

enum MLDeviceType {
  "cpu",
  "gpu"
};
enum MLPowerPreference {
  "default",
  "high-performance",
  "low-power"
};
dictionary MLContextOptions {
  MLDeviceType deviceType = "cpu";
  MLPowerPreference powerPreference = "default";
};
[SecureContext, Exposed=(Window, DedicatedWorker)]
interface ML {
  Promise<MLContext> createContext(optional MLContextOptions options = {});
  Promise<MLContext> createContext(GPUDevice gpuDevice);
  [Exposed=(DedicatedWorker)]
  MLContext createContextSync(optional MLContextOptions options = {});
  [Exposed=(DedicatedWorker)]
  MLContext createContextSync(GPUDevice gpuDevice);
};

7.2.1. Permissions Policy Integration

This specification defines a policy-controlled feature identified by the string "webnn". Its default allowlist is 'self'.

7.2.2. The `createContext()` method

The createContext() method steps are:

If this's relevant global object's associated Document is not allowed to use the webnn feature, return a new promise rejected with a "SecurityError" DOMException and abort these steps.
Let promise be a new promise.
Return promise and run the following steps in parallel.
Let options be the first argument.
Run the create context steps given options:
1. Let context be a new MLContext object.
2. If options is a GPUDevice object,
  1. Set context.[[contextType]] to "webgpu".
  2. Set context.[[deviceType]] to "gpu".
  3. Set context.[[powerPreference]] to "default".
3. Otherwise,
  1. Set context.[[contextType]] to "default".
  2. If options["deviceType"] exists, then set context.[[deviceType]] to options["deviceType"]. Otherwise, set context.[[deviceType]] to "cpu".
  3. If options["powerPreference"] exists, then set context.[[powerPreference]] to options["powerPreference"]. Otherwise, set context.[[powerPreference]] to "default".
If the validate MLContext steps given context return false, reject promise with a "NotSupportedError" DOMException and abort these steps.
Resolve promise with context.
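The create context steps can be modeled as a small pure function. This is an illustrative sketch only: the plain object returned here stands in for the internal slots of an MLContext, and `isGPUDevice` is a hypothetical predicate replacing the "options is a GPUDevice object" check so that the logic can run outside a browser.

```javascript
// Models the "create context" steps. The returned object mirrors the
// [[contextType]], [[deviceType]] and [[powerPreference]] slots.
function createContextState(options = {}, isGPUDevice = () => false) {
  if (isGPUDevice(options)) {
    // A context created from a GPUDevice is always a "webgpu" context.
    return {
      contextType: 'webgpu',
      deviceType: 'gpu',
      powerPreference: 'default',
    };
  }
  return {
    contextType: 'default',
    // Members that are absent fall back to the dictionary defaults.
    deviceType: options.deviceType !== undefined ? options.deviceType : 'cpu',
    powerPreference:
      options.powerPreference !== undefined ? options.powerPreference : 'default',
  };
}
```

For example, `createContextState({ powerPreference: 'low-power' })` yields a "default" context on the "cpu" device type with the "low-power" preference.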

7.2.3. The `createContextSync()` method

The createContextSync() method steps are:

If this's relevant global object's associated Document is not allowed to use the webnn feature, throw a "SecurityError" DOMException and abort these steps.
Let options be the first argument.
Let context be the result of running the create context steps given options.
If the validate MLContext steps given context return false, throw a "NotSupportedError" DOMException and abort these steps.
Return context.

7.3. The MLGraph interface

The MLGraph interface represents a compiled computational graph. A compiled graph once constructed is immutable and cannot be subsequently changed.

[SecureContext, Exposed=(Window, DedicatedWorker)]
interface MLGraph {};



MLGraph has the following internal slots:

[[context]] of type MLContext: The context of type MLContext associated with this MLGraph.
[[inputDescriptors]] of type record<DOMString, MLOperandDescriptor>: Maps the name of an input MLOperand to its MLOperandDescriptor for all input MLOperands of this MLGraph.
[[outputDescriptors]] of type record<DOMString, MLOperandDescriptor>: Maps the name of an output MLOperand to its MLOperandDescriptor for all output MLOperands of this MLGraph.
[[implementation]]: The underlying implementation provided by the User Agent.

7.3.1. The MLOperandDescriptor dictionary

enum MLInputOperandLayout {
  "nchw",
  "nhwc"
};
enum MLOperandType {
  "float32",
  "float16",
  "int32",
  "uint32",
  "int8",
  "uint8"
};
dictionary MLOperandDescriptor {
  // The operand type.
  required MLOperandType type;
  // The dimensions field is only required for tensor operands.
  sequence<unsigned long> dimensions;
};

The byte length of an MLOperandDescriptor desc is the value returned by the following steps:

Let elementLength be 1.
For each dimension of desc.dimensions:
1. Set elementLength to elementLength × dimension.
Let elementSize be the element size of one of the ArrayBufferView types that matches desc.type according to this table.
Return elementLength × elementSize.
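The byte length steps translate directly into code. The sketch below is illustrative: the referenced table is not reproduced in this section, so the element sizes are assumed from the bit width implied by each MLOperandType name, and the function name is hypothetical.

```javascript
// Assumed element sizes in bytes, derived from each type's bit width.
const ELEMENT_SIZE = {
  float32: 4,
  float16: 2,
  int32: 4,
  uint32: 4,
  int8: 1,
  uint8: 1,
};

// Computes the byte length of an MLOperandDescriptor-like object.
function byteLength(desc) {
  let elementLength = 1;
  // A descriptor without dimensions describes a single element.
  for (const dimension of desc.dimensions || []) {
    elementLength *= dimension;
  }
  const elementSize = ELEMENT_SIZE[desc.type];
  if (elementSize === undefined) {
    throw new TypeError(`Unknown operand type: ${desc.type}`);
  }
  return elementLength * elementSize;
}
```

For a float32 tensor of shape [1, 3, 224, 224], this gives 1 × 3 × 224 × 224 × 4 = 602112 bytes.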

7.3.2. The MLOperand interface

An MLOperand represents an intermediary graph being constructed as a result of compositing parts of an operation into a fully composed operation.

For instance, an MLOperand may represent a constant feeding to an operation or the result from combining multiple constants together into an operation. See also § 6 Programming Model.

[SecureContext, Exposed=(Window, DedicatedWorker)]
interface MLOperand {};



MLOperand has the following internal slots:

[[builder]] of type MLGraphBuilder: The MLOperand's associated builder object.
[[descriptor]] of type MLOperandDescriptor: The MLOperand's descriptor.
[[name]] of type string: The MLOperand's name (only for input operands).
[[operand]] of type object: Reference to MLOperand's corresponding implementation-defined platform operand object.
[[operator]] of type object: Reference to MLOperand's corresponding implementation-defined platform operator object.

To get the rank of an MLOperand operand, run the following steps:

Return the size of operand.[[descriptor]].dimensions.

Since the [[builder]] object is bound by the MLGraphBuilder() constructor to an MLContext object, an MLOperand is also always bound to the same MLContext object.

7.3.2.1. Creating `MLOperand`

The MLOperand objects are created by the methods of MLGraphBuilder, internally using the following algorithms.

To create MLOperand given builder and desc, run the following steps:

If builder is not an instance of MLGraphBuilder, then throw a "TypeError" DOMException and stop.
If desc is not an object that implements MLOperandDescriptor, then throw a "TypeError" DOMException and stop.
Let operand be a new object.
Set operand. [[builder]] to builder.
Set operand. [[descriptor]] to desc.
Return operand.

To copy MLOperand given operand, run the following steps:

If operand is not an instance of MLOperand, then throw a "TypeError" and stop.
Let result be a new object.
Set result.[[builder]] to operand.[[builder]].
Set result.[[descriptor]] to operand.[[descriptor]].
If operand.[[name]] exists, then set result.[[name]] to operand.[[name]].
Return result.

To check dimensions given dimensions and type, run the following steps:

If dimensions is not an array of positive numbers, return false.
If dimensions.length is 0, return false.
If dimensions.length is too large to be supported by the implementation, return false.
If any element of dimensions is not a positive number, or it is too large to be supported by the implementation given type, return false.
Return true.
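
The check dimensions steps above can be sketched in plain JavaScript. This is only an illustration: the LIMITS object and its values are hypothetical stand-ins for the implementation-defined limits, elements are treated as positive integers, and the same size limit is applied to every operand type.

```javascript
// Hypothetical limits; real limits are implementation-defined.
const LIMITS = { maxRank: 8, maxDimension: 2 ** 31 - 1 };

// Sketch of "check dimensions given dimensions and type".
// `type` is accepted for parity with the algorithm, but this sketch
// applies the same element limit regardless of type.
function checkDimensions(dimensions, type, limits = LIMITS) {
  // The dimensions must be an array.
  if (!Array.isArray(dimensions)) return false;
  // An empty array is rejected.
  if (dimensions.length === 0) return false;
  // The rank must be supported by the implementation.
  if (dimensions.length > limits.maxRank) return false;
  // Every element must be a positive number within the supported range.
  for (const d of dimensions) {
    if (!Number.isInteger(d) || d <= 0 || d > limits.maxDimension) return false;
  }
  return true;
}
```

For example, checkDimensions([3, 4], 'float32') returns true, while checkDimensions([3, 0], 'float32') returns false.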

To validate MLOperand given operand and builder, run the following steps:

If operand.[[builder]] is not an instance of MLGraphBuilder, return false.
If builder is not undefined and is not equal to operand.[[builder]], return false.
Let desc be operand.[[descriptor]].
If desc is not an object that implements MLOperandDescriptor, return false.
If desc.dimensions exists and invoking check dimensions given desc.dimensions and desc.type returns false, then return false.
Return true.

7.3.3. The MLActivation interface

Objects implementing the MLActivation interface represent activation function types.

[SecureContext, Exposed=(Window, DedicatedWorker)]
interface MLActivation {
};



MLActivation has the following internal slots:

[[name]] of type string: The MLActivation 's name.
[[builder]] of type MLGraphBuilder: The graph builder object this MLActivation belongs to.
[[options]] of type object: A dictionary containing MLActivation options.
[[operator]] of type object: Reference to MLActivation 's corresponding implementation-defined platform operator object.

These activation function types are used to create other operations. One such use of this interface is when an activation function is fused into another operation such as § 7.6.8 The conv2d() method or § 7.6.5 The batchNormalization() method during a graph construction session. Such fused activation functions can provide a significant performance improvement when supported natively by the underlying implementation. This is intended as an optimization opportunity for implementers.

7.3.3.1. Creating `MLActivation`

The MLActivation objects (including the ones passed as input to methods) are created by the methods of MLGraphBuilder and are identified by their name. The options dictionary is defined by those methods. The actual creation of the activation function, e.g. a § 7.6.31 The sigmoid() method or § 7.6.28 The relu() method, can then be deferred until the rest of the graph is ready to connect with it, such as during the construction of § 7.6.8 The conv2d() method.

To create MLActivation given builder, name, options and init-steps, run the following steps:

If builder is not an instance of MLGraphBuilder, throw a " TypeError " and abort these steps.
If name is undefined or null, throw a " TypeError " and abort these steps.
Let activation be a new object .
Set activation.[[builder]] to builder.
Set activation.[[name]] to name.
If options is an object , set activation.[[options]] to options.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Make a request to the underlying platform to:
  1. Create an implementation-defined platform operator opImpl for the given name operation.
  2. Store a reference of opImpl in activation.[[operator]].
2. If init-steps are defined, run init-steps with options.
  1. Otherwise, initialize activation.[[operator]] given options in an implementation-defined way for the given name operation.
Return activation.

7.4. The MLContext interface

The MLContext interface represents a global state of neural network compute workload and execution processes. Each MLContext object has associated context type, device type and power preference.

The context type is the type of the execution context that manages the resources and facilitates the compilation and execution of the neural network graph:

" default ": Context created per user preference options.
" webgpu ": Context created from WebGPU device.

The device type indicates the kind of device used for the context. It is one of the following:

" cpu ": Provides the broadest compatibility and usability across all client devices with varying degrees of performance.
" gpu ": Provides the broadest range of achievable performance across graphics hardware platforms from consumer devices to professional workstations.

The power preference indicates preference as related to power consumption. It is one of the following:

" default ": Let the user agent select the most suitable behavior.
" high-performance ": Prioritizes execution speed over power consumption.
" low-power ": Prioritizes power consumption over other considerations such as execution speed.

typedef record<DOMString, ArrayBufferView> MLNamedArrayBufferViews;
[SecureContext, Exposed=(Window, DedicatedWorker)]
interface MLContext {};



MLContext has the following internal slots:

[[contextType]] of type context type: The MLContext 's context type .
[[deviceType]] of type device type: The MLContext 's device type .
[[powerPreference]] of type power preference: The MLContext 's power preference .

When the [[contextType]] is set to default with the MLContextOptions.deviceType set to gpu, the user agent is responsible for creating an internal GPU device that operates within the context and is capable of ML workload submission on behalf of the calling application. In this setting however, only ArrayBufferView inputs and outputs are allowed in and out of the graph execution since the application has no way to know what type of internal GPU device is being created on their behalf. In this case, the user agent is responsible for automatic uploads and downloads of the inputs and outputs to and from the GPU memory using this said internal device.

7.4.1. The `MLContext` validation algorithm

To validate MLContext, given context, run these steps:

If context. [[contextType]] is not " webgpu " or " default ", return false.
If context. [[deviceType]] is not " cpu " or " gpu ", return false.
If context. [[powerPreference]] is not " default " or " high-performance " or " low-power ", return false.
If the user agent cannot support context. [[contextType]], context. [[deviceType]] and context. [[powerPreference]], return false.
Return true.
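
As an illustration, the validation steps can be sketched over a plain object that mirrors the three internal slots. The property names contextType, deviceType and powerPreference and the isSupported callback are assumptions made for this sketch; internal slots are not observable from script.

```javascript
const CONTEXT_TYPES = ['default', 'webgpu'];
const DEVICE_TYPES = ['cpu', 'gpu'];
const POWER_PREFERENCES = ['default', 'high-performance', 'low-power'];

// Sketch of "validate MLContext". `context` is a plain object standing in
// for the internal slots; `isSupported` is a hypothetical hook for the
// user agent's own capability check.
function validateContext(context, isSupported = () => true) {
  if (!CONTEXT_TYPES.includes(context.contextType)) return false;
  if (!DEVICE_TYPES.includes(context.deviceType)) return false;
  if (!POWER_PREFERENCES.includes(context.powerPreference)) return false;
  // The user agent must support the requested combination.
  if (!isSupported(context)) return false;
  return true;
}
```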

7.4.2. Synchronous Execution

Synchronously carries out the computational workload of a compiled graph MLGraph on the calling thread, which must be a worker thread, to produce results as defined by the operations in the graph. This method of execution requires an MLContext created with MLContextOptions. Otherwise, it throws an " OperationError " DOMException.

partial interface MLContext {
  [Exposed=(DedicatedWorker)]
  undefined computeSync(
      MLGraph graph, MLNamedArrayBufferViews inputs, MLNamedArrayBufferViews outputs);
};

Arguments:

graph : an MLGraph. The compiled graph to be executed.
inputs : an MLNamedArrayBufferViews. The resources of inputs.
outputs : an MLNamedArrayBufferViews. The pre-allocated resources of required outputs.

Returns: undefined.

The computeSync(graph, inputs, outputs) method steps are:

If graph. [[context]]. [[contextType]] is not " default ", throw an " OperationError " DOMException and stop.
If invoking the validate graph resources algorithm given inputs and graph. [[inputDescriptors]] returns false, then throw a " DataError " DOMException and stop.
If invoking the validate graph resources algorithm given outputs and graph. [[outputDescriptors]] returns false, then throw a " DataError " DOMException and stop.
Invoke execute graph given graph, inputs and outputs.
If that throws an error, re-throw the error and stop.
Return undefined.

To validate graph resources , given resources and descriptors, run the following steps:

Assert : the type of resources is MLNamedArrayBufferViews.
For each record < key, value > of resources:
1. If descriptors [ key ] does not exist , return false.
2. Assert : the type of value is ArrayBufferView.
3. If running the validate buffer with descriptor given value and descriptors [ key ] return false, return false.
Return true.

To validate buffer with descriptor given bufferView and descriptor, run the following steps:

If bufferView is not an MLBufferView, return false.
If bufferView ’s element type does not match descriptor. type according to this table , return false.
If bufferView.[[ByteLength]] is not equal to the byte length of descriptor, return false.
Return true.
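
The two validation algorithms above can be sketched as follows. The sketch handles ArrayBufferView inputs only (not MLBufferResourceView), the VIEW_FOR_TYPE mapping is an assumed subset of the type table referenced above, and byteLengthOf assumes that the byte length of a descriptor is the product of its dimensions times the element size.

```javascript
// Assumed subset of the operand-type to view-type table.
const VIEW_FOR_TYPE = {
  float32: Float32Array,
  int32: Int32Array,
  uint32: Uint32Array,
  int8: Int8Array,
  uint8: Uint8Array,
};

// Assumption: byte length = number of elements * bytes per element.
// A descriptor without dimensions is treated as a scalar (one element).
function byteLengthOf(descriptor) {
  const elements = (descriptor.dimensions ?? []).reduce((a, b) => a * b, 1);
  return elements * VIEW_FOR_TYPE[descriptor.type].BYTES_PER_ELEMENT;
}

// Sketch of "validate buffer with descriptor".
function validateBufferWithDescriptor(bufferView, descriptor) {
  if (!ArrayBuffer.isView(bufferView)) return false;
  const ctor = VIEW_FOR_TYPE[descriptor.type];
  if (ctor === undefined || !(bufferView instanceof ctor)) return false;
  return bufferView.byteLength === byteLengthOf(descriptor);
}

// Sketch of "validate graph resources".
function validateGraphResources(resources, descriptors) {
  for (const [key, value] of Object.entries(resources)) {
    if (!Object.prototype.hasOwnProperty.call(descriptors, key)) return false;
    if (!validateBufferWithDescriptor(value, descriptors[key])) return false;
  }
  return true;
}
```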

To execute graph , given graph, inputs and outputs, run the following steps:

Assert : the type of inputs is MLNamedArrayBufferViews.
Let inputResources denote the input resources of graph. [[implementation]].
For each < key, inputValue > of inputs:
1. Let inputDescriptor be graph. [[inputDescriptors]] [ key ].
2. Let inputTensor be a new tensor for graph. [[implementation]] as follows:
  1. Set the data type of inputTensor to the one that matches the element type of inputValue.
  2. Set the dimensions of inputTensor to inputDescriptor. dimensions.
  3. Set the values of elements in inputTensor to the values of elements in inputValue.
3. Request the underlying implementation of graph to bind inputResources [ key ] to inputTensor.
Assert : the type of outputs is MLNamedArrayBufferViews.
For each < key, outputValue > of outputs:
1. Issue a compute request to graph. [[implementation]] given key and inputResources and wait for completion.
  1. If that returns an error, then throw an " OperationError " DOMException and stop.
  2. Otherwise, store the result in outputTensor.
2. Let outputDesc be graph. [[outputDescriptors]] [ key ].
3. If the byte length of outputTensor is not equal to the byte length of outputDesc, then throw a " DataError " DOMException and stop.
4. If the element type of outputTensor doesn’t match the element type of outputValue, then throw a " DataError " DOMException and stop.
5. Request the underlying implementation of graph to set the values of elements in outputValue to the values of elements in outputTensor.
Return undefined.

7.4.2.1. Examples

The following code showcases the synchronous computation with optional outputs in a worker.

const context = navigator.ml.createContextSync();
// Build a graph with two outputs.
const builder = new MLGraphBuilder(context);
const descA = {type: 'float32', dimensions: [3, 4]};
const a = builder.input('a', descA);
const descB = {type: 'float32', dimensions: [4, 3]};
const bufferB = new Float32Array(sizeOfShape(descB.dimensions)).fill(0.5);
const b = builder.constant(descB, bufferB);
const descC = {type: 'float32', dimensions: [3, 3]};
const bufferC = new Float32Array(sizeOfShape(descC.dimensions)).fill(1);
const c = builder.constant(descC, bufferC);
const d = builder.matmul(a, b);
const e = builder.add(d, c);
const graph = builder.buildSync({'d': d, 'e': e});
const bufferA = new Float32Array(sizeOfShape(descA.dimensions)).fill(0.5);
const inputs = {'a': bufferA};
// Compute d.
const bufferD = new Float32Array(sizeOfShape([3, 3]));
context.computeSync(graph, inputs, {'d': bufferD});
console.log(`values: ${bufferD}`);
// Compute e.
const bufferE = new Float32Array(sizeOfShape([3, 3]));
context.computeSync(graph, inputs, {'e': bufferE});
console.log(`values: ${bufferE}`);
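
The example above relies on a sizeOfShape helper that is not defined in this section. A minimal sketch, assuming it returns the number of elements of a tensor with the given shape:

```javascript
// Assumed helper: the element count is the product of all dimensions.
// An empty shape (a scalar) yields 1.
function sizeOfShape(shape) {
  return shape.reduce((a, b) => a * b, 1);
}
```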

7.4.3. The `MLNamedArrayBufferViews` transfer algorithm

To transfer an MLNamedArrayBufferViews views:

Let transferredViews be a new MLNamedArrayBufferViews.
For each key -> value of views:
1. Let transferredBuffer be the result of transferring the underlying buffer of value.
2. Let constructor be the appropriate view constructor for the type of ArrayBufferView value.
3. Let elementsNumber be the result of the byte length of value ÷ element size of value.
4. Let transferredView be Construct ( constructor, transferredBuffer, value.[[ByteOffset]], elementsNumber ).
5. Set transferredViews [ key ] to transferredView.
Return transferredViews.
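
A script-level approximation of the transfer steps is sketched below. It assumes typed array views (DataView is not handled), that no two views share the same underlying buffer, and it uses structuredClone with a transfer list as the transfer primitive. Unlike the algorithm above, it reads the view metadata before transferring, because script observes a zero length and offset once the buffer is detached.

```javascript
// Sketch of transferring a record of typed array views.
function transferViews(views) {
  const transferredViews = {};
  for (const [key, value] of Object.entries(views)) {
    // Capture metadata first: after the transfer the original view is detached.
    const Ctor = value.constructor;
    const byteOffset = value.byteOffset;
    const elementsNumber = value.byteLength / value.BYTES_PER_ELEMENT;
    // Transfer the underlying buffer; the original buffer becomes detached.
    const transferredBuffer = structuredClone(value.buffer, { transfer: [value.buffer] });
    transferredViews[key] = new Ctor(transferredBuffer, byteOffset, elementsNumber);
  }
  return transferredViews;
}
```

After the call, the original views report a byte length of 0, which is what prevents the caller from modifying the data while a computation is ongoing.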

7.4.4. Asynchronous Execution

Asynchronously carries out the computational workload of a compiled graph MLGraph on a separate timeline, either on a worker thread for the CPU execution, or on a GPU timeline for the submission of GPU workload on the command queue. The asynchronous nature of this call avoids blocking the calling thread while the computation for result is ongoing. This method of execution requires an MLContext created with MLContextOptions. Otherwise, it throws an " OperationError " DOMException.

In accordance with the Web IDL warning , to prevent the calling thread from modifying the input and output resources while the computation is ongoing, this method transfers the input and output MLNamedArrayBufferViews to new views that share the same backing memory allocations. The transferred views are returned to the caller via the promise fulfillment with the computation result written into the backing memory of the output views.

dictionary MLComputeResult {
  MLNamedArrayBufferViews inputs;
  MLNamedArrayBufferViews outputs;
};
partial interface MLContext {
  Promise<MLComputeResult> compute(
      MLGraph graph, MLNamedArrayBufferViews inputs, MLNamedArrayBufferViews outputs);
};

Arguments:

graph : an MLGraph. The compiled graph to be executed.
inputs : an MLNamedArrayBufferViews. The resources of inputs. Will be transferred if there are no validation errors.
outputs : an MLNamedArrayBufferViews. The pre-allocated resources of required outputs. Will be transferred if there are no validation errors.

Returns: Promise< MLComputeResult >.

The compute(graph, inputs, outputs) method steps are:

Let promise be a new promise .
Return promise and run the following steps in parallel :
1. If graph. [[context]]. [[contextType]] is not " default ", reject promise with an " OperationError " DOMException and stop.
2. If invoking the validate graph resources algorithm given inputs and graph. [[inputDescriptors]] returns false, then reject promise with a " DataError " DOMException and stop.
3. If invoking the validate graph resources algorithm given outputs and graph. [[outputDescriptors]] returns false, then reject promise with a " DataError " DOMException and stop.
4. Let transferredInputs be the result of transferring MLNamedArrayBufferViews inputs.
5. Let transferredOutputs be the result of transferring MLNamedArrayBufferViews outputs.
6. Invoke execute graph given graph, transferredInputs and transferredOutputs.
7. If that throws an error, reject promise with the error and stop.
8. Otherwise, when execute graph has completed:
  1. Let result be a new MLComputeResult.
  2. Set result. inputs to transferredInputs.
  3. Set result. outputs to transferredOutputs.
  4. Resolve promise with result and stop.

7.4.4.1. Examples

The following code showcases the asynchronous computation.

const operandType = {type: 'float32', dimensions: [2, 2]};
const context = await navigator.ml.createContext();
const builder = new MLGraphBuilder(context);
// 1. Create a computational graph 'C = 0.2 * A + B'.
const constant = builder.constant(0.2);
const A = builder.input('A', operandType);
const B = builder.input('B', operandType);
const C = builder.add(builder.mul(A, constant), B);
// 2. Compile it into an executable.
const graph = await builder.build({'C': C});
// 3. Bind inputs to the graph and execute for the result.
const bufferA = new Float32Array(4).fill(1.0);
const bufferB = new Float32Array(4).fill(0.8);
const bufferC = new Float32Array(4);
const inputs = {'A': bufferA, 'B': bufferB};
const outputs = {'C': bufferC};
const result = await context.compute(graph, inputs, outputs);
// The computed result of [[1, 1], [1, 1]] is in the buffer associated with
// the output operand.
console.log('Output value: ' + result.outputs.C);
// Note: the result.outputs.C buffer is different from the bufferC, but it
// shares the same backing memory allocation.

7.4.5. WebGPU Interoperability

Create an MLCommandEncoder interface used to record the ML workload onto a WebGPU-compatible GPUCommandBuffer to allow mixing of ML workload with other GPU workload in an application that leverages WebGPU. This method only succeeds on an MLContext created with GPUDevice. Otherwise, it throws an " OperationError " DOMException.

partial interface MLContext {
  MLCommandEncoder createCommandEncoder();
};

Returns: MLCommandEncoder. The command encoder used to record ML workload on the GPU.

7.5. The MLCommandEncoder interface

The MLCommandEncoder interface represents a method of execution that synchronously records the computational workload of a compiled MLGraph to a GPUCommandBuffer on the calling thread. Since the workload is not immediately executed, just recorded, this method allows more flexibility for the caller to determine how and when the recorded commands will be submitted for execution on the GPU relative to other GPU workload on the same or different queue.

typedef (GPUBuffer or GPUTexture) MLGPUResource;
typedef record<DOMString, MLGPUResource> MLNamedGPUResources;
[SecureContext, Exposed=(Window, DedicatedWorker)]
interface MLCommandEncoder {};



MLCommandEncoder has the following internal slots:

[[context]] of type MLContext: The context of type MLContext associated with this MLCommandEncoder.
[[implementation]]: The underlying implementation provided by the User Agent.

7.5.1. Graph Initialization

Record the initialization of the MLGraph. This is a necessary step for optimal performance during graph execution as it gives the platform an opportunity to prepare and optimize constant input data for the subsequent execution of the graph. This method should only be called once per graph.

partial interface MLCommandEncoder {
  undefined initializeGraph(MLGraph graph);
};

Arguments:

graph : an MLGraph. The compiled graph to be initialized with graph constant inputs.

Returns: undefined.

The initializeGraph(graph) steps are:

The graph initialization stage typically involves a process known as "weight preprocessing" where all the constant inputs to the graph are preprocessed and cached at the operating system level for subsequent graph execution calls. The initializing inputs are typically the constant weight data specified through the MLGraphBuilder.constant() method as constant operands during graph construction time.

7.5.2. Dispatch Execution Commands

Record the MLGraph execution with the inputs MLNamedGPUResources and outputs MLNamedGPUResources.

partial interface MLCommandEncoder {
  undefined dispatch(MLGraph graph, MLNamedGPUResources inputs, MLNamedGPUResources outputs);
};

Arguments:

graph : an MLGraph. The compiled graph to be executed.
inputs : an MLNamedGPUResources. The resources of inputs.
outputs : an MLNamedGPUResources. The pre-allocated resources of required outputs.

Returns: undefined.

The dispatch(graph, inputs, outputs) steps are:

If any of the following requirements are unmet, then throw a " DataError " DOMException and stop.
1. For each key -> value of inputs:
  1. graph. [[inputDescriptors]] [ key ] must exist .
  2. Let inputDesc be graph. [[inputDescriptors]] [ key ].
  3. If value is a GPUBuffer, then:
    1. value. size must equal the byte length of inputDesc.
2. For each key -> value of outputs:
  1. graph. [[outputDescriptors]] [ key ] must exist .
  2. Let outputDesc be graph. [[outputDescriptors]] [ key ].
  3. If value is a GPUBuffer, then:
    1. value. size must equal the byte length of outputDesc.
For each key -> value of inputs:
1. Set the input of graph. [[implementation]] that is associated with key to value.
For each key -> value of outputs:
1. Set the output of graph. [[implementation]] that is associated with key to value.
Issue a compute request of graph. [[implementation]].
If there is an error returned by graph. [[implementation]], then:
1. Throw an " OperationError " DOMException and stop.
Return undefined.

7.5.3. Generate GPU Command Buffer

Complete the recording of ML workload and return a WebGPU-compatible GPUCommandBuffer containing the recorded workload.

partial interface MLCommandEncoder {
  GPUCommandBuffer finish(optional GPUCommandBufferDescriptor descriptor = {});
};

Arguments:

descriptor : an optional GPUCommandBufferDescriptor. Descriptor of the command buffer.

Returns: GPUCommandBuffer.

The finish(descriptor) method steps are:

If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Make a request to the underlying platform to complete the recording of the ML workload, given descriptor.
  See the related WebGPU steps .
Return a GPUCommandBuffer containing the recorded workload.
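
Taken together, the MLCommandEncoder methods described in § 7.5 suggest the following recording flow. This is a sketch rather than a normative example: recordGraph and its initialize option are hypothetical names, context is assumed to be an MLContext created from the given GPUDevice, and inputs and outputs are assumed to be MLNamedGPUResources matching the graph's descriptors.

```javascript
// Record one graph execution and submit it to the device queue.
// Pass {initialize: false} on later calls, since initializeGraph()
// should only be called once per graph.
function recordGraph(context, device, graph, inputs, outputs, { initialize = true } = {}) {
  const encoder = context.createCommandEncoder();
  if (initialize) {
    // Gives the platform a chance to preprocess constant inputs.
    encoder.initializeGraph(graph);
  }
  // Nothing executes here; the workload is only recorded.
  encoder.dispatch(graph, inputs, outputs);
  const commandBuffer = encoder.finish();
  // The caller decides when the recorded work is submitted; this sketch
  // submits immediately on the device's default queue.
  device.queue.submit([commandBuffer]);
  return commandBuffer;
}
```

An application that mixes ML and rendering work could instead collect the returned command buffer and submit it together with its other GPU command buffers.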

7.6. The MLGraphBuilder interface

The MLGraphBuilder interface defines a set of operations as identified by the § 2 Use cases that can be composed into a computational graph. It also represents the intermediate state of a graph building session.


typedef record<DOMString, MLOperand> MLNamedOperands;
dictionary MLBufferResourceView {
  required GPUBuffer resource;
  unsigned long long offset = 0;
  unsigned long long size;
};
typedef (ArrayBufferView or MLBufferResourceView) MLBufferView;
[SecureContext, Exposed=(Window, DedicatedWorker)]
interface MLGraphBuilder {
  // Construct the graph builder from the context.
  constructor(MLContext context);
  // Create an operand for a graph input.
  MLOperand input(DOMString name, MLOperandDescriptor descriptor);
  // Create an operand for a graph constant.
  MLOperand constant(MLOperandDescriptor descriptor, MLBufferView bufferView);
  // Create a single-value operand from the specified number of the specified type.
  MLOperand constant(double value, optional MLOperandType type = "float32");
  // Compile the graph up to the specified output operands asynchronously.
  Promise<MLGraph> build(MLNamedOperands outputs);
  // Compile the graph up to the specified output operands synchronously.
  [Exposed=(DedicatedWorker)]
  MLGraph buildSync(MLNamedOperands outputs);
};

Both MLGraphBuilder.build() and MLGraphBuilder.buildSync() methods compile the graph builder state up to the specified output operands into a compiled graph according to the type of MLContext that creates it. Since this operation can be costly in some machine configurations, the calling thread of the MLGraphBuilder.buildSync() method must only be a worker thread to avoid potential disruption of the user experience. When the [[contextType]] of the MLContext is set to default , the compiled graph is initialized right before the MLGraph is returned. This graph initialization stage is important for optimal performance of the subsequent graph executions. See § 7.5.1 Graph Initialization for more detail.

MLBufferResourceView has the following members:

resource, of type GPUBuffer: A GPUBuffer object. Specifies the GPU buffer source.
offset, of type unsigned long long , defaulting to 0: Specifies an unsigned long long offset in the buffer source.
size, of type unsigned long long: Specifies the unsigned long long size of the buffer view.


MLGraphBuilder has the following internal slots:

[[context]] of type MLContext: The context of type MLContext associated with this MLGraphBuilder.

7.6.1. The `MLGraphBuilder` constructor

The new MLGraphBuilder constructor steps are:

If this 's relevant global object 's associated Document is not allowed to use the webnn feature, throw a " SecurityError " DOMException and abort these steps.
Let context be the first argument.
If the validate MLContext steps given context return false, throw a " TypeError " and abort these steps.
Set [[context]] to context.

7.6.2. The `input()` method

Create a named MLOperand based on a descriptor, that can be used as an input.

Arguments:

name : a string name of the input.
descriptor : an MLOperandDescriptor object.

Returns: an MLOperand object.

The input(name, descriptor) steps are:

The permissions and context validity have been checked by § 7.6.1 The MLGraphBuilder constructor steps.

Let name be the first argument.
1. If name is undefined or an empty string , then throw a " TypeError " DOMException and stop.
Let descriptor be the second argument.
1. If descriptor is not an object that implements MLOperandDescriptor, then throw a " TypeError " DOMException and stop.
2. Assert : If descriptor. dimensions does not exist , then descriptor defines a scalar input.
3. If descriptor. dimensions exists :
  1. If the check dimensions steps given descriptor. dimensions and descriptor. type return false, throw a " DataError " DOMException and stop.
  2. If the byte length of descriptor is not supported by the underlying platform, then throw a " DataError " DOMException and stop.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Let operand be the result of invoking the create MLOperand steps with this and descriptor.
2. Set operand. [[name]] to name.
3. Make a request to the underlying platform to:
  1. Create an implementation-defined platform input operand operandImpl given descriptor.
  2. Store a reference of operandImpl in operand. [[operand]].
  3. Register operand as an input.
Return operand.

7.6.3. The build() method

Build a composed graph up to a given output operand into a computational graph, asynchronously or synchronously.

7.6.3.1. The `build(outputs)` method

The build(outputs) steps are:

The permissions and context validity have been checked by § 7.6.1 The MLGraphBuilder constructor steps.

Let promise be a new promise .
Return promise and run the following steps in parallel .
Return the result of invoking buildSync(outputs) given outputs.
1. If that throws, re-throw the error and stop.

7.6.3.2. The `buildSync(outputs)` method

The buildSync(outputs) steps are:

The permissions and context validity have been checked by § 7.6.1 The MLGraphBuilder constructor steps.

If outputs is not an instance of MLNamedOperands, then throw a " TypeError " DOMException and stop.
For each element in outputs:
1. If element.key is not a string , then throw a " TypeError " DOMException and stop.
2. If element.value is not an instance of MLOperand, then throw a " TypeError " DOMException and stop.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Let graph be a new MLGraph:
  1. Set graph.[[context]] to this .[[context]].
  2. Set graph.[[outputDescriptors]] to outputs.
2. Make a request to the underlying platform to:
  1. Connect graph to a new implementation-defined graph implementation graphImpl given graph.
  2. Store a reference to graphImpl in graph.[[implementation]].
3. Make a request to the underlying platform to initialize the graph:
  1. For each operand in outputs:
    1. If operand was created as an input by the underlying platform:
      1. Add operand to graph.[[inputDescriptors]].
      2. Initialize the weights of operand.
    2. If operand was created as a constant by the underlying platform:
      1. Preprocess and optimize the tensor data of operand.
    3. Update graphImpl with operand.[[operand]].
    4. Update graphImpl with operand.[[operator]].
Return graph.

7.6.4. The constant() method

Create a constant MLOperand that can be used in MLGraphBuilder methods.

7.6.4.1. The `constant(descriptor, bufferView)` method

Arguments:

descriptor : an MLOperandDescriptor object
bufferView : an MLBufferView

Returns: an MLOperand object.

The constant(descriptor, bufferView) steps are:

The permissions and context validity have been checked by § 7.6.1 The MLGraphBuilder constructor steps.

Let descriptor be the first argument.
If descriptor is not an object that implements MLOperandDescriptor, then throw a " TypeError " DOMException and stop.
1. If the byte length of descriptor is not supported by the underlying platform, then throw a " DataError " DOMException and stop.
2. If the check dimensions steps given descriptor. dimensions and descriptor. type return false, throw a " DataError " DOMException and stop.
Let bufferView be the second argument.
1. If invoking validate buffer with descriptor given bufferView and descriptor return false, then throw a " TypeError " DOMException and stop.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Let operand be the result of invoking the create MLOperand steps with this and descriptor.
2. Let bytes be the result of invoking get a copy of the bytes held by the buffer source given bufferView.
3. Make a request to the underlying platform to:
  1. Create an implementation-defined platform operand constantImpl to represent a constant, given descriptor.
  2. Store a reference of constantImpl in operand. [[operand]].
  3. Register operand as a tensor constant with bytes as value.
Return operand.

7.6.4.2. The `constant(value, type)` method

Arguments:

value : a number
type : an optional MLOperandType, by default "float32" .

Returns: an MLOperand object.

The constant(value, type) steps are:

The permissions and context validity have been checked by § 7.6.1 The MLGraphBuilder constructor steps.

If value is not a number , then throw a " TypeError " DOMException and stop.
If type is undefined, let type be "float32".
Otherwise, if type is not one of MLOperandType, then throw a " TypeError " DOMException and stop.
Let descriptor be a new MLOperandDescriptor.
1. Set descriptor. type to type.
2. Set descriptor. dimensions to undefined.
  
  In the case of a scalar constant, descriptor. dimensions is ignored.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Let operand be the result of invoking the create MLOperand steps with this and descriptor.
2. Make a request to the underlying platform to:
  1. Create an implementation-defined platform operand constantImpl to represent a constant, given descriptor.
  2. Store a reference of constantImpl in operand. [[operand]].
  3. Register operand as a scalar constant with value as value.
Return operand.

7.6.5. The batchNormalization() method

Normalize the tensor values of input features across the batch dimension using [Batch-Normalization] . For each input feature, the mean and variance values of that feature supplied in this calculation as parameters are previously computed across the batch dimension of the input during the model training phase of this operation.

dictionary MLBatchNormalizationOptions {
  MLOperand scale;
  MLOperand bias;
  unsigned long axis = 1;
  float epsilon = 1e-5;
  MLActivation activation;
};
partial interface MLGraphBuilder {
  MLOperand batchNormalization(MLOperand input, MLOperand mean, MLOperand variance,
                             optional MLBatchNormalizationOptions options = {});
};

MLBatchNormalizationOptions has the following members:

scale, of type MLOperand: An MLOperand. Specifies the 1-D tensor of the scaling values whose length is equal to the size of the input dimension denoted by axis.
bias, of type MLOperand: An MLOperand. Specifies the 1-D tensor of the bias values whose length is equal to the size of the input dimension denoted by axis.
axis, of type unsigned long, defaulting to 1: An unsigned long scalar. Specifies the index to the feature count dimension of the input shape for which the mean and variance values are supplied. Its value must be in the range [0, N-1] where N is the rank of the input tensor. The default value is 1, corresponding to the channel ("c") dimension in the "nchw" data layout.
epsilon, of type float, defaulting to 1e-5: A float scalar. Specifies a small value to prevent computational error due to divide-by-zero.
activation, of type MLActivation: An MLActivation object. Specifies the optional activation function that immediately follows the normalization operation.

Arguments:

input : an MLOperand. The input N-D tensor.
mean : an MLOperand. Specifies the 1-D tensor of the mean values of the input features across the batch whose length is equal to the size of the input dimension denoted by axis.
variance : an MLOperand. The 1-D tensor of the variance values of the input features across the batch whose length is equal to the size of the input dimension denoted by axis.
options : an optional MLBatchNormalizationOptions. Specifies the optional parameters of the operation.

Returns: an MLOperand. The batch-normalized N-D tensor of the same shape as the input tensor.

The batchNormalization() method steps are:

Let input be the first argument. To validate input, run these substeps:
1. If input is not an object that implements MLOperand, then throw a "TypeError" DOMException and abort these steps.
Let mean be the second argument, representing a vector with the moving mean values for input. To validate mean, run the following substeps:
1. If mean is not an object that implements MLOperand, then throw a "TypeError" DOMException and abort these steps.
2. If mean.[[descriptor]].dimensions is not equal to input.[[descriptor]].dimensions from which the dimension represented by options.axis is removed, then throw a "TypeError" DOMException and abort these steps.
Let variance be the third argument, representing the moving variance values of input.
Let options be the fourth argument. To validate options, run these substeps:
1. If options.axis does not exist, let options.axis be 1.
2. If options.axis is not a number between 0 and the rank of input, then throw a "TypeError" DOMException and abort these steps.
3. If input is a 4-D tensor of the "nchw" layout, set options.axis to 1.
4. If input is a 4-D tensor of the "nhwc" layout, set options.axis to 3.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
1. Let output be the result of invoking the create MLOperand steps with this and descriptor, that may use the same underlying data as input.
2. Make a request to the underlying platform to initialize the batch normalization:
  1. Create an implementation-defined platform operator batchNormImpl for this method, given input, mean, variance and options.
  2. If options.activation exists, register it as activation to batchNormImpl.
  3. Connect output as output to batchNormImpl.
Return output.

The behavior of this operation when the input tensor is 4-D of the "nchw" layout and the activation is of operator type relu can be generically emulated using other operations as follows. However, user agents typically have a more efficient implementation for it, so its usage is encouraged from the performance standpoint.

const shape = [1,null,1,1];
return builder.relu(
  builder.add(
    builder.mul(
      builder.reshape(options.scale, shape),
      builder.div(
        builder.sub(input, builder.reshape(mean, shape)),
        builder.pow(
          builder.add(builder.reshape(variance, shape), builder.constant(options.epsilon)),
          builder.constant(0.5))
        )),
    builder.reshape(options.bias, shape)));
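For intuition, the same arithmetic can be sketched on plain JavaScript arrays. This is an illustrative sketch only and not part of the API: the helper name batchNormNchw and the flat-array representation are assumptions, and it covers only the "nchw" case with axis equal to 1, followed by relu.

```javascript
// Illustrative only: reference arithmetic for batch normalization on a flat
// "nchw" array with axis = 1, followed by relu. batchNormNchw is a
// hypothetical helper, not a WebNN API.
function batchNormNchw(input, shape, mean, variance,
                       {scale, bias, epsilon = 1e-5} = {}) {
  const [, channels, height, width] = shape;
  const output = new Float32Array(input.length);
  for (let i = 0; i < input.length; ++i) {
    // Recover the channel index of element i in "nchw" order.
    const channel = Math.floor(i / (height * width)) % channels;
    let value = (input[i] - mean[channel]) /
        Math.sqrt(variance[channel] + epsilon);
    if (scale) value *= scale[channel];
    if (bias) value += bias[channel];
    output[i] = Math.max(0, value);  // relu
  }
  return output;
}

// One batch, two channels, 1x2 spatial size.
const result = batchNormNchw(
    [1, 3, 10, 30], [1, 2, 1, 2], [2, 20], [1, 100],
    {scale: [1, 1], bias: [0, 0], epsilon: 0});
console.log(Array.from(result));
```

Each element is normalized with the mean and variance of its own channel, which is what the reshape to [1, null, 1, 1] achieves through broadcasting in the emulation above.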

7.6.6. The clamp() method

Clamp the input tensor element-wise within a range specified by the minimum and maximum values.

dictionary MLClampOptions {
  float minValue;
  float maxValue;
};

partial interface MLGraphBuilder {
  MLOperand clamp(MLOperand operand, optional MLClampOptions options = {});
  MLActivation clamp(optional MLClampOptions options = {});
};

The behavior of this operation can be generically emulated using other operations as follows. However, user agents typically have a more efficient implementation for it, so its usage is encouraged from the performance standpoint.

if (options.minValue === undefined) {
  if (options.maxValue === undefined) {
    // No bounds given: the input passes through unchanged.
    return operand;
  } else {
    return builder.min(operand, builder.constant(options.maxValue));
  }
} else {
  if (options.maxValue === undefined) {
    return builder.max(operand, builder.constant(options.minValue));
  } else {
    return builder.min(
        builder.max(operand, builder.constant(options.minValue)),
        builder.constant(options.maxValue));
  }
}

To check clamp options given options, run the following steps:

If options is not an object that implements MLClampOptions, then return false.
If options.minValue and options.maxValue are not a numeric type, then return false.
If options.minValue is greater than options.maxValue, then return false.
Return true.
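The check above can be sketched in plain JavaScript as follows. This is a minimal sketch under stated assumptions: checkClampOptions is a hypothetical helper name, a plain object stands in for MLClampOptions, and an omitted member is treated as acceptable while a present member must be a number.

```javascript
// Illustrative sketch of the "check clamp options" steps; checkClampOptions
// is a hypothetical helper and a plain object stands in for MLClampOptions.
function checkClampOptions(options) {
  if (options === null || typeof options !== 'object') return false;
  const {minValue, maxValue} = options;
  // Assumption: an omitted member is fine; a present one must be a number.
  if (minValue !== undefined && typeof minValue !== 'number') return false;
  if (maxValue !== undefined && typeof maxValue !== 'number') return false;
  // The range is invalid when both bounds exist and are inverted.
  if (minValue !== undefined && maxValue !== undefined &&
      minValue > maxValue) {
    return false;
  }
  return true;
}

console.log(checkClampOptions({minValue: 0, maxValue: 6}));  // true
console.log(checkClampOptions({minValue: 6, maxValue: 0}));  // false
```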

7.6.6.1. The `clamp(operand, options)` method

Arguments:

operand : an MLOperand. The input tensor.
options : an optional MLClampOptions. The optional parameters of the operation.
- minValue : a float scalar. Specifies the minimum value of the range. When it is not specified, the clamping is not performed on the lower limit of the range.
- maxValue : a float scalar. Specifies the maximum value of the range. When it is not specified, the clamping is not performed on the upper limit of the range.

Returns:

an MLOperand. The output tensor of the same shape as operand.

The clamp(operand, options) method steps are:

Let operand be the first argument.
Let options be the second argument.
1. If running the check clamp options steps with options returns false, then throw a "TypeError" DOMException and abort these steps.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
1. Let output be the result of invoking the copy MLOperand steps given operand.
2. Make a request to the underlying platform to:
  1. Create an implementation-defined platform operator clampImpl for this method, given options.minValue and options.maxValue.
  2. Store a reference of clampImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent clamp output, given output and clampImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect operand.[[operand]] as input to clampImpl.
4. Connect output.[[operand]] as output to clampImpl.
Return output.

7.6.6.2. The `clamp(options)` method

Arguments:

options : an optional MLClampOptions. The optional parameters of the operation.
- minValue : a float scalar. Specifies the minimum value of the range. When it is not specified, the clamping is not performed on the lower limit of the range.
- maxValue : a float scalar. Specifies the maximum value of the range. When it is not specified, the clamping is not performed on the upper limit of the range.

Returns:

an MLActivation. The operator representing the clamp operation.

The clamp(options) method steps are:

Let options be the first argument.
1. If running the check clamp options steps with options returns false, then throw a "TypeError" DOMException and abort these steps.
Let op be the result of invoking the create MLActivation steps with "clamp" and options.
1. If that throws an error, re-throw the error and abort these steps.
Return op.

7.6.7. The concat() method

Concatenates the input tensors along a given axis.

partial interface MLGraphBuilder {
  MLOperand concat(sequence<MLOperand> inputs, unsigned long axis);
};

Arguments:

inputs : a sequence of MLOperand. All input tensors must have the same shape, except for the size of the dimension to concatenate on.
axis : an unsigned long scalar. The axis that the inputs concatenate along. Its value must be in the range [0, N-1] where N is the rank of input tensors.

Returns: an MLOperand. The concatenated tensor of all the inputs along the axis. The output tensor has the same shape as the inputs except on the dimension along which the inputs are concatenated. The size of that dimension is the sum of the sizes of all the inputs along that dimension.

The concat(inputs, axis) steps are:

The permissions and context validity have been checked by § 7.6.1 The MLGraphBuilder constructor steps.

Let inputs be the first argument.
1. Assert: the type of inputs is sequence of MLOperand objects.
2. Assert: the type of axis is unsigned long.
3. Assert: the shape (i.e. dimensions) of each operand in inputs is the same, except on the dimension given by axis on which they are concatenated.
4. Assert: the type of each operand in inputs is the same.
5. If any of the following steps fail, then throw a "DataError" DOMException and stop.
  1. If inputs is not an array of objects, fail.
  2. If axis is not a non-negative integer, fail.
  3. If axis is greater than or equal to the rank of inputs, fail.
  4. Let desc be inputs[0].[[descriptor]].
  5. Let desc.dimensions[axis] be 0.
  6. For each index between 0 and the rank of inputs:
    1. If running validate MLOperand given inputs[index] and this returns false, then fail.
    2. For each dim between 0 and the rank of inputs[index]:
      If the shape of each corresponding dimension and type of the operands, except for those of the dimension given by axis, is not the same, fail.
      1. If dim is not equal to axis and if inputs[index].dimensions[dim] is not equal to inputs[0].dimensions[dim], fail.
      2. If inputs[index].type is not equal to inputs[0].type, fail.
      3. If dim is equal to axis, add to desc.dimensions[axis] the value of inputs[index].dimensions[dim].
6. If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
  1. Let output be the result of invoking the create MLOperand steps given this and desc.
  2. Make a request to the underlying platform to:
    1. Create an implementation-defined platform operator concatImpl for this method, given inputs and axis.
    2. Store a reference of concatImpl in output.[[operator]].
    3. Create an implementation-defined platform operand outputImpl to represent output, given output and concatImpl.
    4. Store a reference to outputImpl in output.[[operand]].
  3. Connect inputs as input to concatImpl.
  4. Connect output.[[operand]] as output to concatImpl.
7. Return output.
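The shape validation and output shape computation in the steps above can be sketched as follows. This is a minimal sketch, not the normative algorithm: concatOutputShape is a hypothetical helper, shapes are plain arrays of numbers, and a plain Error stands in for the "DataError" DOMException.

```javascript
// Illustrative sketch: compute the output shape of concat from input shapes.
// concatOutputShape is a hypothetical helper; shapes are plain number arrays.
function concatOutputShape(shapes, axis) {
  if (!Array.isArray(shapes) || shapes.length === 0) {
    throw new Error('inputs must be a non-empty array of shapes');
  }
  const rank = shapes[0].length;
  if (!Number.isInteger(axis) || axis < 0 || axis >= rank) {
    throw new Error('axis must be in the range [0, N-1]');
  }
  const output = shapes[0].slice();
  output[axis] = 0;
  for (const shape of shapes) {
    if (shape.length !== rank) throw new Error('rank mismatch');
    for (let dim = 0; dim < rank; ++dim) {
      if (dim === axis) {
        // Sizes along the concatenation axis add up.
        output[axis] += shape[dim];
      } else if (shape[dim] !== shapes[0][dim]) {
        throw new Error(`dimension ${dim} mismatch`);
      }
    }
  }
  return output;
}

console.log(concatOutputShape([[2, 3], [2, 5]], 1));  // [ 2, 8 ]
```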

7.6.8. The conv2d() method

Compute a 2-D convolution given 4-D input and filter tensors.

enum MLConv2dFilterOperandLayout {
  "oihw",
  "hwio",
  "ohwi",
  "ihwo"
};
enum MLAutoPad {
  "explicit",
  "same-upper",
  "same-lower"
};
dictionary MLConv2dOptions {
  sequence<unsigned long> padding;
  sequence<unsigned long> strides;
  sequence<unsigned long> dilations;
  MLAutoPad autoPad = "explicit";
  unsigned long groups = 1;
  MLInputOperandLayout inputLayout = "nchw";
  MLConv2dFilterOperandLayout filterLayout = "oihw";
  MLOperand bias;
  MLActivation activation;
};

partial interface MLGraphBuilder {
  MLOperand conv2d(MLOperand input, MLOperand filter, optional MLConv2dOptions options = {});
};

MLConv2dOptions has the following members:

padding, of type sequence<unsigned long>

A sequence of unsigned long of length 4: [beginning_height, ending_height, beginning_width, ending_width]. Specifies the additional rows and columns added to the beginning and ending of each spatial dimension of the convolution input. The default value is [0, 0, 0, 0].

strides, of type sequence<unsigned long>

A sequence of unsigned long of length 2: [stride_height, stride_width]. Specifies the stride of the sliding window for each spatial dimension of the convolution input. The default value is [1, 1].

dilations, of type sequence<unsigned long>

A sequence of unsigned long of length 2: [dilation_height, dilation_width]. Specifies the dilation factor for each spatial dimension applied on the convolution filter (kernel). The default value is [1, 1].

autoPad, of type MLAutoPad, defaulting to "explicit"

An MLAutoPad string. Specifies the automatic input padding options. The default value is "explicit", which means that the values in the padding array should be used for input padding. When the option is set to a value other than "explicit", the values in the padding array are ignored.

With the "same-upper" option, the padding values are automatically computed such that the additional ending padding of the spatial input dimensions would allow all of the input values in the corresponding dimension to be filtered.

The "same-lower" option is similar but padding is applied to the beginning padding of the spatial input dimensions instead of the ending one.

groups, of type unsigned long, defaulting to 1

An unsigned long scalar. Specifies the number of groups that input channels and output channels are divided into. The default value is 1.

inputLayout, of type MLInputOperandLayout, defaulting to "nchw"

An MLInputOperandLayout string. Specifies the layout format of the input and output tensor as follows:

"nchw":
- input tensor: [batches, input_channels, height, width]
- output tensor: [batches, output_channels, height, width]
"nhwc":
- input tensor: [batches, height, width, input_channels]
- output tensor: [batches, height, width, output_channels]

The default value is "nchw".

filterLayout, of type MLConv2dFilterOperandLayout, defaulting to "oihw"

An MLConv2dFilterOperandLayout string. Specifies the layout format of the filter tensor as follows:

"oihw": [output_channels, input_channels/groups, height, width]
"hwio": [height, width, input_channels/groups, output_channels]
"ohwi": [output_channels, height, width, input_channels/groups]
"ihwo": [input_channels/groups, height, width, output_channels]

The default value is "oihw".

bias, of type MLOperand

An MLOperand object. Specifies the additional 1-D tensor with the shape of [output_channels] whose values are to be added to the convolution result.

activation, of type MLActivation

An MLActivation object. Specifies the optional activation function that immediately follows the convolution operation.
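The text for the "same-upper" and "same-lower" autoPad options does not give a formula. The sketch below assumes the convention used by common ML frameworks, where the output size is ceil(input_size / stride) and any odd padding element goes to the ending side ("same-upper") or the beginning side ("same-lower"). The helper name computeAutoPadding is hypothetical, and the convention is an assumption rather than something stated in this section.

```javascript
// Illustrative sketch, assuming the common "same" padding convention:
// outputSize = ceil(inputSize / stride), and an odd padding element goes to
// the end ("same-upper") or the beginning ("same-lower").
// computeAutoPadding is a hypothetical helper, not a WebNN API.
function computeAutoPadding(autoPad, inputSize, filterSize, stride, dilation) {
  const outputSize = Math.ceil(inputSize / stride);
  const effectiveFilterSize = (filterSize - 1) * dilation + 1;
  const total = Math.max(
      0, (outputSize - 1) * stride + effectiveFilterSize - inputSize);
  const smaller = Math.floor(total / 2);
  const larger = total - smaller;
  // Returned as [beginning_padding, ending_padding] for one spatial dimension.
  return autoPad === 'same-upper' ? [smaller, larger] : [larger, smaller];
}

console.log(computeAutoPadding('same-upper', 5, 2, 1, 1));  // [ 0, 1 ]
console.log(computeAutoPadding('same-lower', 5, 2, 1, 1));  // [ 1, 0 ]
```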

Arguments:

input : an MLOperand. The input 4-D tensor. The logical shape is interpreted according to the value of options.inputLayout.
filter : an MLOperand. The filter 4-D tensor. The logical shape is interpreted according to the value of options.filterLayout and options.groups.
options : an optional MLConv2dOptions. The optional parameters of the operation.

Returns: an MLOperand. The output 4-D tensor that contains the convolution result. The output shape is interpreted according to the options.inputLayout value. More specifically, the spatial dimensions or the sizes of the last two dimensions of the output tensor for the nchw input layout can be calculated as follows:

output_size = 1 + (input_size - (filter_size - 1) * dilation - 1 + beginning_padding + ending_padding) / stride
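As a worked example, the formula above can be evaluated for one spatial dimension as follows. This is a sketch under one assumption not stated in the text: the division rounds down, since tensor sizes are integers. The helper name conv2dOutputSize is hypothetical.

```javascript
// Illustrative sketch of the output size formula for one spatial dimension.
// Assumption: the division rounds down, since tensor sizes are integers.
// conv2dOutputSize is a hypothetical helper, not a WebNN API.
function conv2dOutputSize(inputSize, filterSize, stride = 1, dilation = 1,
                          beginningPadding = 0, endingPadding = 0) {
  // A dilated filter covers (filterSize - 1) * dilation + 1 input elements.
  const effectiveFilterSize = (filterSize - 1) * dilation + 1;
  return 1 + Math.floor(
      (inputSize - effectiveFilterSize + beginningPadding + endingPadding) /
      stride);
}

console.log(conv2dOutputSize(5, 3));              // 3
console.log(conv2dOutputSize(5, 3, 2, 1, 1, 1));  // 3
console.log(conv2dOutputSize(7, 3, 1, 2));        // 3
```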

A depthwise conv2d operation is a variant of grouped convolution, used in models like the MobileNet, where the options.groups = input_channels = output_channels and the shape of filter tensor is [options.groups, 1, height, width] for "oihw" layout, [height, width, 1, options.groups] for "hwio" layout, [options.groups, height, width, 1] for "ohwi" layout and [1, height, width, options.groups] for "ihwo" layout.
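The depthwise filter shapes listed above can be summarized in a small lookup. This is an illustrative sketch; depthwiseFilterShape is a hypothetical helper and not part of the API.

```javascript
// Illustrative sketch: filter shape of a depthwise conv2d for each filter
// layout, where groups = input_channels = output_channels.
// depthwiseFilterShape is a hypothetical helper, not a WebNN API.
function depthwiseFilterShape(filterLayout, groups, height, width) {
  switch (filterLayout) {
    case 'oihw': return [groups, 1, height, width];
    case 'hwio': return [height, width, 1, groups];
    case 'ohwi': return [groups, height, width, 1];
    case 'ihwo': return [1, height, width, groups];
    default: throw new Error(`unknown filter layout: ${filterLayout}`);
  }
}

console.log(depthwiseFilterShape('oihw', 32, 3, 3));  // [ 32, 1, 3, 3 ]
```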

The conv2d(input, filter, options) steps are:

If input or filter is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
Let input_size be the size of input.[[descriptor]].dimensions.
Let filter_size be the size of filter.[[descriptor]].dimensions.
If input_size is not 4, then throw a "DataError" DOMException and stop.
If filter_size is not 4, then throw a "DataError" DOMException and stop.
If options is undefined, let options be an empty object.
If options.padding is undefined, set it to [0, 0, 0, 0].
If options.strides is undefined, set it to [1, 1].
Else if options.strides.size() is not 2, then throw a "TypeError" DOMException and stop.
If any element in options.strides is equal to 0, then throw a "TypeError" DOMException and stop.
If options.dilations is undefined, set it to [1, 1].
If options.autoPad is undefined, set it to "explicit".
If options.groups is undefined, set it to 1.
If options.inputLayout is undefined, set it to "nchw".
If options.filterLayout is undefined, set it to "oihw".
If options.bias exists and it is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If options.activation exists and it is not an instance of MLActivation, then throw a "TypeError" DOMException and stop.
Let output_shape be the result of calculating output dimensions based on input, filter, dilation, padding and stride, taking into account options.inputLayout.
Let desc be a new MLOperandDescriptor.
Set desc.type to input.[[descriptor]].type.
Set desc.dimensions to output_shape.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
1. Let output be the result of invoking the create MLOperand steps given this and desc.
2. Make a request to the underlying platform to:
  1. Create an implementation-defined platform operator conv2dImpl for this method, given options and filter.
    1. If options.activation exists, register it as activation to conv2dImpl.
  2. Store a reference of conv2dImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and conv2dImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect input.[[operand]] as input to conv2dImpl.
4. Connect output.[[operand]] as output to conv2dImpl.
Return output.

7.6.9. The convTranspose2d() method

Compute a 2-D transposed convolution given 4-D input and filter tensors.

enum MLConvTranspose2dFilterOperandLayout {
  "iohw",
  "hwoi",
  "ohwi"
};
dictionary MLConvTranspose2dOptions {
  sequence<unsigned long> padding;
  sequence<unsigned long> strides;
  sequence<unsigned long> dilations;
  sequence<unsigned long> outputPadding;
  sequence<unsigned long> outputSizes;
  MLAutoPad autoPad = "explicit";
  unsigned long groups = 1;
  MLInputOperandLayout inputLayout = "nchw";
  MLConvTranspose2dFilterOperandLayout filterLayout = "iohw";
  MLOperand bias;
  MLActivation activation;
};

partial interface MLGraphBuilder {
  MLOperand convTranspose2d(MLOperand input, MLOperand filter,
                            optional MLConvTranspose2dOptions options = {});
};

MLConvTranspose2dOptions has the following members:

padding, of type sequence<unsigned long>

strides, of type sequence<unsigned long>

dilations, of type sequence<unsigned long>

outputPadding, of type sequence<unsigned long>

A sequence of unsigned long of length 2. Specifies the padding values applied to each spatial dimension of the output tensor. The explicit padding values are needed to disambiguate the output tensor shape for transposed convolution when the value of options.strides is greater than 1.

Note that these values are only used to disambiguate the output shape when needed; they do not necessarily cause any padding value to be written to the output tensor.

The default value is [0, 0].

outputSizes, of type sequence<unsigned long>

A sequence of unsigned long of length 2. Specifies the sizes of the last two dimensions of the output tensor. When the output sizes are explicitly specified, the output padding values in outputPadding are ignored.

If not specified, the output sizes are automatically computed.

autoPad, of type MLAutoPad, defaulting to "explicit"

When the option is set to a value other than "explicit", the values in the padding array are ignored.

The "same-lower" option is similar but padding is applied to the beginning padding of the spatial input dimensions instead of the ending one.

groups, of type unsigned long, defaulting to 1

An unsigned long scalar. Specifies the number of groups that input channels and output channels are divided into. The default value is 1.

inputLayout, of type MLInputOperandLayout, defaulting to "nchw"

An MLInputOperandLayout string. Specifies the layout format of the input and output tensor as follows:

"nchw":
- input tensor: [batches, input_channels, height, width]
- output tensor: [batches, output_channels, height, width]
"nhwc":
- input tensor: [batches, height, width, input_channels]
- output tensor: [batches, height, width, output_channels]

The default value is "nchw".

filterLayout, of type MLConvTranspose2dFilterOperandLayout, defaulting to "iohw"

An MLConvTranspose2dFilterOperandLayout string. Specifies the layout format of the filter tensor as follows:

"iohw": [input_channels, output_channels/groups, height, width]
"hwoi": [height, width, output_channels/groups, input_channels]
"ohwi": [output_channels/groups, height, width, input_channels]

The default value is "iohw".

bias, of type MLOperand

An MLOperand object. Specifies the additional 1-D tensor with the shape of [output_channels] whose values are to be added to the convolution result.

activation, of type MLActivation

An MLActivation object. Specifies the optional activation function that immediately follows the convolution operation.

Arguments:

input : an MLOperand. The input 4-D tensor. The logical shape is interpreted according to the value of options.inputLayout.
filter : an MLOperand. The filter 4-D tensor. The logical shape is interpreted according to the value of options.filterLayout and options.groups.
options : an optional MLConvTranspose2dOptions.

Returns: an MLOperand. The output 4-D tensor that contains the transposed convolution result. The output shape is interpreted according to the options.inputLayout value. More specifically, unless the options.outputSizes values are explicitly specified, the options.outputPadding may be needed to compute the spatial dimension values of the output tensor as follows:

output_size = (input_size - 1) * stride + (filter_size - 1) * dilation + 1 - beginning_padding - ending_padding + output_padding
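As a worked example, the formula above can be evaluated for one spatial dimension as follows. This is an illustrative sketch; the helper name convTranspose2dOutputSize is hypothetical.

```javascript
// Illustrative sketch of the transposed convolution output size formula for
// one spatial dimension. convTranspose2dOutputSize is a hypothetical helper,
// not a WebNN API.
function convTranspose2dOutputSize(inputSize, filterSize, stride = 1,
                                   dilation = 1, beginningPadding = 0,
                                   endingPadding = 0, outputPadding = 0) {
  return (inputSize - 1) * stride + (filterSize - 1) * dilation + 1 -
      beginningPadding - endingPadding + outputPadding;
}

console.log(convTranspose2dOutputSize(3, 3, 2));              // 7
console.log(convTranspose2dOutputSize(3, 3, 2, 1, 0, 0, 1));  // 8
```

The two calls show why output padding is needed once the stride exceeds 1: a regular convolution with a filter size of 3 and a stride of 2 maps inputs of size 7 and of size 8 to the same output size of 3, so the transposed operation cannot tell which of the two sizes to produce without the extra value.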

The convTranspose2d(input, filter, options) steps are:

If input or filter is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
Let input_size be the size of input.[[descriptor]].dimensions.
Let filter_size be the size of filter.[[descriptor]].dimensions.
If input_size is not 4, then throw a "DataError" DOMException and stop.
If filter_size is not 4, then throw a "DataError" DOMException and stop.
If options is undefined, let options be an empty object.
If options.padding is undefined, set it to [0, 0, 0, 0].
If options.strides is undefined, set it to [1, 1].
Else if options.strides.size() is not 2, then throw a "TypeError" DOMException and stop.
If any element in options.strides is equal to 0, then throw a "TypeError" DOMException and stop.
If options.dilations is undefined, set it to [1, 1].
If options.outputPadding is undefined, set it to [0, 0].
If options.autoPad is undefined, set it to "explicit".
If options.groups is undefined, set it to 1.
If options.inputLayout is undefined, set it to "nchw".
If options.filterLayout is undefined, set it to "iohw".
If options.bias exists and it is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If options.activation exists and it is not an instance of MLActivation, then throw a "TypeError" DOMException and stop.
Let output_shape be the result of calculating output dimensions based on input, filter, options.dilations, options.padding and options.strides, taking into account options.inputLayout.
Let desc be a new MLOperandDescriptor.
Set desc.type to input.[[descriptor]].type.
Set desc.dimensions to output_shape.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
1. Let output be the result of invoking the create MLOperand steps given this and desc.
2. Make a request to the underlying platform to:
  1. Create an implementation-defined platform operator convTranspose2dImpl for this method, given options and filter.
    1. If options.activation exists, register it as activation to convTranspose2dImpl.
  2. Store a reference of convTranspose2dImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and convTranspose2dImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect input.[[operand]] as input to convTranspose2dImpl.
4. Connect output.[[operand]] as output to convTranspose2dImpl.
Return output.

7.6.10. Element-wise binary operations

Compute the element-wise binary addition, subtraction, multiplication, division, power, maximum and minimum of the two input tensors.

The element-wise binary operations are broadcast according to [numpy-broadcasting-rule]. The rank of the output tensor is the maximum rank of the input tensors. For each dimension of the output tensor, its size is the maximum size along that dimension of the input tensors.

partial interface MLGraphBuilder {
  MLOperand add(MLOperand a, MLOperand b);
  MLOperand sub(MLOperand a, MLOperand b);
  MLOperand mul(MLOperand a, MLOperand b);
  MLOperand div(MLOperand a, MLOperand b);
  MLOperand max(MLOperand a, MLOperand b);
  MLOperand min(MLOperand a, MLOperand b);
  MLOperand pow(MLOperand a, MLOperand b);
};

Arguments:

a : an MLOperand. The first input tensor.
b : an MLOperand. The second input tensor.

Returns: an MLOperand. The output tensor that contains the result of element-wise binary operation of the two input tensors.

Operation types:

add : Add the values of the two input tensors, element-wise.
sub : Subtract the values of the second input tensor from the values of the first input tensor, element-wise.
mul : Multiply the values of the two input tensors, element-wise.
div : Divide the values of the first input tensor by the values of the second input tensor, element-wise.
max : Select the greater values of the two input tensors, element-wise.
min : Select the lesser values of the two input tensors, element-wise.
pow : Compute the values of the first input tensor raised to the power of the values of the second input tensor, element-wise.

To create element-wise binary operation given op, a and b, run the following steps:

Assert: op is one of "add", "sub", "mul", "div", "max", "min", "pow".
If a or b is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If a.[[descriptor]].type is not equal to b.[[descriptor]].type, then throw a "DataError" DOMException and stop.
Let descriptor be a new MLOperandDescriptor.
Set descriptor.type to a.[[descriptor]].type.
Set descriptor.dimensions to the result of running the broadcast-shapes steps given a.[[descriptor]].dimensions and b.[[descriptor]].dimensions.
1. If that throws an error, re-throw the error and stop.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
1. Let output be the result of invoking the create MLOperand steps given this and descriptor.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the binary operation op, given a and b.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect a.[[operand]] and b.[[operand]] as inputs to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.

To broadcast shapes given shape1 and shape2, run the following steps:

Assert: The type of shape1 and shape2 is sequence of unsigned long.
Let output be the result of invoking the implementation-defined shape broadcast on shape1 and shape2.
1. If that fails, throw a "DataError" DOMException and stop.
Return output.
In the most common implementation, two shapes are compatible when each pair of corresponding dimensions is equal or one of the two is 1. The output shape consists of the maximum of each pair of corresponding dimensions.
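The common implementation described above can be sketched as follows. The shape broadcast itself is implementation-defined, so this is only a minimal sketch of the NumPy-style rule: broadcastShapes is a hypothetical helper, shapes are aligned from the trailing dimension with missing dimensions treated as 1, and a plain Error stands in for the "DataError" DOMException.

```javascript
// Illustrative sketch of NumPy-style shape broadcasting. broadcastShapes is
// a hypothetical helper and throws a plain Error where the steps above throw
// a "DataError" DOMException.
function broadcastShapes(shape1, shape2) {
  const rank = Math.max(shape1.length, shape2.length);
  const output = new Array(rank);
  for (let i = 0; i < rank; ++i) {
    // Align shapes from the trailing dimension; missing dimensions act as 1.
    const dim1 = shape1[shape1.length - 1 - i] ?? 1;
    const dim2 = shape2[shape2.length - 1 - i] ?? 1;
    if (dim1 !== dim2 && dim1 !== 1 && dim2 !== 1) {
      throw new Error(
          `shapes [${shape1}] and [${shape2}] are not broadcastable`);
    }
    // A dimension of size 1 stretches to match the other operand.
    output[rank - 1 - i] = dim1 === 1 ? dim2 : dim1;
  }
  return output;
}

console.log(broadcastShapes([8, 1, 6, 1], [7, 1, 5]));  // [ 8, 7, 6, 5 ]
```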

The element-wise binary operation algorithms invoke the create element-wise binary operation steps as follows.

The add(a, b) steps are:

Let output be the result of running the create element-wise binary operation given "add", a and b.
1. If that throws an error, then re-throw the error and stop.
Return output.

The sub(a, b) steps are:

Let output be the result of running the create element-wise binary operation given "sub", a and b.
1. If that throws an error, then re-throw the error and stop.
Return output.

The mul(a, b) steps are:

Let output be the result of running the create element-wise binary operation given "mul", a and b.
1. If that throws an error, then re-throw the error and stop.
Return output.

The div(a, b) steps are:

Let output be the result of running the create element-wise binary operation given "div", a and b.
1. If that throws an error, then re-throw the error and stop.
Return output.

The max(a, b) steps are:

Let output be the result of running the create element-wise binary operation given "max", a and b.
1. If that throws an error, then re-throw the error and stop.
Return output.

The min(a, b) steps are:

Let output be the result of running the create element-wise binary operation given "min", a and b.
1. If that throws an error, then re-throw the error and stop.
Return output.

The pow(a, b) steps are:

Let output be the result of running the create element-wise binary operation given "pow", a and b.
1. If that throws an error, then re-throw the error and stop.
Return output.

7.6.11. Element-wise unary operations

Compute the element-wise unary operation for the input tensor.

partial interface MLGraphBuilder {
  MLOperand abs(MLOperand input);
  MLOperand ceil(MLOperand input);
  MLOperand cos(MLOperand input);
  MLOperand exp(MLOperand input);
  MLOperand floor(MLOperand input);
  MLOperand log(MLOperand input);
  MLOperand neg(MLOperand input);
  MLOperand sin(MLOperand input);
  MLOperand tan(MLOperand input);
};

Arguments:

input : an MLOperand. The input tensor.

Returns: an MLOperand. The output tensor that contains the result of element-wise unary operation of the input tensor. The shape of the output tensor is the same as the shape of input tensor.

Operation types:

abs : Compute the absolute value of the input tensor, element-wise.
ceil : Compute the ceiling of the input tensor, element-wise.
cos : Compute the cosine of the input tensor, element-wise.
exp : Compute the exponential of the input tensor, element-wise.
floor : Compute the floor of the input tensor, element-wise.
log : Compute the natural logarithm of the input tensor, element-wise.
neg : Compute the numerical negative value of the input tensor, element-wise.
sin : Compute the sine of the input tensor, element-wise.
tan : Compute the tangent of the input tensor, element-wise.
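As an informal illustration of the operation types listed above, the following sketch applies each operation to a plain array of numbers using the standard JavaScript `Math` functions. The names `unaryOps` and `applyUnary` are hypothetical and are not part of the API. The sketch only models the numeric behavior and does not build a graph.

```javascript
// Hypothetical lookup table mapping each operation name to a scalar function.
const unaryOps = new Map([
  ['abs', Math.abs],
  ['ceil', Math.ceil],
  ['cos', Math.cos],
  ['exp', Math.exp],
  ['floor', Math.floor],
  ['log', Math.log],      // natural logarithm
  ['neg', (x) => -x],
  ['sin', Math.sin],
  ['tan', Math.tan],
]);

// Applies the named operation to every element. The result has the same
// number of elements as the input, matching "same shape as the input".
function applyUnary(op, values) {
  const fn = unaryOps.get(op);
  if (fn === undefined) {
    throw new TypeError(`unknown unary operation: ${op}`);
  }
  return Float32Array.from(values, (value) => fn(value));
}
```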

To create element-wise unary operation given op and input, run the following steps:

Assert: op is one of "abs", "ceil", "cos", "exp", "floor", "log", "neg", "sin", "tan".
If input is not an instance of MLOperand, then throw a " TypeError " DOMException and stop.
Let kind be "output".
Let descriptor be a new MLOperandDescriptor.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Let output be the result of invoking the copy MLOperand steps given input.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the unary operation op.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect input.[[operand]] as input to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.

The element-wise unary operation algorithms invoke the create element-wise unary operation steps as follows.

The abs(input) steps are:

Let output be the result of running the create element-wise unary operation given "abs" and input.
1. If that throws an error, then re-throw the error and stop.
Return output.

The ceil(input) steps are:

Let output be the result of running the create element-wise unary operation given "ceil" and input.
1. If that throws an error, then re-throw the error and stop.
Return output.

The cos(input) steps are:

Let output be the result of running the create element-wise unary operation given "cos" and input.
1. If that throws an error, then re-throw the error and stop.
Return output.

The exp(input) steps are:

Let output be the result of running the create element-wise unary operation given "exp" and input.
1. If that throws an error, then re-throw the error and stop.
Return output.

The floor(input) steps are:

Let output be the result of running the create element-wise unary operation given "floor" and input.
1. If that throws an error, then re-throw the error and stop.
Return output.

The log(input) steps are:

Let output be the result of running the create element-wise unary operation given "log" and input.
1. If that throws an error, then re-throw the error and stop.
Return output.

The neg(input) steps are:

Let output be the result of running the create element-wise unary operation given "neg" and input.
1. If that throws an error, then re-throw the error and stop.
Return output.

The sin(input) steps are:

Let output be the result of running the create element-wise unary operation given "sin" and input.
1. If that throws an error, then re-throw the error and stop.
Return output.

The tan(input) steps are:

Let output be the result of running the create element-wise unary operation given "tan" and input.
1. If that throws an error, then re-throw the error and stop.
Return output.

7.6.12. The elu() method

Calculate the exponential linear unit function (ELU) on the input tensor element-wise. The calculation follows the expression max(0, x) + alpha * (exp(min(0, x)) - 1).

dictionary MLEluOptions {
  float alpha = 1;
};

partial interface MLGraphBuilder {
  MLOperand elu(MLOperand input, optional MLEluOptions options = {});
  MLActivation elu(optional MLEluOptions options = {});
};

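The behavior of this operation can be generically emulated from the usage of other operations as follows. However, user agents typically have a more efficient implementation for it, therefore its usage is encouraged from the performance standpoint.

return builder.add(
  builder.max(builder.constant(0), input),
  builder.mul(
    builder.constant(options.alpha),
    builder.sub(
      builder.exp(builder.min(builder.constant(0), input)),
      builder.constant(1))));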
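To make the ELU expression concrete, the following sketch evaluates it on a plain array of numbers. The function name `eluReference` is hypothetical. The sketch only models the arithmetic, with `alpha` defaulting to 1 as in MLEluOptions.

```javascript
// Hypothetical numeric reference for ELU on plain numbers:
// max(0, x) + alpha * (exp(min(0, x)) - 1)
function eluReference(values, options = {}) {
  const alpha = options.alpha === undefined ? 1 : options.alpha;
  return Array.from(values, (x) =>
    Math.max(0, x) + alpha * (Math.exp(Math.min(0, x)) - 1));
}
```

For positive elements the exponential term is exp(0) - 1 = 0, so the element passes through unchanged. For negative elements the first term is 0 and the result approaches -alpha as the element decreases.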

To check ELU options given options, run the following steps:

If options is not an object that implements MLEluOptions, then return false.
If options.alpha is undefined, set options.alpha to 1.
Else if options.alpha is not a numeric type, then return false.
Return true.

7.6.12.1. The `elu(input, options)` method

Arguments:

input : an MLOperand. The input tensor.
options : an optional MLEluOptions. The optional parameters of the operation.
- alpha : a float scalar multiplier, default to 1.

Returns:

an MLOperand. The output tensor of the same shape as input.

The elu(input, options) method steps are:

Let input be the first argument.
Let options be the second argument.
1. If running the check ELU options steps with options returns false, then throw a " TypeError " DOMException and abort these steps.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Let output be the result of invoking the copy MLOperand steps given input.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the ELU operation, given options.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect input.[[operand]] as input to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.

7.6.12.2. The `elu(options)` method

Arguments:

options : an optional MLEluOptions. The optional parameters of the operation.
- alpha : a float scalar multiplier, default to 1.

Returns:

an MLActivation. The activation function representing the elu operation.

The elu(options) method steps are:

Let options be the first argument.
1. If options is undefined, let options be a new MLEluOptions object.
2. If running the check ELU options steps with options returns false, then throw a "TypeError" DOMException and abort these steps.
Let op be the result of invoking the create MLActivation steps with "elu" and options.
Return op.

7.6.13. The gemm() method

Calculate the general matrix multiplication of the Basic Linear Algebra Subprograms. The calculation follows the expression alpha * A * B + beta * C, where A is a 2-D tensor with shape [M, K] or [K, M], B is a 2-D tensor with shape [K, N] or [N, K], and C is broadcastable to the shape [M, N]. A and B may optionally be transposed prior to the calculation.

dictionary MLGemmOptions {
  MLOperand c;
  float alpha = 1.0;
  float beta = 1.0;
  boolean aTranspose = false;
  boolean bTranspose = false;
};

partial interface MLGraphBuilder {
  MLOperand gemm(MLOperand a, MLOperand b, optional MLGemmOptions options = {});
};

MLGemmOptions has the following members:

c, of type MLOperand: An MLOperand. Specifies the third input tensor. It is either a scalar, or of the shape that is unidirectionally broadcastable to the shape [M, N] according to [numpy-broadcasting-rule] . When it is not specified, the computation is done as if c is a scalar 0.0.
alpha, of type float , defaulting to 1.0: A float scalar multiplier for the first input.
beta, of type float , defaulting to 1.0: A float scalar multiplier for the third input c.
aTranspose, of type boolean , defaulting to false: A boolean indicating if the first input should be transposed prior to calculating the output.
bTranspose, of type boolean , defaulting to false: A boolean indicating if the second input should be transposed prior to calculating the output.

Arguments:

a : an MLOperand. The first input 2-D tensor with shape [M, K] if aTranspose is false, or [K, M] if aTranspose is true.
b : an MLOperand. The second input 2-D tensor with shape [K, N] if bTranspose is false, or [N, K] if bTranspose is true.
options : an optional MLGemmOptions. The optional parameters of the operation.

Returns: an MLOperand. The output 2-D tensor of shape [M, N] that contains the calculated product of all the inputs.

The gemm(a, b, options) steps are:

If a or b is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If options is undefined, let options be an empty object.
If options.alpha is undefined, set it to 1.0.
If options.beta is undefined, set it to 1.0.
If options.aTranspose is undefined, set it to false.
If options.aTranspose is not false, set it to true.
If options.bTranspose is undefined, set it to false.
If options.bTranspose is not false, set it to true.
Let shapeA be a.[[descriptor]].dimensions and sizeA the size of shapeA.
Let shapeB be b.[[descriptor]].dimensions and sizeB the size of shapeB.
If sizeA is not 2 or sizeB is not 2, then throw a "DataError" DOMException and stop.
If options.aTranspose is true, then let shapeA be the reverse array of shapeA.
If options.bTranspose is true, then let shapeB be the reverse array of shapeB.
If shapeA[1] is not equal to shapeB[0], then throw a "DataError" DOMException and stop.
If options.c exists and is not unidirectionally broadcastable to the shape [shapeA[0], shapeB[1]] according to the [numpy-broadcasting-rule], then throw a "DataError" DOMException and stop.
Type compatibility between a, b and options.c can also be checked.
Let desc be a new MLOperandDescriptor.
Set desc.dimensions to [shapeA[0], shapeB[1]].
Set desc.type to a.[[descriptor]].type.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
1. Let output be the result of invoking the create MLOperand steps given this and desc.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the GEMM operation, given options.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect a.[[operand]] and b.[[operand]] as inputs to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.
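The shape validation performed by the steps above can be sketched as a standalone function that works on plain shape arrays. The function name `gemmOutputShape` and the `cShape` option (standing in for the shape of options.c) are hypothetical and exist only for this illustration.

```javascript
// Hypothetical helper: validates gemm() input shapes and returns [M, N].
function gemmOutputShape(shapeA, shapeB, options = {}) {
  const dataError = (message) =>
    Object.assign(new Error(message), { name: 'DataError' });
  if (shapeA.length !== 2 || shapeB.length !== 2) {
    throw dataError('a and b must be 2-D tensors');
  }
  // Transposing a 2-D tensor reverses its shape.
  const a = options.aTranspose ? [...shapeA].reverse() : shapeA;
  const b = options.bTranspose ? [...shapeB].reverse() : shapeB;
  if (a[1] !== b[0]) {
    throw dataError(`inner dimensions differ: ${a[1]} and ${b[0]}`);
  }
  const output = [a[0], b[1]];
  const cShape = options.cShape; // assumption: the shape of options.c, if given
  if (cShape !== undefined) {
    // Unidirectional broadcast: only c may be stretched to the output shape.
    if (cShape.length > output.length) {
      throw dataError('c has too many dimensions');
    }
    for (let i = 0; i < cShape.length; ++i) {
      const dimC = cShape[cShape.length - 1 - i];
      const dimOut = output[output.length - 1 - i];
      if (dimC !== dimOut && dimC !== 1) {
        throw dataError(`c is not broadcastable to [${output}]`);
      }
    }
  }
  return output;
}
```

A scalar c corresponds to an empty `cShape` and always passes the check.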

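The behavior of this operation can be generically emulated from the usage of other operations as follows. However, user agents typically have a more efficient implementation for it, therefore its usage is encouraged from the performance standpoint.

if (options.aTranspose)
  a = builder.transpose(a);
if (options.bTranspose)
  b = builder.transpose(b);
let ab = builder.matmul(builder.mul(builder.constant(options.alpha), a), b);
return (options.c ? builder.add(ab, builder.mul(builder.constant(options.beta), options.c)) : ab);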
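For checking results by hand, the expression alpha * A * B + beta * C can also be evaluated directly on nested JavaScript arrays. The sketch below is a hypothetical numeric reference (`gemmReference` and `transpose2d` are illustrative names, not API members). It assumes that c is either omitted, a scalar number, or a full [M, N] nested array, and does not model other broadcastable shapes.

```javascript
// Hypothetical helper: transposes a 2-D nested array.
function transpose2d(matrix) {
  return matrix[0].map((_, column) => matrix.map((row) => row[column]));
}

// Hypothetical numeric reference for gemm() on nested arrays.
function gemmReference(a, b, options = {}) {
  const { c, alpha = 1.0, beta = 1.0, aTranspose = false, bTranspose = false } = options;
  const A = aTranspose ? transpose2d(a) : a;   // [M, K]
  const B = bTranspose ? transpose2d(b) : b;   // [K, N]
  const M = A.length;
  const K = B.length;
  const N = B[0].length;
  const output = [];
  for (let i = 0; i < M; ++i) {
    const row = [];
    for (let j = 0; j < N; ++j) {
      let sum = 0;
      for (let k = 0; k < K; ++k) {
        sum += A[i][k] * B[k][j];
      }
      // When c is not specified, the computation is done as if c were 0.
      const cValue = c === undefined ? 0 : (typeof c === 'number' ? c : c[i][j]);
      row.push(alpha * sum + beta * cValue);
    }
    output.push(row);
  }
  return output;
}
```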

7.6.14. The gru() method

Gated Recurrent Unit [GRU] recurrent network uses an update, reset, and new gate to compute the output state that rolls into the output across the temporal sequence of the network.

enum MLGruWeightLayout {
  "zrn",  // update-reset-new gate ordering
  "rzn"   // reset-update-new gate ordering
};
enum MLRecurrentNetworkDirection {
  "forward",
  "backward",
  "both"
};
dictionary MLGruOptions {
  MLOperand bias;
  MLOperand recurrentBias;
  MLOperand initialHiddenState;
  boolean resetAfter = true;
  boolean returnSequence = false;
  MLRecurrentNetworkDirection direction = "forward";
  MLGruWeightLayout layout = "zrn";
  sequence<MLActivation> activations;
};

partial interface MLGraphBuilder {
  sequence<MLOperand> gru(MLOperand input, MLOperand weight, MLOperand recurrentWeight,
                          unsigned long steps, unsigned long hiddenSize,
                          optional MLGruOptions options = {});
};

MLGruOptions has the following members:

bias, of type MLOperand: An MLOperand. Specifies the 2-D input bias tensor of shape [num_directions, 3 * hidden_size]. The ordering of the bias vectors in the second dimension of the tensor shape is specified according to the layout argument.
recurrentBias, of type MLOperand: An MLOperand. Specifies the 2-D recurrent bias tensor of shape [num_directions, 3 * hidden_size]. The ordering of the bias vectors in the second dimension of the tensor shape is specified according to the layout argument.
initialHiddenState, of type MLOperand: An MLOperand. The 3-D initial hidden state tensor of shape [num_directions, batch_size, hidden_size]. When not specified, implementations SHOULD use a tensor filled with zero.
resetAfter, of type boolean, defaulting to true: A boolean indicating whether to apply the reset gate after or before matrix multiplication. The default value is true.
returnSequence, of type boolean, defaulting to false: A boolean indicating whether to also return the entire sequence with every output from each time step in it in addition to the output of the last time step. The default value is false.
direction, of type MLRecurrentNetworkDirection, defaulting to "forward": An MLRecurrentNetworkDirection. Specifies the processing direction of the input sequence. When set to "both", the size of the first dimension of the weight and the bias tensor shapes must be 2, and the input is processed in both directions.
layout, of type MLGruWeightLayout, defaulting to "zrn": An MLGruWeightLayout. The ordering of the weight and bias vectors for the internal gates of GRU, specifically the update (z), reset (r), and new (n) gate, as indicated in the second dimension of the weight and bias tensor shape. When not specified, the default layout is "zrn".
activations, of type sequence<MLActivation>: A sequence of MLActivation. Specifies a pair of activation functions with the first function used for the update and reset gate, and the second used for the new gate. When not specified, implementations SHOULD use the pair of sigmoid ("sigmoid") and the hyperbolic tangent ("tanh") functions, respectively.

Arguments:

input : an MLOperand. The input 3-D tensor of shape [steps, batch_size, input_size].
weight : an MLOperand. The 3-D input weight tensor of shape [num_directions, 3 * hidden_size, input_size]. The ordering of the weight vectors in the second dimension of the tensor shape is specified according to the options.layout argument.
recurrentWeight : an MLOperand. The 3-D recurrent weight tensor of shape [num_directions, 3 * hidden_size, hidden_size]. The ordering of the weight vectors in the second dimension of the tensor shape is specified according to the options.layout argument.
steps : an unsigned long scalar. The number of time steps in the recurrent network. The value must be greater than 0.
hiddenSize : an unsigned long scalar. The value of the third dimension of the cell output tensor shape. It indicates the number of features in the hidden state.
options : an optional MLGruOptions. The optional parameters of the operation.

Returns: a sequence of MLOperand. The first element of the sequence is a 3-D tensor of shape [num_directions, batch_size, hidden_size], the cell output from the last time step of the network. Additionally, if options.returnSequence is set to true, the second element is the 4-D output tensor of shape [steps, num_directions, batch_size, hidden_size] containing the cell outputs from each time step in the temporal sequence.
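The relationship between the arguments and the returned shapes can be summarized with a small sketch. The helper `gruOutputShapes` is hypothetical. It takes the input shape [steps, batch_size, input_size] as a plain array and only computes shapes, without building any operands.

```javascript
// Hypothetical helper: computes the shapes of the operands returned by gru().
function gruOutputShapes(inputShape, hiddenSize, options = {}) {
  const [steps, batchSize] = inputShape; // [steps, batch_size, input_size]
  const numDirections = options.direction === 'both' ? 2 : 1;
  // First element: the cell output from the last time step.
  const shapes = [[numDirections, batchSize, hiddenSize]];
  if (options.returnSequence) {
    // Second element: the outputs from every time step.
    shapes.push([steps, numDirections, batchSize, hiddenSize]);
  }
  return shapes;
}
```

For instance, with an input of shape [10, 4, 8], a hiddenSize of 16, direction "both" and returnSequence set to true, the two shapes are [2, 4, 16] and [10, 2, 4, 16].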

The gru(input, weight, recurrentWeight, steps, hiddenSize, options) steps are:

If input, weight or recurrentWeight is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If the rank of input, weight or recurrentWeight is not 3, then throw a "DataError" DOMException and stop.
If options is undefined, let options be an empty object.
If options.bias exists:
1. If it is not an instance of MLOperand, then throw a " TypeError " DOMException and stop.
2. If its rank is not 2, then throw a " DataError " DOMException and stop.
If options.recurrentBias exists:
1. If it is not an instance of MLOperand, then throw a " TypeError " DOMException and stop.
2. If its rank is not 2, then throw a " DataError " DOMException and stop.
If options.initialHiddenState exists:
1. If it is not an instance of MLOperand, then throw a " TypeError " DOMException and stop.
2. If its rank is not 3, then throw a " DataError " DOMException and stop.
If options.resetAfter is undefined, set it to true.
If options.returnSequence is undefined, set it to false.
If options.direction is undefined, set it to "forward".
If options.direction is not one of MLRecurrentNetworkDirection, then throw a " TypeError " DOMException and stop.
If options.layout is undefined, set it to "zrn".
If options.layout is not one of MLGruWeightLayout, then throw a " TypeError " DOMException and stop.
If options.activations exists and is not an array of size 2, or if any of its elements is not an instance of MLActivation, then throw a " TypeError " DOMException and stop.
If steps is not a number or it is 0, then throw a " TypeError " DOMException and stop.
Let output be an empty sequence of MLOperand objects.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for "gru", given weight, recurrentWeight, steps, hiddenSize and options as parameters.
2. Connect input.[[operand]] as input to opImpl.
3. Connect output as output to opImpl.
Return output.

The behavior of this operation can be generically emulated from the usage of other operations as follows. However, user agents typically have a more efficient implementation for it, therefore its usage is encouraged from the performance standpoint.

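// Note: batch_size, input_size and hidden_size denote the corresponding sizes
// in the tensor shapes described above; hidden_size equals hiddenSize.
const numDirections = (options.direction == "both" ? 2 : 1);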
let hiddenState = options.initialHiddenState;
if (!hiddenState) {
  const desc = { type: 'float32', dimensions: [numDirections, 1, hiddenSize] };
  const totalSize = numDirections * hiddenSize;
  hiddenState = builder.constant(desc, new Float32Array(totalSize).fill(0));
}
let sequence = null;
let currentWeight = [];
let currentRecurrentWeight = [];
let currentBias = [];
let currentRecurrentBias = [];
for (let dir = 0; dir < numDirections; ++dir) {
  currentWeight.push(builder.squeeze(builder.slice(weight, [dir, 0, 0], [1, 3 * hidden_size, input_size]), { axes: [0] }));
  currentRecurrentWeight.push(builder.squeeze(builder.slice(recurrentWeight, [dir, 0, 0], [1, 3 * hidden_size, hidden_size]), { axes: [0] }));
  currentBias.push(options.bias ? (builder.squeeze(builder.slice(options.bias, [dir, 0], [1, 3 * hidden_size]), { axes: [0] })) : null);
  currentRecurrentBias.push(options.recurrentBias ?
    (builder.squeeze(builder.slice(options.recurrentBias, [dir, 0], [1, 3 * hidden_size]), { axes: [0] })) : null);
}
for (let step = 0; step < steps; ++step) {
  let currentHidden = [];
  let currentOutput = null;
  for (let dir = 0; dir < numDirections; ++dir) {
    currentHidden.push(builder.squeeze(builder.slice(hiddenState, [dir, 0, 0], [1, batch_size, hidden_size]), { axes: [0] }));
  }
  for (let dir = 0; dir < numDirections; ++dir) {
    let slice = (dir == 1 || options.direction == "backward" ? steps - step - 1 : step);
    let currentInput = builder.squeeze(builder.slice(input, [slice, 0, 0], [1, batch_size, input_size]), { axes: [0] });
    let result = builder.reshape(
      builder.gruCell(
        currentInput, currentWeight[dir], currentRecurrentWeight[dir],
        currentHidden[dir], hiddenSize, { bias: currentBias[dir],
        recurrentBias: currentRecurrentBias[dir], resetAfter: options.resetAfter,
        layout: options.layout, activations: options.activations }),
      [1, null, hiddenSize]);
    currentOutput = (currentOutput ? builder.concat([currentOutput, result], 0) : result);
  }
  hiddenState = currentOutput;
  if (options.returnSequence) {
    currentOutput = builder.reshape(currentOutput, [1, numDirections, null, hiddenSize]);
    sequence = (sequence ? builder.concat([sequence, currentOutput], 0) : currentOutput);
  }
}
return (sequence ? [hiddenState, sequence] : [hiddenState]);

7.6.15. The gruCell() method

A single time step of the Gated Recurrent Unit [GRU] recurrent network using an update gate and a reset gate to compute the hidden state that rolls into the output across the temporal sequence of a recurrent network.

dictionary MLGruCellOptions {
  MLOperand bias;
  MLOperand recurrentBias;
  boolean resetAfter = true;
  MLGruWeightLayout layout = "zrn";
  sequence<MLActivation> activations;
};

partial interface MLGraphBuilder {
  MLOperand gruCell(MLOperand input, MLOperand weight, MLOperand recurrentWeight,
                    MLOperand hiddenState, unsigned long hiddenSize,
                    optional MLGruCellOptions options = {});
};

MLGruCellOptions has the following members:

bias, of type MLOperand: An MLOperand. Specifies the 1-D input bias tensor of shape [3 * hidden_size]. The ordering of the bias vectors in the first dimension of the tensor shape is specified according to the layout argument.
recurrentBias, of type MLOperand: An MLOperand. Specifies the 1-D recurrent bias tensor of shape [3 * hidden_size]. The ordering of the bias vectors in the first dimension of the tensor shape is specified according to the layout argument.
resetAfter, of type boolean, defaulting to true: A boolean indicating whether to apply the reset gate after or before matrix multiplication. The default value is true.
layout, of type MLGruWeightLayout, defaulting to "zrn": An MLGruWeightLayout. The ordering of the weight and bias vectors for the internal gates of GRU, specifically the update (z), reset (r), and new (n) gate, as indicated in the first dimension of the weight and bias tensor shapes. When not specified, the default layout is "zrn".
activations, of type sequence<MLActivation>: A sequence of MLActivation. Specifies a pair of activation functions with the first function used for the update and reset gate, and the second used for the new gate. When not specified, implementations SHOULD use the pair of sigmoid ("sigmoid") and the hyperbolic tangent ("tanh") functions, respectively.

Arguments:

input : an MLOperand. The input 2-D tensor of shape [batch_size, input_size].
weight : an MLOperand. The 2-D input weight tensor of shape [3 * hidden_size, input_size]. The ordering of the weight vectors in the first dimension of the tensor shape is specified according to the options.layout argument.
recurrentWeight : an MLOperand. The 2-D recurrent weight tensor of shape [3 * hidden_size, hidden_size]. The ordering of the weight vectors in the first dimension of the tensor shape is specified according to the options.layout argument.
hiddenState : an MLOperand. The 2-D input hidden state tensor of shape [batch_size, hidden_size].
hiddenSize : an unsigned long scalar. The value of the second dimension of the output tensor shape. It indicates the number of features in the hidden state.
options : an optional MLGruCellOptions. The optional parameters of the operation.

Returns: an MLOperand. The 2-D tensor of shape [batch_size, hidden_size], the cell output hidden state of a single time step of the recurrent network.

The gruCell(input, weight, recurrentWeight, hiddenState, hiddenSize, options) steps are:

If input, weight or recurrentWeight is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If the rank of input, weight or recurrentWeight is not 2, then throw a "DataError" DOMException and stop.
If options is undefined, let options be an empty object.
If options.bias exists:
1. If it is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
2. If its rank is not 1, then throw a "DataError" DOMException and stop.
If options.recurrentBias exists:
1. If it is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
2. If its rank is not 1, then throw a "DataError" DOMException and stop.
If options.resetAfter is undefined, set it to true.
If options.layout is undefined, set it to "zrn".
If options.layout is not one of MLGruWeightLayout, then throw a "TypeError" DOMException and stop.
If options.activations exists and is not an array of size 2, or if any of its elements is not an instance of MLActivation, then throw a "TypeError" DOMException and stop.
Let desc be a new MLOperandDescriptor.
Set desc.dimensions to [input.[[descriptor]].dimensions[0], hiddenSize].
Set desc.type to input.[[descriptor]].type.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
1. Let output be the result of invoking the create MLOperand steps given this and desc.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for "gruCell", given weight, recurrentWeight, hiddenState, hiddenSize and options as parameters.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect input.[[operand]] as input to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.
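To see what a single GRU time step computes numerically, the following sketch evaluates the cell on nested JavaScript arrays. It is a hypothetical reference, not the API: the name `gruCellReference` is illustrative, the "zrn" layout and the default sigmoid and tanh activations are assumed, and the final combination of the gates is assumed to be the usual GRU form z * h + (1 - z) * n.

```javascript
const sigmoid = (value) => 1 / (1 + Math.exp(-value));

// Dot product of vector x with row `row` of matrix w.
const dotRow = (x, w, row) =>
  x.reduce((sum, xi, i) => sum + xi * w[row][i], 0);

// Hypothetical numeric reference for one GRU step ("zrn" layout).
// input: [batch][inputSize], hiddenState: [batch][hiddenSize],
// weight: [3 * hiddenSize][inputSize], recurrentWeight: [3 * hiddenSize][hiddenSize].
function gruCellReference(input, weight, recurrentWeight, hiddenState, hiddenSize, options = {}) {
  const { bias, recurrentBias, resetAfter = true } = options;
  const b = (i) => (bias ? bias[i] : 0);
  const rb = (i) => (recurrentBias ? recurrentBias[i] : 0);
  return input.map((x, batch) => {
    const h = hiddenState[batch];
    // Reset gate (r): rows hiddenSize .. 2 * hiddenSize - 1.
    const r = [];
    for (let j = 0; j < hiddenSize; ++j) {
      const row = hiddenSize + j;
      r.push(sigmoid(b(row) + rb(row) +
        dotRow(x, weight, row) + dotRow(h, recurrentWeight, row)));
    }
    // Used when the reset gate is applied before the matrix multiplication.
    const resetHidden = h.map((hj, j) => r[j] * hj);
    const output = [];
    for (let j = 0; j < hiddenSize; ++j) {
      // Update gate (z): rows 0 .. hiddenSize - 1.
      const z = sigmoid(b(j) + rb(j) +
        dotRow(x, weight, j) + dotRow(h, recurrentWeight, j));
      // New gate (n): rows 2 * hiddenSize .. 3 * hiddenSize - 1.
      const row = 2 * hiddenSize + j;
      const n = resetAfter
        ? Math.tanh(b(row) + dotRow(x, weight, row) +
            r[j] * (rb(row) + dotRow(h, recurrentWeight, row)))
        : Math.tanh(b(row) + rb(row) + dotRow(x, weight, row) +
            dotRow(resetHidden, recurrentWeight, row));
      output.push(z * h[j] + (1 - z) * n);
    }
    return output;
  });
}
```

With all weights and biases set to zero, both gates evaluate to 0.5 and the new gate to 0, so the output is half of the previous hidden state. This gives a quick sanity check for an implementation.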

The behavior of this operation can be generically emulated via other operations as shown below, when the weight layout is the default "zrn" layout, and the activation functions of the update/reset gate and new gate are of the operator types sigmoid and tanh respectively.

const one = builder.constant(1);
const zero = builder.constant(0);
// update gate (z)
let z = builder.sigmoid(
  builder.add(
    builder.add(
      (options.bias ? builder.slice(options.bias, [0], [hiddenSize]) : zero),
      (options.recurrentBias ? builder.slice(options.recurrentBias, [0], [hiddenSize]) : zero)
      ),
    builder.add(
      builder.matmul(
        input,
        builder.transpose(builder.slice(weight, [0, 0], [hiddenSize, input_size]))
        ),
      builder.matmul(
        hiddenState,
        builder.transpose(builder.slice(recurrentWeight, [0, 0], [hiddenSize, hidden_size]))
        )
      )
    )
  );
// reset gate (r)
let r = builder.sigmoid(
  builder.add(
    builder.add(
      (options.bias ? builder.slice(options.bias, [hiddenSize], [hiddenSize]) : zero),
      (options.recurrentBias ? builder.slice(options.recurrentBias, [hiddenSize], [hiddenSize]) : zero)
      ),
    builder.add(
      builder.matmul(
        input,
        builder.transpose(builder.slice(weight, [hiddenSize, 0], [hiddenSize, input_size]))
        ),
      builder.matmul(
        hiddenState,
        builder.transpose(builder.slice(recurrentWeight, [hiddenSize, 0], [hiddenSize, hidden_size]))
        )
      )
    )
  );
// new gate (n)
let n;
if (options.resetAfter) {
  n = builder.tanh(
    builder.add(
      (options.bias ? builder.slice(options.bias, [2 * hiddenSize], [hiddenSize]) : zero),
      builder.add(
        builder.matmul(
          input,
          builder.transpose(builder.slice(weight, [2 * hiddenSize, 0], [hiddenSize, input_size]))
          ),
        builder.mul(
          r,
          builder.add(
            (options.recurrentBias ? builder.slice(options.recurrentBias, [2 * hiddenSize], [hiddenSize]) : zero),
            builder.matmul(
              hiddenState,
              builder.transpose(builder.slice(recurrentWeight, [2 * hiddenSize, 0], [hiddenSize, hidden_size]))
              )
            )
          )
        )
      )
    );
}
else {
  n = builder.tanh(
    builder.add(
      builder.add(
        (options.bias ? builder.slice(options.bias, [2 * hiddenSize], [hiddenSize]) : zero),
        (options.recurrentBias ? builder.slice(options.recurrentBias, [2 * hiddenSize], [hiddenSize]) : zero)
        ),
      builder.add(
        builder.matmul(
          input,
          builder.transpose(builder.slice(weight, [2 * hiddenSize, 0], [hiddenSize, input_size]))
          ),
        builder.matmul(
          builder.mul(r, hiddenState),
          builder.transpose(builder.slice(recurrentWeight, [2 * hiddenSize, 0], [hiddenSize, hidden_size]))
          )
        )
      )
    );
}
// compute the new hidden state
return builder.add(builder.mul(z, hiddenState), builder.mul(n, builder.sub(one, z)));
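To check an implementation against the decomposition above, the same arithmetic can be sketched in plain JavaScript over arrays. This is not part of the API: gruCellReference and rowDot are hypothetical names, and the sketch assumes the default "zrn" layout, sigmoid and tanh activations, a batch size of 1, row-major weights of shape [3 * hiddenSize, inputSize] and [3 * hiddenSize, hiddenSize], and a resetAfter default of true.

```javascript
// Reference arithmetic only; this does not call WebNN. Assumes batch size 1.
const sigmoid = (v) => 1 / (1 + Math.exp(-v));

// Dot product of one row of a row-major matrix (with `cols` columns) and a vector.
function rowDot(matrix, row, cols, vec) {
  let sum = 0;
  for (let c = 0; c < cols; ++c) sum += matrix[row * cols + c] * vec[c];
  return sum;
}

function gruCellReference(input, weight, recurrentWeight, hiddenState, hiddenSize,
                          { bias = null, recurrentBias = null, resetAfter = true } = {}) {
  const inputSize = input.length;
  const b = (k) => (bias ? bias[k] : 0);
  const rb = (k) => (recurrentBias ? recurrentBias[k] : 0);
  const z = [];
  const r = [];
  for (let j = 0; j < hiddenSize; ++j) {
    // update gate (z): rows [0, hiddenSize)
    z.push(sigmoid(b(j) + rb(j) + rowDot(weight, j, inputSize, input) +
                   rowDot(recurrentWeight, j, hiddenSize, hiddenState)));
    // reset gate (r): rows [hiddenSize, 2 * hiddenSize)
    const k = hiddenSize + j;
    r.push(sigmoid(b(k) + rb(k) + rowDot(weight, k, inputSize, input) +
                   rowDot(recurrentWeight, k, hiddenSize, hiddenState)));
  }
  // Used when resetAfter is false: the reset gate scales the hidden state first.
  const gatedHidden = hiddenState.map((h, j) => r[j] * h);
  const next = [];
  for (let j = 0; j < hiddenSize; ++j) {
    // new gate (n): rows [2 * hiddenSize, 3 * hiddenSize)
    const k = 2 * hiddenSize + j;
    const wx = rowDot(weight, k, inputSize, input);
    const n = resetAfter
      ? Math.tanh(b(k) + wx + r[j] * (rb(k) + rowDot(recurrentWeight, k, hiddenSize, hiddenState)))
      : Math.tanh(b(k) + rb(k) + wx + rowDot(recurrentWeight, k, hiddenSize, gatedHidden));
    // new hidden state: z * h + n * (1 - z)
    next.push(z[j] * hiddenState[j] + n * (1 - z[j]));
  }
  return next;
}
```

With all weights set to zero, both gates evaluate to 0.5 and the new gate to 0, so the sketch returns half of the previous hidden state.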

7.6.16. The hardSigmoid() method

Calculate the non-smooth hard sigmoid function on the input tensor, used instead of the sigmoid function for faster computation.

dictionary MLHardSigmoidOptions {
  float alpha = 0.2;
  float beta = 0.5;
};

partial interface MLGraphBuilder {
  MLOperand hardSigmoid(MLOperand input, optional MLHardSigmoidOptions options = {});
  MLActivation hardSigmoid(optional MLHardSigmoidOptions options = {});
};

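The behavior of this operation can be generically emulated from the usage of other operations as follows. However, user agents typically have a more efficient implementation for it, therefore its usage is encouraged from the performance standpoint.

return builder.max(
           builder.min(
               builder.add(
                   builder.mul(builder.constant(options.alpha), input),
                   builder.constant(options.beta)),
               builder.constant(1)),
           builder.constant(0));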
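The clamping performed by this emulation can be checked with a plain JavaScript sketch that applies y = max(0, min(1, alpha * x + beta)) to each element of an array. The function name hardSigmoidReference is hypothetical and the sketch does not use the WebNN API.

```javascript
// Element-wise reference: y = max(0, min(1, alpha * x + beta)).
// Defaults mirror MLHardSigmoidOptions (alpha = 0.2, beta = 0.5).
function hardSigmoidReference(values, { alpha = 0.2, beta = 0.5 } = {}) {
  return Array.from(values, (x) => Math.max(0, Math.min(1, alpha * x + beta)));
}
```

Inputs at or below -beta / alpha map to 0 and inputs at or above (1 - beta) / alpha map to 1; with the defaults, those thresholds are -2.5 and 2.5.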

MLHardSigmoidOptions has the following members:

alpha, of type float, defaulting to 0.2: A float scalar multiplier. The default value is 0.2.
beta, of type float, defaulting to 0.5: A float scalar addition. The default value is 0.5.

To check hard-sigmoid options given options, run the following steps:

If options is not an object that implements MLHardSigmoidOptions, then return false.
If options.alpha is undefined, set options.alpha to 0.2.
Else if options.alpha is not a numeric type, then return false.
If options.beta is undefined, set options.beta to 0.5.
Else if options.beta is not a numeric type, then return false.
Return true.
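The check above can be sketched in JavaScript as follows. The function name checkHardSigmoidOptions is hypothetical, and "an object that implements MLHardSigmoidOptions" is approximated here by a non-null object, which is an assumption of this sketch rather than the Web IDL conversion a user agent would perform.

```javascript
// Returns false for invalid options; otherwise fills in defaults and returns true.
function checkHardSigmoidOptions(options) {
  // Approximation: any non-null object is treated as a dictionary candidate.
  if (options === null || typeof options !== 'object') return false;
  if (options.alpha === undefined) options.alpha = 0.2;
  else if (typeof options.alpha !== 'number') return false;
  if (options.beta === undefined) options.beta = 0.5;
  else if (typeof options.beta !== 'number') return false;
  return true;
}
```

Note that the steps mutate options, so a caller that passes an empty object observes the defaults afterwards.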

7.6.16.1. The `hardSigmoid(input, options)` method

Arguments:

input: an MLOperand. The input tensor.
options: an optional MLHardSigmoidOptions. The optional parameters of the operation.

Returns:

an MLOperand. The output tensor of the same shape as input.

The hardSigmoid(input, options) method steps are:

Let input be the first argument.
Let options be the second argument.
1. If running the check hard-sigmoid options steps with options returns false, then throw a " TypeError " DOMException and abort these steps.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Let output be the result of invoking the copy MLOperand steps given input.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the hard sigmoid operation, given options.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect input.[[operand]] as input to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.

7.6.16.2. The `hardSigmoid(options)` method

Arguments:

options: an optional MLHardSigmoidOptions. The optional parameters of the operation.

Returns:

an MLActivation. The activation function representing the hard sigmoid operation.

The hardSigmoid(options) method steps are:

Let options be the first argument.
1. If running the check hard-sigmoid options steps with options returns false, then throw a " TypeError " DOMException and abort these steps.
Let op be the result of invoking the create MLActivation steps with "hardSigmoid" and options.
1. If that throws an error, re-throw the error and abort these steps.
Return op.

7.6.17. The hardSwish() method

Computes the nonlinear function y = x * max(0, min(6, (x + 3))) / 6 that is introduced by [MobileNetV3] on the input tensor element-wise.

partial interface MLGraphBuilder {
  MLOperand hardSwish(MLOperand input);
  MLActivation hardSwish();
};

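The behavior of this operation can be generically emulated from the usage of other operations as follows. However, user agents typically have a more efficient implementation for it, therefore its usage is encouraged from the performance standpoint.

return builder.div(
           builder.mul(
               input,
               builder.max(
                   builder.constant(0),
                   builder.min(
                       builder.constant(6),
                       builder.add(input, builder.constant(3))))),
           builder.constant(6));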
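As a quick numerical check of the formula, the following plain JavaScript sketch evaluates hard-swish element-wise on an array. The name hardSwishReference is hypothetical and the sketch is independent of the WebNN API.

```javascript
// Element-wise reference: y = x * max(0, min(6, x + 3)) / 6.
function hardSwishReference(values) {
  return Array.from(values, (x) => (x * Math.max(0, Math.min(6, x + 3))) / 6);
}
```

The function is zero for x <= -3, equals x for x >= 3, and follows x * (x + 3) / 6 in between.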

7.6.17.1. The `hardSwish(input)` method

Arguments:

input: an MLOperand. The input tensor.

Returns:

an MLOperand. The output tensor of the same shape as input.

The hardSwish(input) method steps are:

Let input be the first argument.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Let output be the result of invoking the copy MLOperand steps given input.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the hard-swish operation.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect input.[[operand]] as input to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.

7.6.17.2. The `hardSwish()` method

Arguments:

None.

Returns:

an MLActivation. The activation function representing the hard-swish operation.

The hardSwish() method steps are:

Let op be the result of invoking the create MLActivation steps with "hardSwish".
1. If that throws an error, re-throw the error and abort these steps.
Return op.

7.6.18. The instanceNormalization() method

Normalize the input features using [Instance-Normalization]. Unlike § 7.6.5 The batchNormalization() method, where the mean and variance values used in the calculation are previously computed across the batch dimension during the model training phase, the mean and variance values used in the calculation of an instance normalization are computed internally on the fly per input feature.

dictionary MLInstanceNormalizationOptions {
  MLOperand scale;
  MLOperand bias;
  float epsilon = 1e-5;
  MLInputOperandLayout layout = "nchw";
};

partial interface MLGraphBuilder {
  MLOperand instanceNormalization(MLOperand input,
                                optional MLInstanceNormalizationOptions options = {});
};

The MLInstanceNormalizationOptions members are:

scale, of type MLOperand: An MLOperand. Specifies the 1-D tensor of the scaling values whose length is equal to the number of channels, i.e. the size of the feature dimension of the input. For example, for an input tensor with nchw layout, the length is the value of input.[[descriptor]].dimensions[1].
bias, of type MLOperand: An MLOperand. Specifies the 1-D tensor of the bias values whose length is equal to the size of the feature dimension of the input. For example, for an input tensor with nchw layout, the length is the value of input.[[descriptor]].dimensions[1].
epsilon, of type float, defaulting to 1e-5: A float scalar. Specifies a small value to prevent computational error due to divide-by-zero.
layout, of type MLInputOperandLayout, defaulting to "nchw": An MLInputOperandLayout. Specifies the layout format of the input.

Arguments:

input: an MLOperand. The input 4-D tensor.
options: an optional MLInstanceNormalizationOptions. The optional parameters of the operation.

Returns: an MLOperand. The instance-normalized 4-D tensor of the same shape as the input tensor.

The instanceNormalization(input, options) steps are:

If input is not an instance of MLOperand, then throw a " TypeError " DOMException and stop.
If the rank of input is not 4, then throw a " DataError " DOMException and stop.
If options is undefined, let options be an empty object.
If options.scale is not an instance of MLOperand, then throw a " TypeError " DOMException and stop.
If options.scale is not a 1-D tensor whose length is equal to the size of the channel dimension of input, then throw a " DataError " DOMException and stop.
If options.bias is not an instance of MLOperand, then throw a " TypeError " DOMException and stop.
If options.bias is not a 1-D tensor whose length is equal to the size of the channel dimension of input, then throw a " DataError " DOMException and stop.
If options.epsilon is undefined, let it be 0.00001.
If options.layout is undefined, let it be "nchw".
Otherwise if options.layout is not one of MLInputOperandLayout, then throw a " DataError " DOMException and stop.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Let output be the result of invoking the copy MLOperand steps given input.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the instance normalization operation, given options.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect input.[[operand]] as input to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.

The behavior of this operation when the input tensor is 4-D of the "nchw" layout can be generically emulated from the usage of other operations as follows. However, user agents typically have a more efficient implementation for it, therefore its usage is encouraged from the performance standpoint.

// The mean reductions happen over the spatial dimensions of the input
// e.g. axis 2 and 3 of the input tensor.
const reduceOptions = { axes: [2,3], keepDimensions: true };
const mean = builder.reduceMean(input, reduceOptions);
const variance = builder.reduceMean(
  builder.pow(
    builder.sub(input, mean),
    builder.constant(2)),
  reduceOptions
  );
// The scale and bias values are applied per input feature
// e.g. axis 1 of the input tensor.
const shape = [1,null,1,1];
return builder.add(
  builder.mul(
    builder.reshape(options.scale, shape),
    builder.div(
      builder.sub(input, mean),
      builder.pow(
        builder.add(variance, builder.constant(options.epsilon)),
        builder.constant(0.5))
      )
    ),
  builder.reshape(options.bias, shape)
  );
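The same computation can be cross-checked with a plain JavaScript sketch over a flat array in "nchw" order. The name instanceNormalizationReference is hypothetical, and treating a missing scale as 1 and a missing bias as 0 is an assumption of this sketch.

```javascript
// Reference arithmetic for a flat, row-major "nchw" buffer; not a WebNN call.
function instanceNormalizationReference(data, [n, c, h, w],
                                        { scale, bias, epsilon = 1e-5 } = {}) {
  const out = new Float64Array(data.length);
  const plane = h * w; // number of spatial elements per (batch, channel) pair
  for (let i = 0; i < n * c; ++i) {
    const channel = i % c;
    const start = i * plane;
    // Mean and variance are computed per instance and per channel.
    let mean = 0;
    for (let k = 0; k < plane; ++k) mean += data[start + k];
    mean /= plane;
    let variance = 0;
    for (let k = 0; k < plane; ++k) variance += (data[start + k] - mean) ** 2;
    variance /= plane;
    const s = scale ? scale[channel] : 1; // assumed default
    const b = bias ? bias[channel] : 0;   // assumed default
    const denom = Math.sqrt(variance + epsilon);
    for (let k = 0; k < plane; ++k) {
      out[start + k] = (s * (data[start + k] - mean)) / denom + b;
    }
  }
  return out;
}
```

A channel whose values are all equal has zero variance; epsilon keeps the division finite and the output for that channel reduces to the bias.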

7.6.19. The leakyRelu() method

Calculate the leaky version of rectified linear function on the input tensor element-wise. The calculation follows the expression max(0, x) + alpha * min(0, x).

dictionary MLLeakyReluOptions {
  float alpha = 0.01;
};

partial interface MLGraphBuilder {
  MLOperand leakyRelu(MLOperand input, optional MLLeakyReluOptions options = {});
  MLActivation leakyRelu(optional MLLeakyReluOptions options = {});
};

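The behavior of this operation can be generically emulated from the usage of other operations as follows. However, user agents typically have a more efficient implementation for it, therefore its usage is encouraged from the performance standpoint.

return builder.add(builder.max(builder.constant(0), input),
          builder.mul(builder.constant(options.alpha), builder.min(builder.constant(0), input)));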
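The expression can be verified element-wise with a plain JavaScript sketch. The name leakyReluReference is hypothetical; the default for alpha mirrors the MLLeakyReluOptions dictionary.

```javascript
// Element-wise reference: y = max(0, x) + alpha * min(0, x).
function leakyReluReference(values, { alpha = 0.01 } = {}) {
  return Array.from(values, (x) => Math.max(0, x) + alpha * Math.min(0, x));
}
```

Positive inputs pass through unchanged, while negative inputs are scaled by alpha, so leakyReluReference([-2]) evaluates to approximately -0.02 with the default alpha.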

MLLeakyReluOptions has the following members:

alpha, of type float, defaulting to 0.01: A float scalar multiplier. The default value is 0.01.

To check leaky-relu options given options, run the following steps:

If options is not an object that implements MLLeakyReluOptions, then return false.
If options.alpha is undefined, set options.alpha to 0.01.
Else if options.alpha is not a numeric type, then return false.
Return true.

7.6.19.1. The `leakyRelu(input, options)` method

Arguments:

input: an MLOperand. The input tensor.
options: an optional MLLeakyReluOptions. The optional parameters of the operation.

Returns:

an MLOperand. The output tensor of the same shape as input.

The leakyRelu(input, options) method steps are:

Let input be the first argument.
Let options be the second argument.
1. If options is undefined, let options be a new MLLeakyReluOptions object.
2. If running the check leaky-relu options steps with options returns false, then throw a " TypeError " DOMException and abort these steps.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Let output be the result of invoking the copy MLOperand steps given input.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the leaky relu operation, given options.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect input.[[operand]] as input to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.

7.6.19.2. The `leakyRelu(options)` method

Arguments:

options: an optional MLLeakyReluOptions. The optional parameters of the operation.

Returns:

an MLActivation. The activation function representing the leaky relu operation.

The leakyRelu(options) method steps are:

Let options be the first argument.
1. If options is undefined, let options be a new MLLeakyReluOptions object.
2. If running the check leaky-relu options steps with options returns false, then throw a " TypeError " DOMException and abort these steps.
Let op be the result of invoking the create MLActivation steps with "leakyRelu" and options.
1. If that throws an error, re-throw the error and abort these steps.
Return op.

7.6.20. The linear() method

Calculate a linear function y = alpha * x + beta on the input tensor.

dictionary MLLinearOptions {
  float alpha = 1;
  float beta = 0;
};

partial interface MLGraphBuilder {
  MLOperand linear(MLOperand input, optional MLLinearOptions options = {});
  MLActivation linear(optional MLLinearOptions options = {});
};

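The behavior of this operation can be generically emulated from the usage of other operations as follows. However, user agents typically have a more efficient implementation for it, therefore its usage is encouraged from the performance standpoint.

return builder.add(
          builder.mul(input, builder.constant(options.alpha)),
          builder.constant(options.beta));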
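A plain JavaScript sketch of the same arithmetic makes the effect of the defaults visible: with alpha = 1 and beta = 0 the operation is the identity. The name linearReference is hypothetical and the sketch does not use the WebNN API.

```javascript
// Element-wise reference: y = alpha * x + beta.
// Defaults mirror MLLinearOptions (alpha = 1, beta = 0).
function linearReference(values, { alpha = 1, beta = 0 } = {}) {
  return Array.from(values, (x) => alpha * x + beta);
}
```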

MLLinearOptions has the following members:

alpha, of type float, defaulting to 1: A float scalar multiplier. The default value is 1.
beta, of type float, defaulting to 0: A float scalar addition. The default value is 0.

To check linear options given options, run the following steps:

If options is not an object that implements MLLinearOptions, then return false.
If options.alpha is undefined, set options.alpha to 1.
Else if options.alpha is not a numeric type, then return false.
If options.beta is undefined, set options.beta to 0.
Else if options.beta is not a numeric type, then return false.
Return true.

7.6.20.1. The `linear(input, options)` method

Arguments:

input: an MLOperand. The input tensor.
options: an optional MLLinearOptions. The optional parameters of the operation.

Returns:

an MLOperand. The output tensor of the same shape as input.

The linear(input, options) method steps are:

Let input be the first argument.
Let options be the second argument.
1. If running the check linear options steps with options returns false, then throw a " TypeError " DOMException and abort these steps.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Let output be the result of invoking the copy MLOperand steps given input.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the linear operation, given options.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect input.[[operand]] as input to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.

7.6.20.2. The `linear(options)` method

Arguments:

options: an optional MLLinearOptions. The optional parameters of the operation.

Returns:

an MLActivation. The activation function representing the linear operation.

The linear(options) method steps are:

Let options be the first argument.
1. If running the check linear options steps with options returns false, then throw a " TypeError " DOMException and abort these steps.
Let op be the result of invoking the create MLActivation steps with "linear" and options.
1. If that throws an error, re-throw the error and abort these steps.
Return op.

7.6.21. The lstm() method

Long Short-Term Memory [LSTM] recurrent network uses an input, output, forget, and cell gate to compute the output state that rolls into the output across the temporal sequence of the network.

enum MLLstmWeightLayout {
  "iofg", // input-output-forget-cell gate ordering
  "ifgo"  // input-forget-cell-output gate ordering
};
dictionary MLLstmOptions {
  MLOperand bias;
  MLOperand recurrentBias;
  MLOperand peepholeWeight;
  MLOperand initialHiddenState;
  MLOperand initialCellState;
  boolean returnSequence = false;
  MLRecurrentNetworkDirection direction = "forward";
  MLLstmWeightLayout layout = "iofg";
  sequence<MLActivation> activations;
};

partial interface MLGraphBuilder {
  sequence<MLOperand> lstm(MLOperand input, MLOperand weight, MLOperand recurrentWeight,
                           unsigned long steps, unsigned long hiddenSize,
                           optional MLLstmOptions options = {});
};

MLLstmOptions has the following members:

bias, of type MLOperand: An MLOperand. Specifies the 2-D input bias tensor of shape [num_directions, 4 * hidden_size]. The ordering of the bias vectors in the second dimension of the tensor shape is specified according to layout.
recurrentBias, of type MLOperand: An MLOperand. Specifies the 2-D recurrent bias tensor of shape [num_directions, 4 * hidden_size]. The ordering of the bias vectors in the second dimension of the tensor shape is specified according to layout.
peepholeWeight, of type MLOperand: An MLOperand. Specifies the 2-D weight tensor for peepholes of shape [num_directions, 3 * hidden_size]. The pack ordering of the weight vectors is for the input (i), output (o), and forget (f) gate, respectively.
initialHiddenState, of type MLOperand: An MLOperand. Specifies the 3-D initial hidden state tensor of shape [num_directions, batch_size, hidden_size]. When not specified, implementations SHOULD use a tensor filled with zero.
initialCellState, of type MLOperand: An MLOperand. Specifies the 3-D initial cell state tensor of shape [num_directions, batch_size, hidden_size]. When not specified, implementations SHOULD use a tensor filled with zero.
returnSequence, of type boolean, defaulting to false: A boolean indicating whether to also return the entire sequence with every output from each time step in it in addition to the output of the last time step.
direction, of type MLRecurrentNetworkDirection, defaulting to "forward": An MLRecurrentNetworkDirection. Specifies the processing direction of the input sequence. When set to "both", the size of the first dimension of the weight and the bias tensor shapes must be 2, and the input is processed in both directions.
layout, of type MLLstmWeightLayout, defaulting to "iofg": An MLLstmWeightLayout. The ordering of the weight and bias vectors for the internal gates of LSTM, specifically the input (i), output (o), forget (f), and cell (g) gate, as indicated in the second dimension of the weight and bias tensor shapes. When not specified, the default layout is "iofg".
activations, of type sequence<MLActivation>: A sequence of MLActivation. A sequence of three activation functions: the first is used for the input (i), forget (f), and output (o) gate, the second is used for the cell (g) gate, and the last is used for filtering the output cell state before combining it with the result of the output gate to form the output hidden state. When not specified, implementations SHOULD use the sequence of the sigmoid function ("sigmoid") followed by two hyperbolic tangent functions ("tanh") respectively.

Arguments:

input : an MLOperand. The input 3-D tensor of shape [steps, batch_size, input_size].
weight : an MLOperand. The 3-D input weight tensor of shape [num_directions, 4 * hidden_size, input_size]. The ordering of the weight vectors in the second dimension of the tensor shape is specified according to the options.layout.
recurrentWeight : an MLOperand. The 3-D recurrent weight tensor of shape [num_directions, 4 * hidden_size, hidden_size]. The ordering of the weight vectors in the second dimension of the tensor shape is specified according to the options.layout argument.
steps : an unsigned long scalar. The number of time steps in the recurrent network. The value must be greater than 0.
hiddenSize : an unsigned long scalar. The value of the third dimension of the cell output tensor shape. It indicates the number of features in the hidden state.
options : an optional MLLstmOptions. The optional parameters of the operation.

Returns: a sequence of MLOperand. The first element of the sequence is a 3-D tensor of shape [num_directions, batch_size, hidden_size], the output hidden state from the last time step of the network. The second element is a 3-D tensor of shape [num_directions, batch_size, hidden_size], the output cell state from the last time step of the network. Additionally, if options.returnSequence is set to true, the third element is the 4-D output tensor of shape [steps, num_directions, batch_size, hidden_size] containing every output from each time step in the temporal sequence.
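The shapes of the returned operands follow directly from the arguments. The sketch below computes them in plain JavaScript; lstmOutputShapes is a hypothetical helper for illustration and is not part of the API.

```javascript
// Computes the shapes of the operands returned by lstm(), per the description above.
function lstmOutputShapes(steps, batchSize, hiddenSize,
                          { direction = 'forward', returnSequence = false } = {}) {
  // Only the "both" direction processes the sequence twice.
  const numDirections = direction === 'both' ? 2 : 1;
  const shapes = [
    [numDirections, batchSize, hiddenSize], // output hidden state, last time step
    [numDirections, batchSize, hiddenSize], // output cell state, last time step
  ];
  if (returnSequence) {
    // Every output from each time step of the temporal sequence.
    shapes.push([steps, numDirections, batchSize, hiddenSize]);
  }
  return shapes;
}
```

For example, a bidirectional network with steps = 5, batch_size = 2, hidden_size = 8 and returnSequence set to true yields the shapes [2, 2, 8], [2, 2, 8] and [5, 2, 2, 8].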

The lstm(input, weight, recurrentWeight, steps, hiddenSize, options) steps are:

If options is undefined, let options be an empty object.
If options.direction is undefined, set it to "forward".
If options.direction is not one of MLRecurrentNetworkDirection, then throw a " TypeError " DOMException and stop.
Let num_directions be 2 if options.direction is "both", or otherwise let it be 1.
If input, weight or recurrentWeight is not an instance of MLOperand, then throw a " TypeError " DOMException and stop.

The shapes of input, weight and recurrentWeight could also be checked here.
If input.[[descriptor]].dimensions [0] is not equal to steps, then throw a " DataError " DOMException and stop.
Let batch_size be input.[[descriptor]].dimensions [1].
If options.bias exists:
1. If it is not an instance of MLOperand, then throw a " TypeError " DOMException and stop.
2. If its rank is not 2, then throw a " DataError " DOMException and stop.
3. If options.bias.[[descriptor]].dimensions [0] is not num_directions, then throw a " DataError " DOMException and stop.
4. If options.bias.[[descriptor]].dimensions [1] is not 4 * hiddenSize, then throw a " DataError " DOMException and stop.
If options.recurrentBias exists:
1. If it is not an instance of MLOperand, then throw a " TypeError " DOMException and stop.
2. If its rank is not 2, then throw a " DataError " DOMException and stop.
3. If options.recurrentBias.[[descriptor]].dimensions [0] is not num_directions, then throw a " DataError " DOMException and stop.
4. If options.recurrentBias.[[descriptor]].dimensions [1] is not 4 * hiddenSize, then throw a " DataError " DOMException and stop.
If options.peepholeWeight exists:
1. If it is not an instance of MLOperand, then throw a " TypeError " DOMException and stop.
2. If its rank is not 2, then throw a " DataError " DOMException and stop.
3. If options.peepholeWeight.[[descriptor]].dimensions [0] is not num_directions, then throw a " DataError " DOMException and stop.
4. If options.peepholeWeight.[[descriptor]].dimensions [1] is not 3 * hiddenSize, then throw a " DataError " DOMException and stop.
If options.initialHiddenState exists:
1. If it is not an instance of MLOperand, then throw a " TypeError " DOMException and stop.
2. If its rank is not 3, then throw a " DataError " DOMException and stop.
3. If options.initialHiddenState.[[descriptor]].dimensions [0] is not num_directions, then throw a " DataError " DOMException and stop.
4. If options.initialHiddenState.[[descriptor]].dimensions [1] is not equal to batch_size, then throw a " DataError " DOMException and stop.
5. If options.initialHiddenState.[[descriptor]].dimensions [2] is not hiddenSize, then throw a " DataError " DOMException and stop.
If options.initialCellState exists:
1. If it is not an instance of MLOperand, then throw a " TypeError " DOMException and stop.
2. If its rank is not 3, then throw a " DataError " DOMException and stop.
3. If options.initialCellState.[[descriptor]].dimensions [0] is not num_directions, then throw a " DataError " DOMException and stop.
4. If options.initialCellState.[[descriptor]].dimensions [1] is not equal to batch_size, then throw a " DataError " DOMException and stop.
5. If options.initialCellState.[[descriptor]].dimensions [2] is not hiddenSize, then throw a " DataError " DOMException and stop.
If options.returnSequence is undefined, set it to false.
If options.layout is undefined, set it to "iofg".
If options.layout is not one of MLLstmWeightLayout, then throw a " TypeError " DOMException and stop.
If options.activations exists:
1. If it is not an array of size 3, then throw a " TypeError " DOMException and stop.
2. If any of its elements is not an instance of MLActivation, then throw a " TypeError " DOMException and stop.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Let desc be a new MLOperandDescriptor.
2. Set desc.dimensions to [ num_directions, batch_size, hiddenSize ].
3. Set desc.type to input.[[descriptor]].type.
4. Let output0 be the result of invoking the create MLOperand steps given this and desc.
5. Let output1 be the result of invoking the create MLOperand steps given this and desc.
6. Set desc.dimensions to [ steps, num_directions, batch_size, hiddenSize ].
7. Let output2 be the result of invoking the create MLOperand steps given this and desc.
8. Let output be the array [ output0, output1, output2 ].
9. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the LSTM operation, given weight, recurrentWeight, steps, hiddenSize and options.
  2. Store a reference of opImpl in output0.[[operator]], output1.[[operator]] and output2.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output0.[[operand]], output1.[[operand]] and output2.[[operand]].
10. Connect input.[[operand]] as input to opImpl.
11. Connect output as output to opImpl.
Return output.
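The shape checks on the optional members can be sketched as a small table-driven helper. findLstmShapeError is a hypothetical name, shapes are passed as plain arrays of dimensions rather than MLOperand objects, and peepholeWeight is deliberately left out of this sketch.

```javascript
// True when two dimension arrays have the same rank and the same sizes.
const sameShape = (a, b) => a.length === b.length && a.every((d, i) => d === b[i]);

// Returns the name of the first optional member whose shape is wrong, or null.
function findLstmShapeError(shapes, numDirections, batchSize, hiddenSize) {
  const expected = {
    bias: [numDirections, 4 * hiddenSize],
    recurrentBias: [numDirections, 4 * hiddenSize],
    initialHiddenState: [numDirections, batchSize, hiddenSize],
    initialCellState: [numDirections, batchSize, hiddenSize],
  };
  for (const [name, want] of Object.entries(expected)) {
    // Members that do not exist are skipped, as in the steps above.
    if (shapes[name] !== undefined && !sameShape(shapes[name], want)) return name;
  }
  return null;
}
```

In the steps above, a non-null result corresponds to throwing a " DataError " DOMException for the offending member.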

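The behavior of this operation can be generically emulated from the usage of other operations as follows. However, user agents typically have a more efficient implementation for it, therefore its usage is encouraged from the performance standpoint.

const numDirections = (options.direction == "both" ? 2 : 1);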
let hiddenState = options.initialHiddenState;
let cellState = options.initialCellState;
if (!hiddenState) {
  const desc = { type: 'float32', dimensions: [numDirections, 1, hiddenSize] };
  const totalSize = numDirections * hiddenSize;
  hiddenState = builder.constant(desc, new Float32Array(totalSize).fill(0));
}
if (!cellState) {
  const desc = { type: 'float32', dimensions: [numDirections, 1, hiddenSize] };
  const totalSize = numDirections * hiddenSize;
  cellState = builder.constant(desc, new Float32Array(totalSize).fill(0));
}
let sequence = null;
let currentWeight = [];
let currentRecurrentWeight = [];
let currentBias = [];
let currentRecurrentBias = [];
let currentPeepholeWeight = [];
for (let dir = 0; dir < numDirections; ++dir) {
  currentWeight.push(builder.squeeze(builder.slice(weight, [dir, 0, 0], [1, 4 * hidden_size, input_size]), { axes: [0] }));
  currentRecurrentWeight.push(builder.squeeze(builder.slice(recurrentWeight, [dir, 0, 0], [1, 4 * hidden_size, hidden_size]), { axes: [0] }));
  currentBias.push(options.bias ? (builder.squeeze(builder.slice(options.bias, [dir, 0], [1, 4 * hidden_size]), { axes: [0] })) : null);
  currentRecurrentBias.push(options.recurrentBias ?
    (builder.squeeze(builder.slice(options.recurrentBias, [dir, 0], [1, 4 * hidden_size]), { axes: [0] })) : null);
  currentPeepholeWeight.push(options.peepholeWeight ?
    (builder.squeeze(builder.slice(options.peepholeWeight, [dir, 0], [1, 3 * hidden_size]), { axes: [0] })) : null);
}
for (let step = 0; step < steps; ++step) {
  let currentHidden = [];
  let currentCell = [];
  let nextHidden = null;
  let nextCell = null;
  for (let dir = 0; dir < numDirections; ++dir) {
    currentHidden.push(builder.squeeze(builder.slice(hiddenState, [dir, 0, 0], [1, batch_size, hidden_size]), { axes: [0] }));
    currentCell.push(builder.squeeze(builder.slice(cellState, [dir, 0, 0], [1, batch_size, hidden_size]), { axes: [0] }));
  }
  for (let dir = 0; dir < numDirections; ++dir) {
    let slice = (dir == 1 || options.direction == "backward" ? steps - step - 1 : step);
    let currentInput = builder.squeeze(builder.slice(input, [slice, 0, 0], [1, batch_size, input_size]), { axes: [0] });
    let results = builder.lstmCell(
      currentInput, currentWeight[dir], currentRecurrentWeight[dir],
      currentHidden[dir], currentCell[dir], hiddenSize, { bias: currentBias[dir],
      recurrentBias: currentRecurrentBias[dir], peepholeWeight: currentPeepholeWeight[dir],
      layout: options.layout, activations: options.activations });
    let output = builder.reshape(results[0], [1, null, hiddenSize]);
    let cell = builder.reshape(results[1], [1, null, hiddenSize]);
    nextHidden = (nextHidden ? builder.concat([nextHidden, output], 0) : output);
    nextCell = (nextCell ? builder.concat([nextCell, cell], 0) : cell);
  }
  hiddenState = nextHidden;
  cellState = nextCell;
  if (options.returnSequence) {
    nextHidden = builder.reshape(nextHidden, [1, numDirections, null, hiddenSize]);
    sequence = (sequence ? builder.concat([sequence, nextHidden], 0) : nextHidden);
  }
}
return (sequence ? [hiddenState, cellState, sequence] : [hiddenState, cellState]);

7.6.22. The lstmCell() method

A single time step of the Long Short-Term Memory [LSTM] recurrent network using a cell state, an input, output, and forget gate to compute the cell state and the hidden state of the next time step that rolls into the output across the temporal sequence of the network.

dictionary MLLstmCellOptions {
  MLOperand bias;
  MLOperand recurrentBias;
  MLOperand peepholeWeight;
  MLLstmWeightLayout layout = "iofg";
  sequence<MLActivation> activations;
};

partial interface MLGraphBuilder {
  sequence<MLOperand> lstmCell(MLOperand input, MLOperand weight, MLOperand recurrentWeight,
                               MLOperand hiddenState, MLOperand cellState, unsigned long hiddenSize,
                               optional MLLstmCellOptions options = {});
};

MLLstmCellOptions has the following members:

bias, of type MLOperand: An MLOperand. The 1-D input bias tensor of shape [4 * hidden_size]. The ordering of the bias vectors in the first dimension of the tensor shape is specified according to the layout argument.
recurrentBias, of type MLOperand: An MLOperand. The 1-D recurrent bias tensor of shape [4 * hidden_size]. The ordering of the bias vectors in the first dimension of the tensor shape is specified according to the layout argument.
peepholeWeight, of type MLOperand: An MLOperand. The 1-D weight tensor for peepholes of shape [3 * hidden_size]. The pack ordering of the weight vectors is for the input (i), output (o), and forget (f) gate, respectively.
layout, of type MLLstmWeightLayout, defaulting to "iofg": An MLLstmWeightLayout. The ordering of the weight and bias vectors for the internal gates of LSTM, specifically the input (i), output (o), forget (f), and cell (g) gate, as indicated in the first dimension of the weight and bias tensor shapes. When not specified, the default layout is "iofg".
activations, of type sequence<MLActivation>: A sequence of three MLActivation activation functions. The first one is used for the input (i), forget (f), and output (o) gate, the second one is used for the cell (g) gate, and the last is used for filtering the output cell state before combining it with the result of the output gate to form the output hidden state. When not specified, they are assumed to be the sigmoid function ("sigmoid") followed by two hyperbolic tangent functions ("tanh") respectively.

Arguments:

input : an MLOperand. The input 2-D tensor of shape [batch_size, input_size].
weight : an MLOperand. The 2-D input weight tensor of shape [4 * hidden_size, input_size]. The ordering of the weight vectors in the first dimension of the tensor shape is specified according to the options.layout argument.
recurrentWeight : an MLOperand. The 2-D recurrent weight tensor of shape [4 * hidden_size, hidden_size]. The ordering of the weight vectors in the first dimension of the tensor shape is specified according to the options.layout argument.
hiddenState : an MLOperand. The 2-D input hidden state tensor of shape [batch_size, hidden_size].
cellState : an MLOperand. The 2-D input cell state tensor of shape [batch_size, hidden_size].
hiddenSize : an unsigned long scalar. The value of the second dimension of the output tensor shape. It indicates the number of features in the hidden state.
options : an optional MLLstmCellOptions. The optional parameters of the operation.

Returns: a sequence of MLOperand. The first element of the sequence is the output hidden state of the current time step of the recurrent network. The following element is the output cell state. Both elements are 2-D tensors of shape [batch_size, hidden_size].

The lstmCell(input, weight, recurrentWeight, hiddenState, cellState, hiddenSize, options) steps are:

If input, weight, recurrentWeight, hiddenState or cellState is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If the rank of input, weight, recurrentWeight, hiddenState or cellState is not 2, then throw a "DataError" DOMException and stop.
Let batch_size be input.[[descriptor]].dimensions[0].
If options is undefined, let options be an empty object.
If options.bias exists:
1. If it is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
2. If its rank is not 1, then throw a "DataError" DOMException and stop.
3. If options.bias.[[descriptor]].dimensions[0] is not 4 * hiddenSize, then throw a "DataError" DOMException and stop.
If options.recurrentBias exists:
1. If it is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
2. If its rank is not 1, then throw a "DataError" DOMException and stop.
3. If options.recurrentBias.[[descriptor]].dimensions[0] is not 4 * hiddenSize, then throw a "DataError" DOMException and stop.
If options.peepholeWeight exists:
1. If it is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
2. If its rank is not 1, then throw a "DataError" DOMException and stop.
3. If options.peepholeWeight.[[descriptor]].dimensions[0] is not 3 * hiddenSize, then throw a "DataError" DOMException and stop.
If options.layout is undefined, set it to "iofg".
If options.layout is not one of MLLstmWeightLayout, then throw a "TypeError" DOMException and stop.
If options.activations exists:
1. If it is not an array of size 3, then throw a "TypeError" DOMException and stop.
2. If any of its elements is not an instance of MLActivation, then throw a "TypeError" DOMException and stop.
Let desc be a new MLOperandDescriptor.
Set desc.dimensions to [ batch_size, hiddenSize ].
Set desc.type to input.[[descriptor]].type.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
1. Let output0 be the result of invoking the create MLOperand steps given this and desc.
2. Let output1 be the result of invoking the create MLOperand steps given this and desc.
3. Let output be the array [ output0, output1 ].
4. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the LSTM cell operation, given weight, recurrentWeight, hiddenState, cellState, hiddenSize and options.
  2. Store a reference of opImpl in output0.[[operator]] and output1.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output0.[[operand]] and output1.[[operand]].
5. Connect input.[[operand]] as input to opImpl.
6. Connect output as output to opImpl.
Return output.

The behavior of this operation can be generically emulated via other operations as shown below, when the weight layout is the default "iofg" layout, and the activation functions of the input/forget/output gate and the cell gate/the cell state’s filter for the output hidden state are of the operator types sigmoid and tanh respectively.

const zero = builder.constant(0);
// input gate (i)
let i = builder.sigmoid(
  builder.add(
    builder.mul(
      cellState,
      (options.peepholeWeight ? builder.slice(options.peepholeWeight, [0], [hiddenSize]) : zero)
    ),
    builder.add(
      builder.add(
        (options.bias ? builder.slice(options.bias, [0], [hiddenSize]) : zero),
        (options.recurrentBias ? builder.slice(options.recurrentBias, [0], [hiddenSize]) : zero)
      ),
      builder.add(
        builder.matmul(
          input,
          builder.transpose(builder.slice(weight, [0, 0], [hiddenSize, input_size]))
        ),
        builder.matmul(
          hiddenState,
          builder.transpose(builder.slice(recurrentWeight, [0, 0], [hiddenSize, hidden_size]))
        )
      )
    )
  )
);
// forget gate (f)
let f = builder.sigmoid(
  builder.add(
    builder.mul(
      cellState,
      (options.peepholeWeight ? builder.slice(options.peepholeWeight, [2 * hiddenSize], [hiddenSize]) : zero)
    ),
    builder.add(
      builder.add(
        (options.bias ? builder.slice(options.bias, [2 * hiddenSize], [hiddenSize]) : zero),
        (options.recurrentBias ? builder.slice(options.recurrentBias, [2 * hiddenSize], [hiddenSize]) : zero)
      ),
      builder.add(
        builder.matmul(
          input,
          builder.transpose(builder.slice(weight, [2 * hiddenSize, 0], [hiddenSize, input_size]))
        ),
        builder.matmul(
          hiddenState,
          builder.transpose(builder.slice(recurrentWeight, [2 * hiddenSize, 0], [hiddenSize, hidden_size]))
        )
      )
    )
  )
);
// cell gate (g)
let g = builder.tanh(
  builder.add(
    builder.add(
      (options.bias ? builder.slice(options.bias, [3 * hiddenSize], [hiddenSize]) : zero),
      (options.recurrentBias ? builder.slice(options.recurrentBias, [3 * hiddenSize], [hiddenSize]) : zero)
    ),
    builder.add(
      builder.matmul(
        input,
        builder.transpose(builder.slice(weight, [3 * hiddenSize, 0], [hiddenSize, input_size]))
      ),
      builder.matmul(
        hiddenState,
        builder.transpose(builder.slice(recurrentWeight, [3 * hiddenSize, 0], [hiddenSize, hidden_size]))
      )
    )
  )
);
// output gate (o)
let o = builder.sigmoid(
  builder.add(
    builder.mul(
      cellState,
      (options.peepholeWeight ? builder.slice(options.peepholeWeight, [hiddenSize], [hiddenSize]) : zero)
    ),
    builder.add(
      builder.add(
        (options.bias ? builder.slice(options.bias, [hiddenSize], [hiddenSize]) : zero),
        (options.recurrentBias ? builder.slice(options.recurrentBias, [hiddenSize], [hiddenSize]) : zero)
      ),
      builder.add(
        builder.matmul(
          input,
          builder.transpose(builder.slice(weight, [hiddenSize, 0], [hiddenSize, input_size]))
        ),
        builder.matmul(
          hiddenState,
          builder.transpose(builder.slice(recurrentWeight, [hiddenSize, 0], [hiddenSize, hidden_size]))
        )
      )
    )
  )
);
// output cell state (ct)
let ct = builder.add(builder.mul(f, cellState), builder.mul(i, g));
// output hidden state (ht)
let ht = builder.mul(o, builder.tanh(ct));
return [ht, ct];
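
To make the gate arithmetic concrete, the following is a minimal plain-JavaScript sketch of the same computation on ordinary arrays, for a single batch row. It is not part of the API: lstmCellSketch and its array-based arguments are hypothetical names chosen for illustration, and the sketch assumes the default "iofg" layout with sigmoid and tanh activations.

```javascript
// Illustrative only: plain-array LSTM cell for one batch row, "iofg" layout.
// x: [inputSize], h and c: [hiddenSize],
// weight: [4 * hiddenSize][inputSize], recurrentWeight: [4 * hiddenSize][hiddenSize].
const sigmoid = (value) => 1 / (1 + Math.exp(-value));

function lstmCellSketch(x, weight, recurrentWeight, h, c, hiddenSize, options = {}) {
  const { bias, recurrentBias, peepholeWeight } = options;
  // Row offsets of each gate inside the packed weight and bias tensors.
  const offset = { i: 0, o: hiddenSize, f: 2 * hiddenSize, g: 3 * hiddenSize };
  // Peephole weights are packed as input, output, forget (no cell gate entry).
  const peepholeOffset = { i: 0, o: hiddenSize, f: 2 * hiddenSize };
  const dot = (row, vector) => row.reduce((sum, w, k) => sum + w * vector[k], 0);

  // Pre-activation value of one gate for hidden unit j.
  const preActivation = (gate, j) => {
    const row = offset[gate] + j;
    let sum = dot(weight[row], x) + dot(recurrentWeight[row], h);
    if (bias) sum += bias[row];
    if (recurrentBias) sum += recurrentBias[row];
    if (peepholeWeight && gate !== 'g') {
      sum += peepholeWeight[peepholeOffset[gate] + j] * c[j];
    }
    return sum;
  };

  const ht = [];
  const ct = [];
  for (let j = 0; j < hiddenSize; ++j) {
    const i = sigmoid(preActivation('i', j));
    const f = sigmoid(preActivation('f', j));
    const g = Math.tanh(preActivation('g', j));
    const o = sigmoid(preActivation('o', j));
    ct[j] = f * c[j] + i * g;          // output cell state
    ht[j] = o * Math.tanh(ct[j]);      // output hidden state
  }
  return [ht, ct];
}

// Usage: with all-zero weights every sigmoid gate is 0.5 and g is 0,
// so ct = 0.5 * 2 = 1 and ht = 0.5 * tanh(1).
const exampleWeight = [[0, 0], [0, 0], [0, 0], [0, 0]];
const exampleRecurrentWeight = [[0], [0], [0], [0]];
const [exampleHt, exampleCt] =
  lstmCellSketch([1, 2], exampleWeight, exampleRecurrentWeight, [0.5], [2], 1);
console.log(exampleHt, exampleCt);
```

The peephole term for the output gate uses the incoming cell state, mirroring the emulation in this section rather than variants that use the updated cell state.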

7.6.23. The matmul() method

Compute the matrix product of two input tensors.

partial interface MLGraphBuilder {
  MLOperand matmul(MLOperand a, MLOperand b);
};

Arguments:

a : an MLOperand. The first N-dimensional input tensor.
b : an MLOperand. The second N-dimensional input tensor.

Returns: an MLOperand. The output tensor that contains the matrix product of two input tensors.

Computes the matrix product of two input tensors as follows:

If both a and b are 2-dimensional, they are multiplied like conventional matrices and produce a 2-dimensional tensor as the output.
If either a or b is N-dimensional where N > 2, it is treated as a stack of matrices with dimensions corresponding to the last two indices. The matrix multiplication will be broadcast accordingly by following the [numpy-broadcasting-rule]. The output is an N-dimensional tensor whose rank is the maximum rank of the input tensors. For each dimension, except the last two, of the output tensor, its size is the maximum size along that dimension of the input tensors.
If a is 1-dimensional, it is converted to a 2-dimensional tensor by prepending a 1 to its dimensions.
If b is 1-dimensional, it is converted to a 2-dimensional tensor by appending a 1 to its dimensions.
If both a and b are 1-dimensional, the operation is a vector dot-product, which produces a scalar output.
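
The shape rules above can be sketched in plain JavaScript as follows. This is an illustration under stated assumptions, not the normative algorithm: matmulOutputShape and broadcastBatch are hypothetical helper names, the batch dimensions are broadcast NumPy-style (right-aligned, sizes equal or 1), a dimension of size 1 added to a 1-dimensional input is kept in the result, and the dot-product case is reported as [1].

```javascript
// Broadcast two batch-shape prefixes: right-aligned, sizes must match or be 1.
function broadcastBatch(batchA, batchB) {
  const rank = Math.max(batchA.length, batchB.length);
  const out = new Array(rank);
  for (let i = 0; i < rank; ++i) {
    // Missing leading dimensions behave like size 1.
    const da = batchA[batchA.length - rank + i] ?? 1;
    const db = batchB[batchB.length - rank + i] ?? 1;
    if (da !== db && da !== 1 && db !== 1) {
      throw new Error(`Cannot broadcast ${da} with ${db}`);
    }
    out[i] = Math.max(da, db);
  }
  return out;
}

function matmulOutputShape(shapeA, shapeB) {
  if (shapeA.length === 0 || shapeB.length === 0) {
    throw new Error('Inputs must have at least one dimension');
  }
  // Both 1-dimensional: vector dot-product.
  if (shapeA.length === 1 && shapeB.length === 1) {
    if (shapeA[0] !== shapeB[0]) throw new Error('Vector lengths differ');
    return [1];
  }
  // Promote a 1-dimensional operand to a matrix.
  const a = shapeA.length === 1 ? [1, ...shapeA] : shapeA;
  const b = shapeB.length === 1 ? [...shapeB, 1] : shapeB;
  const [m, innerA] = a.slice(-2);
  const [innerB, n] = b.slice(-2);
  if (innerA !== innerB) {
    throw new Error(`Inner dimensions differ: ${innerA} vs ${innerB}`);
  }
  return [...broadcastBatch(a.slice(0, -2), b.slice(0, -2)), m, n];
}

console.log(matmulOutputShape([2, 3], [3, 4]));          // [ 2, 4 ]
console.log(matmulOutputShape([5, 2, 3], [3, 4]));       // [ 5, 2, 4 ]
console.log(matmulOutputShape([2, 1, 4, 5], [3, 5, 6])); // [ 2, 3, 4, 6 ]
```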

To calculate matmul output sizes, given a and b, run the following steps:

Let shapeA be a.[[descriptor]].dimensions and sizeA the size of shapeA.
Let shapeB be b.[[descriptor]].dimensions and sizeB the size of shapeB.
If sizeA and sizeB are both 1, return [ 1 ].
If sizeA is 1 and sizeB is not, then insert 1 in the front of shapeA to become [ 1 | shapeA ] and let sizeA be 2.
If sizeB is 1 and sizeA is not, then append 1 to shapeB to become [ shapeB | 1 ] and let sizeB be 2.
Let size be the maximum of sizeA and sizeB, and let shape be an array of that size.
For each index between 0 and size:
1. Set shape[index] to the maximum of shapeA[index] and shapeB[index].
Return shape.

The matmul(a, b) steps are:

If a or b is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
Let desc be a new MLOperandDescriptor.
Set desc.dimensions to the result of invoking the calculate matmul output sizes given a and b.
Set desc.type to a.[[descriptor]].type.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
1. Let output be the result of invoking the create MLOperand steps given this and desc.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the matrix multiplication operation.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect a.[[operand]] and b.[[operand]] as inputs to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.

7.6.24. The pad() method

Inflate the tensor with constant or mirrored values on the edges.

enum MLPaddingMode {
  "constant",
  "edge",
  "reflection",
  "symmetric"
};
dictionary MLPadOptions {
  MLPaddingMode mode = "constant";
  float value = 0;
};

partial interface MLGraphBuilder {
  MLOperand pad(MLOperand input,
                sequence<unsigned long> beginningPadding,
                sequence<unsigned long> endingPadding,
                optional MLPadOptions options = {});
};

MLPadOptions has the following members:

mode, of type MLPaddingMode, defaulting to "constant": An MLPaddingMode string. Specifies the different ways to pad the tensor. The default value is "constant".
value, of type float, defaulting to 0: A float. Specifies the padding value when mode is set to "constant". The default value is 0.

Arguments:

input : an MLOperand. The input tensor.
beginningPadding : a sequence of unsigned long. The sequence of unsigned integer values indicating the number of padding values to add at the beginning of each input dimension, of length N where N is the rank of the input tensor. For each dimension d of input , beginningPadding[d] indicates how many values to add before the content in that dimension.
endingPadding : a sequence of unsigned long. The sequence of unsigned integer values indicating the number of padding values to add at the ending of each input dimension, of length N where N is the rank of the input tensor. For each dimension d of input , endingPadding[d] indicates how many values to add after the content in that dimension.
options : an optional MLPadOptions. The optional parameters of the operation.

Returns: an MLOperand. The padded output tensor. Each dimension of the output tensor can be calculated as follows:

output size = beginning padding + input size + ending padding

To calculate padding output sizes, given input, beginningPadding and endingPadding, run the following steps:

Let shape be a copy of input.[[descriptor]].dimensions.
For index between 0 and the rank of shape:
1. Add to shape[index] the value of beginningPadding[index].
2. Add to shape[index] the value of endingPadding[index].
Return shape.
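
As a small illustration of the size calculation, the steps can be written in plain JavaScript over an array of dimensions. The function name paddingOutputSizes is hypothetical, and the length check is an added assumption based on the requirement that both padding sequences have one entry per input dimension.

```javascript
// Illustrative only: computes the padded shape from a plain array of dimensions.
function paddingOutputSizes(inputShape, beginningPadding, endingPadding) {
  if (beginningPadding.length !== inputShape.length ||
      endingPadding.length !== inputShape.length) {
    throw new TypeError('Padding sequences must match the rank of the input');
  }
  // output size = beginning padding + input size + ending padding
  return inputShape.map((size, d) => beginningPadding[d] + size + endingPadding[d]);
}

console.log(paddingOutputSizes([2, 3], [1, 2], [1, 2])); // [ 4, 7 ]
```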

The pad(input, beginningPadding, endingPadding, options) steps are:

If input is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If beginningPadding or endingPadding is not a sequence of unsigned long, then throw a "TypeError" DOMException and stop.
If options is undefined, let options be an empty object.
If options.mode is undefined, set it to "constant".
1. Otherwise, if options.mode is not one of MLPaddingMode, then throw a "TypeError" DOMException and stop.
If options.value is undefined, set it to 0.
Let desc be a copy of input.[[descriptor]].
Set desc.dimensions to the result of invoking the calculate padding output sizes given input, beginningPadding and endingPadding.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
1. Let output be the result of invoking the create MLOperand steps given this and desc.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the padding operation, given beginningPadding, endingPadding and options.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect input.[[operand]] as input to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.

Examples for constant, edge, reflection and symmetric padding:

// input: [[1,2,3], [4,5,6]]
const input = builder.constant(
  { type: 'float32', dimensions: [2,3] }, new Float32Array([1,2,3,4,5,6]));
const beginningPadding = [1,2];
const endingPadding = [1,2];
// "constant" padded:
//    [[0,0,0,0,0,0,0],
//     [0,0,1,2,3,0,0],
//     [0,0,4,5,6,0,0],
//     [0,0,0,0,0,0,0]]
builder.pad(input, beginningPadding, endingPadding);
// "edge" padded:
//    [[1,1,1,2,3,3,3],
//     [1,1,1,2,3,3,3],
//     [4,4,4,5,6,6,6],
//     [4,4,4,5,6,6,6]]
builder.pad(input, beginningPadding, endingPadding, { mode: "edge" });
// "reflection" padded:
//    [[6,5,4,5,6,5,4],
//     [3,2,1,2,3,2,1],
//     [6,5,4,5,6,5,4],
//     [3,2,1,2,3,2,1]]
builder.pad(input, beginningPadding, endingPadding, { mode: "reflection" });
// "symmetric" padded:
//    [[2,1,1,2,3,3,2],
//     [2,1,1,2,3,3,2],
//     [5,4,4,5,6,6,5],
//     [5,4,4,5,6,6,5]]
builder.pad(input, beginningPadding, endingPadding, { mode: "symmetric" });
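
The four padded results in the comments above can be reproduced with a short reference sketch on plain nested arrays. This is only an illustration: sourceIndex and pad2dSketch are hypothetical names, the sketch handles 2-D inputs only, and it assumes each padding amount is no larger than the input size along that axis (strictly smaller for "reflection").

```javascript
// Map an output index along one axis back to an input index.
// Returns -1 when the position is filled with the constant value.
function sourceIndex(outIndex, begin, size, mode) {
  const i = outIndex - begin;
  if (i >= 0 && i < size) return i;
  switch (mode) {
    case 'constant': return -1;
    case 'edge': return i < 0 ? 0 : size - 1;                      // repeat border value
    case 'reflection': return i < 0 ? -i : 2 * (size - 1) - i;     // mirror, border not repeated
    case 'symmetric': return i < 0 ? -i - 1 : 2 * size - 1 - i;    // mirror, border repeated
    default: throw new TypeError(`Unknown padding mode: ${mode}`);
  }
}

function pad2dSketch(input, beginningPadding, endingPadding, options = {}) {
  const mode = options.mode ?? 'constant';
  const value = options.value ?? 0;
  const rows = input.length;
  const cols = input[0].length;
  const outRows = beginningPadding[0] + rows + endingPadding[0];
  const outCols = beginningPadding[1] + cols + endingPadding[1];
  const output = [];
  for (let r = 0; r < outRows; ++r) {
    const sr = sourceIndex(r, beginningPadding[0], rows, mode);
    const row = [];
    for (let c = 0; c < outCols; ++c) {
      const sc = sourceIndex(c, beginningPadding[1], cols, mode);
      row.push(sr < 0 || sc < 0 ? value : input[sr][sc]);
    }
    output.push(row);
  }
  return output;
}

const sketchInput = [[1, 2, 3], [4, 5, 6]];
for (const mode of ['constant', 'edge', 'reflection', 'symmetric']) {
  console.log(mode, JSON.stringify(pad2dSketch(sketchInput, [1, 2], [1, 2], { mode })));
}
```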

7.6.25. Pooling operations

Compute a mean, L2 norm, or max reduction operation across all the elements within the moving window over the input tensor. See the description of each type of reduction in § 7.6.27 Reduction operations.

enum MLRoundingType {
  "floor",
  "ceil"
};
dictionary MLPool2dOptions {
  sequence<unsigned long> windowDimensions;
  sequence<unsigned long> padding;
  sequence<unsigned long> strides;
  sequence<unsigned long> dilations;
  MLAutoPad autoPad = "explicit";
  MLInputOperandLayout layout = "nchw";
  MLRoundingType roundingType = "floor";
  sequence<unsigned long> outputSizes;
};

partial interface MLGraphBuilder {
  MLOperand averagePool2d(MLOperand input, optional MLPool2dOptions options = {});
  MLOperand l2Pool2d(MLOperand input, optional MLPool2dOptions options = {});
  MLOperand maxPool2d(MLOperand input, optional MLPool2dOptions options = {});
};

Arguments:

input : an MLOperand. The input 4-D tensor. The logical shape is interpreted according to the value of options.layout.
options : an optional MLPool2dOptions. The optional parameters of the operation.

Returns: an MLOperand. The output 4-D tensor that contains the result of the reduction. The logical shape is interpreted according to the value of options.layout. More specifically, if options.roundingType is "floor", the spatial dimensions of the output tensor can be calculated as follows:

output size = floor(1 + (input size - filter size + beginning padding + ending padding) / stride)

or if options.roundingType is "ceil":

output size = ceil(1 + (input size - filter size + beginning padding + ending padding) / stride)
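
The two formulas differ only in the rounding function, which a small sketch makes visible. The function name pool2dOutputSize and its positional parameters are hypothetical, and the sketch covers a single spatial dimension exactly as written in the formulas above (dilations do not appear in them and are not modeled here).

```javascript
// Illustrative only: output size of one spatial dimension for a pooling window.
function pool2dOutputSize(inputSize, windowSize, beginningPadding, endingPadding,
                          stride, roundingType = 'floor') {
  const raw = 1 + (inputSize - windowSize + beginningPadding + endingPadding) / stride;
  return roundingType === 'ceil' ? Math.ceil(raw) : Math.floor(raw);
}

// An 8-wide input with a 3-wide window and stride 2 gives 1 + 5 / 2 = 3.5.
console.log(pool2dOutputSize(8, 3, 0, 0, 2, 'floor')); // 3
console.log(pool2dOutputSize(8, 3, 0, 0, 2, 'ceil'));  // 4
// A window as large as the input (global pooling) yields a single output value.
console.log(pool2dOutputSize(4, 4, 0, 0, 1));          // 1
```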

A global pooling operation, such as global max pooling, is a variant of pooling where the window dimensions are the spatial dimensions (last two dimensions) of the input shape, as follows.

// 'global' max pooling
builder.maxPool2d(input);

MLPool2dOptions has the following members:

windowDimensions, of type sequence<unsigned long>

A sequence of unsigned long of length 2: [window_height, window_width]. Specifies the dimensions of the sliding window. The default value for the window dimensions is the height and width dimensions of the input shape.

padding, of type sequence<unsigned long>

A sequence of unsigned long of length 4: [beginning_height, ending_height, beginning_width, ending_width]. Specifies the additional rows and columns added to the beginning and ending of each spatial dimension of the convolution input. The default value is [0,0,0,0].

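strides, of type sequence<unsigned long>

A sequence of unsigned long of length 2: [stride_height, stride_width]. Specifies the stride of the sliding window for each spatial dimension of the input. The default value is [1,1].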

dilations, of type sequence<unsigned long>

A sequence of unsigned long of length 2: [dilation_height, dilation_width]. Specifies the dilation factor for each spatial dimension applied on the convolution filter (kernel). The default value is [1,1].

autoPad, of type MLAutoPad, defaulting to "explicit"

An MLAutoPad string. Specifies the automatic input padding options. The default value is "explicit", which means that the values in the padding array should be used for input padding. When the option is set other than "explicit", the values in the padding array are ignored.

The "same-lower" option is similar but padding is applied to the beginning padding of the spatial input dimensions instead of the ending one.

layout, of type MLInputOperandLayout, defaulting to "nchw"

An MLInputOperandLayout string. Specifies the layout format of the input and output tensor as follows:

"nchw":
- input tensor: [batches, input_channels, height, width]
- output tensor: [batches, output_channels, height, width]
"nhwc":
- input tensor: [batches, height, width, input_channels]
- output tensor: [batches, height, width, output_channels]

The default value is "nchw".

roundingType, of type MLRoundingType, defaulting to "floor"

An MLRoundingType string. Specifies the rounding function used to compute the output shape.

outputSizes, of type sequence<unsigned long>

A sequence of unsigned long of length 2. Specifies the sizes of the two spatial dimensions of the output tensor. When the output sizes are explicitly specified, the roundingType is ignored.

If not specified, the output sizes are automatically computed.

To create pooling operation given op, input and options, run the following steps:

Assert: op is one of "averagePool2d", "l2Pool2d", "maxPool2d".
If input is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
If options is undefined, let options be a new MLPool2dOptions object.
If options.outputSizes exists, or if options.padding is undefined, set options.padding to [0, 0, 0, 0].
If options.strides is undefined, set options.strides to [1, 1].
If options.dilations is undefined, set options.dilations to [1, 1].
If options.autoPad is undefined, set options.autoPad to "explicit".
If options.autoPad is not "explicit", set options.padding to [0, 0, 0, 0].
If options.layout is undefined, set options.layout to "nchw".
If options.roundingType is undefined, set options.roundingType to "floor".
Let desc be a copy of input.[[descriptor]].
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
1. Make a request to the underlying platform to:
  1. Calculate the output dimensions given input and options. Let desc.dimensions be the result of that.
  2. Let output be the result of invoking the create MLOperand steps given this and desc.
  3. Let opImpl be an implementation-defined platform operator for the op pooling operation, given options.
  4. Store a reference of opImpl in output.[[operator]].
  5. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  6. Store a reference to outputImpl in output.[[operand]].
2. Connect input.[[operand]] as input to opImpl.
3. Connect output.[[operand]] as output to opImpl.
Return output.
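
The option-defaulting portion of these steps can be sketched as a pure function over a plain object. This is a sketch under stated assumptions rather than an implementation of the algorithm: normalizePool2dOptions is a hypothetical name, it returns a copy instead of modifying its argument, and it leaves windowDimensions untouched because that default depends on the input shape.

```javascript
// Illustrative only: applies the defaults from the steps above to a copy of options.
function normalizePool2dOptions(options = {}) {
  const normalized = { ...options };
  // Explicit output sizes, or no padding at all, reset padding to zero.
  if (normalized.outputSizes !== undefined || normalized.padding === undefined) {
    normalized.padding = [0, 0, 0, 0];
  }
  const defaults = {
    strides: [1, 1],
    dilations: [1, 1],
    autoPad: 'explicit',
    layout: 'nchw',
    roundingType: 'floor',
  };
  for (const [key, value] of Object.entries(defaults)) {
    if (normalized[key] === undefined) normalized[key] = value;
  }
  // Any automatic padding mode overrides the explicit padding values.
  if (normalized.autoPad !== 'explicit') {
    normalized.padding = [0, 0, 0, 0];
  }
  return normalized;
}

console.log(normalizePool2dOptions({ padding: [1, 1, 1, 1], autoPad: 'same-upper' }).padding);
// [ 0, 0, 0, 0 ]
```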

The following pooling algorithms are supported.

The averagePool2d(input, options) steps are:

Let output be the result of running the create pooling operation given "averagePool2d", input and options.
1. If that throws an error, then re-throw the error and stop.
Return output.

The l2Pool2d(input, options) steps are:

Let output be the result of running the create pooling operation given "l2Pool2d", input and options.
1. If that throws an error, then re-throw the error and stop.
Return output.

The maxPool2d(input, options) steps are:

Let output be the result of running the create pooling operation given "maxPool2d", input and options.
1. If that throws an error, then re-throw the error and stop.
Return output.

7.6.26. The prelu() method

Calculate the parametric version of the rectified linear function (Parametric ReLU) on the input tensor element-wise. Parametric ReLU is a type of leaky ReLU that, instead of having a fixed scalar slope like 0.01, makes the slope (coefficient of leakage) a parameter that is learned during the model training phase. The calculation follows the expression max(0, x) + slope ∗ min(0, x).

partial interface MLGraphBuilder {
  MLOperand prelu(MLOperand input, MLOperand slope);
};

Arguments:

input : an MLOperand. The input tensor.
slope : an MLOperand. The slope tensor. Its shape is either the same as, or unidirectionally broadcastable to, the shape of the input tensor according to [numpy-broadcasting-rule].

Returns: an MLOperand. The output tensor of the same shape as input.

The prelu(input, slope) steps are:

If input or slope is not an instance of MLOperand, then throw a "TypeError" DOMException and stop.
Let descriptor be a new MLOperandDescriptor.
Set descriptor.type to input.[[descriptor]].type.
Set descriptor.dimensions to the result of running the broadcast-shapes steps given input.[[descriptor]].dimensions and slope.[[descriptor]].dimensions.
1. If that throws an error, re-throw the error and stop.
If any of the following sub-steps fail, throw an "OperationError" DOMException and stop.
1. Let output be the result of invoking the create MLOperand steps given this and descriptor.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the PReLU operation, given slope.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect input.[[operand]] as input to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.

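The behavior of this operation can be generically emulated from the usage of other operations as follows:

return builder.add(builder.max(builder.constant(0), x),
                   builder.mul(slope, builder.min(builder.constant(0), x)));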
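As an illustration only, the PReLU expression can also be evaluated on plain JavaScript arrays outside of any graph. The sketch below is not part of the API: the helper name preluReference is hypothetical, tensors are represented as flat arrays, and broadcasting is simplified to two cases (a single scalar slope, or a slope array with the same length as the input).

```javascript
// Hypothetical reference helper (not part of the WebNN API).
// Evaluates max(0, x) + slope * min(0, x) element-wise.
// Assumption: slope is either a number or an array as long as input.
function preluReference(input, slope) {
  return input.map((x, i) => {
    const s = typeof slope === 'number' ? slope : slope[i];
    return Math.max(0, x) + s * Math.min(0, x);
  });
}

// Negative values are scaled by the slope, non-negative values pass through.
console.log(preluReference([-2, -1, 0, 3], 0.25));
```

With a slope of 0 this reduces to the plain rectified linear function, and with a slope of 1 it is the identity.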

7.6.27. Reduction operations

Reduce the input tensor along all dimensions, or along the axes specified in the axes array parameter. For each specified axis, the dimension with that index is reduced, i.e. the resulting tensor will not contain it, unless the keepDimensions option is specified. The values of the resulting tensor are calculated using the specified reduction function that takes as parameters all the values across the reduced dimension.

dictionary MLReduceOptions {
  sequence<unsigned long> axes = null;
  boolean keepDimensions = false;
};

partial interface MLGraphBuilder {
  MLOperand reduceL1(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceL2(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceLogSum(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceLogSumExp(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceMax(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceMean(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceMin(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceProduct(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceSum(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceSumSquare(MLOperand input, optional MLReduceOptions options = {});
};

Arguments:

input : an MLOperand. The input tensor.
options : an optional MLReduceOptions. The optional parameters of the operation.
- axes : a sequence of unsigned long. The dimensions to reduce. The values in the sequence must be in the range [0, N-1] where N is the rank of input tensor. If not present, all dimensions are reduced.
- keepDimensions : a boolean. If true, retains reduced dimensions with size of 1. The default value is false.

Returns: an MLOperand. The reduced output tensor.

Reduction types:

L1 : Compute the L1 norm of all the input values along the axes.
L2 : Compute the L2 norm of all the input values along the axes.
LogSum : Compute the log value of the sum of all the input values along the axes.
LogSumExp : Compute the log value of the sum of the exponent of all the input values along the axes.
Max : Compute the maximum value of all the input values along the axes.
Mean : Compute the average value of all the input values along the axes.
Min : Compute the minimum value of all the input values along the axes.
Product : Compute the product of all the input values along the axes.
Sum : Compute the sum of all the input values along the axes.
SumSquare : Compute the sum of the square of all the input values along the axes.
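
The effect of axes and keepDimensions on the output shape can be sketched in plain JavaScript. This is a minimal illustration under stated assumptions, not the platform implementation: reducedShape and reduceSumReference are hypothetical helper names, and the tensor is assumed to be a flat array in row-major order with an explicit shape.

```javascript
// Hypothetical helpers (not part of the WebNN API) operating on a flat,
// row-major array plus an explicit shape.

// Shape of the reduced tensor. A null axes value reduces every dimension.
function reducedShape(shape, axes = null, keepDimensions = false) {
  const reduced = new Set(axes === null ? shape.keys() : axes);
  const out = [];
  shape.forEach((size, d) => {
    if (!reduced.has(d)) out.push(size);
    else if (keepDimensions) out.push(1);
  });
  return out;
}

// Sum reduction along the given axes, returned as a flat array.
function reduceSumReference(data, shape, axes = null) {
  const reduced = new Set(axes === null ? shape.keys() : axes);
  // Keep the rank here so that output indices are easy to compute.
  const outShape = reducedShape(shape, axes, true);
  const outSize = outShape.reduce((a, b) => a * b, 1);
  const out = new Array(outSize).fill(0);
  for (let flat = 0; flat < data.length; ++flat) {
    // Decode the row-major index into per-dimension coordinates.
    let rest = flat;
    const coords = new Array(shape.length);
    for (let d = shape.length - 1; d >= 0; --d) {
      coords[d] = rest % shape[d];
      rest = Math.floor(rest / shape[d]);
    }
    // Re-encode with the reduced coordinates forced to 0.
    let outIndex = 0;
    for (let d = 0; d < shape.length; ++d) {
      outIndex = outIndex * outShape[d] + (reduced.has(d) ? 0 : coords[d]);
    }
    out[outIndex] += data[flat];
  }
  return out;
}

console.log(reducedShape([2, 3], [1], true));
console.log(reduceSumReference([1, 2, 3, 4, 5, 6], [2, 3], [1]));
```

The other reduction types differ only in the accumulation step (for example a running maximum for Max, or a sum of absolute values for L1).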

To create reduce operation given op, input and options, run the following steps:

Assert: op is one of "reduceL1", "reduceL2", "reduceLogSum", "reduceLogSumExp", "reduceMax", "reduceMean", "reduceMin", "reduceProduct", "reduceSum", "reduceSumSquare".
If input is not an instance of MLOperand, then throw a " TypeError " DOMException and stop.
If options is undefined, let options be a new MLReduceOptions object with options.keepDimensions set to false and options.axes set to null.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Let output be the result of invoking the copy MLOperand steps given input.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the op reduce operation, given options.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect input.[[operand]] as input to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.

The following reduce algorithms are supported.

The reduceL1(input, options) steps are:

Let output be the result of running the create reduce operation given "reduceL1", input and options.
1. If that throws an error, then re-throw the error and stop.
Return output.

The reduceL2(input, options) steps are:

Let output be the result of running the create reduce operation given "reduceL2", input and options.
1. If that throws an error, then re-throw the error and stop.
Return output.

The reduceLogSum(input, options) steps are:

Let output be the result of running the create reduce operation given "reduceLogSum", input and options.
1. If that throws an error, then re-throw the error and stop.
Return output.

The reduceLogSumExp(input, options) steps are:

Let output be the result of running the create reduce operation given "reduceLogSumExp", input and options.
1. If that throws an error, then re-throw the error and stop.
Return output.

The reduceMax(input, options) steps are:

Let output be the result of running the create reduce operation given "reduceMax", input and options.
1. If that throws an error, then re-throw the error and stop.
Return output.

The reduceMean(input, options) steps are:

Let output be the result of running the create reduce operation given "reduceMean", input and options.
1. If that throws an error, then re-throw the error and stop.
Return output.

The reduceMin(input, options) steps are:

Let output be the result of running the create reduce operation given "reduceMin", input and options.
1. If that throws an error, then re-throw the error and stop.
Return output.

The reduceProduct(input, options) steps are:

Let output be the result of running the create reduce operation given "reduceProduct", input and options.
1. If that throws an error, then re-throw the error and stop.
Return output.

The reduceSum(input, options) steps are:

Let output be the result of running the create reduce operation given "reduceSum", input and options.
1. If that throws an error, then re-throw the error and stop.
Return output.

The reduceSumSquare(input, options) steps are:

Let output be the result of running the create reduce operation given "reduceSumSquare", input and options.
1. If that throws an error, then re-throw the error and stop.
Return output.

7.6.28. The relu() method

Compute the rectified linear function of the input tensor.

partial interface MLGraphBuilder {
  MLOperand relu(MLOperand input);
  MLActivation relu();
};

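The behavior of this operation can be generically emulated from the usage of other operations as follows:

return builder.max(builder.constant(0), x);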

7.6.28.1. The `relu(input)` method

Arguments:

input : an MLOperand. The input tensor.

Returns:

an MLOperand. The output tensor of the same shape as input.

The relu(input) steps are:

If input is not an instance of MLOperand, then throw a " TypeError " DOMException and stop.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Let output be the result of invoking the copy MLOperand steps given input.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the ReLU operation.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect input.[[operand]] as input to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.

7.6.28.2. The `relu()` method

Arguments:

None.

Returns:

an MLActivation. The activation function representing the relu operation.

The relu() method steps are:

Let op be the result of invoking the create MLActivation steps with "relu".
1. If that throws an error, re-throw the error and abort these steps.
Return op.

7.6.29. The resample2d() method

Resample the tensor values from the source to the destination spatial dimensions according to the scaling factors.

enum MLInterpolationMode {
  "nearest-neighbor",
  "linear"
};
dictionary MLResample2dOptions {
  MLInterpolationMode mode = "nearest-neighbor";
  sequence<float> scales;
  sequence<unsigned long> sizes;
  sequence<unsigned long> axes;
};

partial interface MLGraphBuilder {
  MLOperand resample2d(MLOperand input, optional MLResample2dOptions options = {});
};

Arguments:

input : an MLOperand. The input 4-D tensor.
options : an optional MLResample2dOptions. The optional parameters of the operation.

Returns: an MLOperand. The output 4-D tensor.

MLResample2dOptions has the following members:

mode, of type MLInterpolationMode, defaulting to "nearest-neighbor": An MLInterpolationMode string. Specifies the interpolation algorithm used to fill the output tensor values. The default value is "nearest-neighbor", standing for Nearest Neighbor interpolation.
scales, of type sequence<float>: A sequence of float of length 2. Specifies the scaling factor in each spatial dimension of the input: [scale_height, scale_width]. The default value is [1.0, 1.0].
sizes, of type sequence<unsigned long>: A sequence of unsigned long of length 2. Specifies the target sizes for each spatial dimension of the input: [size_height, size_width]. When the target sizes are specified, the scales argument is ignored, since the scaling factor values are derived from the target sizes of each spatial dimension of the input.
axes, of type sequence<unsigned long>: A sequence of unsigned long of length 2. Specifies the two consecutive dimensions of the input tensor to which the interpolation algorithm applies. The valid values in the sequence are [0, 1], [1, 2] or [2, 3]. The default value is [2, 3].

To check resample options given options, run the following steps:

If options is undefined, let options be a new MLResample2dOptions object.
If options.mode exists:
1. If its value is not one of "nearest-neighbor" or "linear", return null.
Otherwise, set options.mode to "nearest-neighbor".
If options.scales exists:
1. If its size is not 2, or if any of its values is not greater than 0, return null.
Otherwise, set options.scales to [1.0, 1.0].
If options.sizes exists: if its size is not 2, or if any of its values is not greater than 0, return null.
If options.axes exists:
1. If its value is not one of [0, 1], [1, 2], [2, 3], return null.
Otherwise, set options.axes to [2, 3].
Return options.

To resample output sizes given input and options, run the following steps:

Let desc be an MLOperandDescriptor initialized to input.[[descriptor]].
If options.sizes exists, then set desc.dimensions to options.sizes and return desc.
For index between 0 and the rank of desc.dimensions:
1. Let inputSize be the size of input.[[descriptor]].dimensions [ index ].
2. Let outputSize be inputSize multiplied by options.scales.
  1. If that fails or outputSize is not a positive number, then throw a " DataError " DOMException and stop.
3. Set desc.dimensions [ index ] to outputSize.
Return desc.
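
The interaction of sizes, scales and axes can be illustrated with a small shape calculation in plain JavaScript. This is a sketch under stated assumptions rather than the normative algorithm: the helper name resampleOutputShape is hypothetical, scales and sizes are assumed to apply only to the two dimensions named by axes, and a scaled size is assumed to be rounded down to an integer.

```javascript
// Hypothetical helper (not part of the WebNN API): computes the output
// shape of a 2-D resample over a 4-D input shape.
// Assumptions: scales and sizes apply only to the two dimensions listed
// in axes, and scaled sizes are rounded down to an integer.
function resampleOutputShape(inputShape, options = {}) {
  if (inputShape.length !== 4) throw new Error('input must be 4-D');
  const scales = options.scales ?? [1.0, 1.0];
  const axes = options.axes ?? [2, 3];
  const output = inputShape.slice();
  axes.forEach((axis, i) => {
    // Explicit target sizes take precedence over scaling factors.
    const size = options.sizes
      ? options.sizes[i]
      : Math.floor(inputShape[axis] * scales[i]);
    if (!(size > 0)) throw new Error('output size must be positive');
    output[axis] = size;
  });
  return output;
}

console.log(resampleOutputShape([1, 3, 4, 4], { scales: [2.0, 2.0] }));
console.log(resampleOutputShape([1, 3, 4, 4], { sizes: [6, 10], scales: [2.0, 2.0] }));
```

In the second call the scales member is ignored because sizes is present, matching the description of the sizes member above.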

The resample2d(input, options) steps are:

Check if the input is a 4-dimensional tensor: if the size of input.[[descriptor]].dimensions is not 4, throw a " DataError " DOMException and stop.
Let options be the result of running the check resample options steps given options.
1. If that returns null, then throw a " DataError " DOMException and stop.
Let desc be the result of running the resample output sizes steps given input and options.
1. If that throws an error, re-throw the error and stop.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Let output be the result of invoking the create MLOperand steps given this and desc.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the resample 2D operation, given options.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect input.[[operand]] as input to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.

7.6.30. The reshape() method

Alter the shape of a tensor to a new shape. Reshape does not copy or change the content of the tensor. It just changes the tensor’s logical dimensions for the subsequent operations.

partial interface MLGraphBuilder {
  MLOperand reshape(MLOperand input, sequence<unsigned long?> newShape);
};

Arguments:

input : an MLOperand. The input tensor.
newShape : a sequence of nullable unsigned long. The shape of the output tensor. The number of elements implied by newShape must be the same as the number of elements in the input tensor. Only one component of newShape can be the special value of null. The size of the dimension with the value null is computed so that the total size remains constant.

Returns: an MLOperand. The output tensor. The values of the output tensor are the same as values of the input tensor. The shape of the output tensor is specified by the newShape argument.

The reshape(input, newShape) steps are:

If input is not an instance of MLOperand, then throw a " TypeError " DOMException and stop.
Let outputShape be an empty array of unsigned long.
If newShape is a scalar number, set outputShape to [ 1 ].
Otherwise, if newShape is an array of unsigned long:
1. If the size of newShape is 0, set outputShape to [ 1 ] (reshaping to scalar).
2. If newShape contains more than one null value, then throw a " DataError " DOMException and stop.
3. If any value in newShape is 0, then throw a " DataError " DOMException and stop.
4. Let inputElementCount be the product of all elements in input.[[descriptor]].dimensions.
5. If newShape contains a null value, set that value to inputElementCount divided by the product of all other values in newShape.
  1. If that value is too large for unsigned long, then throw a " DataError " DOMException and stop.
6. If the product of all values in newShape is not equal to inputElementCount, then throw a " DataError " DOMException and stop.
Let desc be a copy of input.[[descriptor]].
Set desc.dimensions to newShape.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Let output be the result of invoking the create MLOperand steps given this and desc.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the reshape operation.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect input.[[operand]] as input to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.
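
The shape resolution performed by the steps above, including the inference of a single null component, can be sketched in plain JavaScript. This is an illustration only: resolveReshape is a hypothetical helper name, shapes are plain arrays of numbers, and the additional check that the inferred component is an integer is an assumption of this sketch.

```javascript
// Hypothetical helper (not part of the WebNN API): resolves newShape
// against the shape of the input, inferring at most one null component.
function resolveReshape(inputShape, newShape) {
  const elementCount = inputShape.reduce((a, b) => a * b, 1);
  // An empty newShape is treated as reshaping to a single element.
  const shape = newShape.length === 0 ? [1] : newShape.slice();
  const nullCount = shape.filter((d) => d === null).length;
  if (nullCount > 1) throw new Error('at most one null component');
  if (shape.includes(0)) throw new Error('zero-sized dimension');
  if (nullCount === 1) {
    // The null component absorbs whatever the known components leave over.
    const known = shape.reduce((a, d) => (d === null ? a : a * d), 1);
    shape[shape.indexOf(null)] = elementCount / known;
  }
  const product = shape.reduce((a, b) => a * b, 1);
  // Assumption of this sketch: a fractional inferred size is an error.
  if (product !== elementCount || !shape.every(Number.isInteger)) {
    throw new Error('element count mismatch');
  }
  return shape;
}

console.log(resolveReshape([2, 3, 4], [null, 4]));
```

Because reshape only changes the logical dimensions, no element data is involved in this calculation.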

7.6.31. The sigmoid() method

Compute the sigmoid function of the input tensor. The calculation follows the expression 1 / (exp(-x) + 1).

partial interface MLGraphBuilder {
  MLOperand sigmoid(MLOperand input);
  MLActivation sigmoid();
};

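The behavior of this operation can be generically emulated from the usage of other operations as follows:

return builder.div(
          builder.constant(1),
          builder.add(
            builder.exp(builder.neg(x)),
            builder.constant(1)));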
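
For a quick numerical check outside of any graph, the same expression can be evaluated on a plain JavaScript array. The helper name sigmoidReference is hypothetical and is not part of the API.

```javascript
// Hypothetical reference helper (not part of the WebNN API).
// Evaluates 1 / (exp(-x) + 1) element-wise on a flat array.
function sigmoidReference(input) {
  return input.map((x) => 1 / (Math.exp(-x) + 1));
}

console.log(sigmoidReference([-2, 0, 2]));
```

In double precision this form saturates cleanly: for a very negative input exp(-x) overflows to Infinity and the result is 0, and for a very positive input exp(-x) underflows to 0 and the result is 1.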

7.6.31.1. The `sigmoid(input)` method

Arguments:

input : an MLOperand. The input tensor.

Returns:

an MLOperand. The output tensor of the same shape as input.

The sigmoid(input) steps are:

If input is not an instance of MLOperand, then throw a " TypeError " DOMException and stop.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Let output be the result of invoking the copy MLOperand steps given input.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the sigmoid operation.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect input.[[operand]] as input to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.

7.6.31.2. The `sigmoid()` method

Arguments:

None.

Returns:

an MLActivation. The activation function representing the sigmoid operation.

The sigmoid() method steps are:

Let op be the result of invoking the create MLActivation steps with "sigmoid".
1. If that throws an error, re-throw the error and abort these steps.
Return op.

7.6.32. The slice() method

Produce a slice of the input tensor.

partial interface MLGraphBuilder {
  MLOperand slice(MLOperand input, sequence<unsigned long> starts, sequence<unsigned long> sizes);
};

Arguments:

input : an MLOperand. The input tensor.
starts : a sequence of unsigned long. The sequence of unsigned integer values indicating the starting index to slice of each input dimension, of length N where N is the rank of the input tensor. For each dimension d of input, starts[d] indicates the starting index to slice in that dimension. The starting index must be in the range [0, input size - 1] in that dimension.
sizes : a sequence of unsigned long. The sequence of unsigned integer values indicating the number of elements to slice of each input dimension, of length N where N is the rank of the input tensor. For each dimension d of input, sizes[d] indicates the number of elements to slice in that dimension. The size must not be 0 and must satisfy the constraint starting index + size <= input size in that dimension.

Returns: an MLOperand. The output tensor of the same rank as the input tensor with tensor values stripped to the specified starting and ending indices in each dimension.

The slice(input, starts, sizes) steps are:

If input is not an instance of MLOperand, then throw a " TypeError " DOMException and stop.
If starts or sizes is not a sequence of unsigned long, then throw a " TypeError " DOMException and stop.
If sizes.size is 0, then throw a " TypeError " DOMException and stop.
Further validation of starts and sizes given input is left implementation-defined.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Let output be the result of invoking the copy MLOperand steps given input.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the slice operation, given starts and sizes.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect input.[[operand]] as input to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.
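
The constraints on starts and sizes, and the resulting element selection, can be illustrated on a flat array in plain JavaScript. This is a sketch under stated assumptions: sliceReference is a hypothetical helper name, the tensor is assumed to be stored in row-major order, and the validation shown is one possible reading of the constraints in the argument descriptions.

```javascript
// Hypothetical helper (not part of the WebNN API): slices a flat,
// row-major array with the given shape.
function sliceReference(data, shape, starts, sizes) {
  if (starts.length !== shape.length || sizes.length !== shape.length) {
    throw new Error('starts and sizes must have one entry per dimension');
  }
  shape.forEach((dim, d) => {
    // size must not be 0, and start + size must stay inside the dimension.
    if (sizes[d] === 0 || starts[d] + sizes[d] > dim) {
      throw new Error(`invalid slice in dimension ${d}`);
    }
  });
  const outSize = sizes.reduce((a, b) => a * b, 1);
  const out = new Array(outSize);
  for (let flat = 0; flat < outSize; ++flat) {
    // Decode the output index and shift each coordinate by its start.
    let rest = flat;
    const coords = new Array(shape.length);
    for (let d = shape.length - 1; d >= 0; --d) {
      coords[d] = (rest % sizes[d]) + starts[d];
      rest = Math.floor(rest / sizes[d]);
    }
    // Encode the shifted coordinates as an index into the input.
    let inIndex = 0;
    for (let d = 0; d < shape.length; ++d) {
      inIndex = inIndex * shape[d] + coords[d];
    }
    out[flat] = data[inIndex];
  }
  return out;
}

// A 3x4 tensor holding 0..11; take rows 1..2 and columns 1..2.
const sliceInput = Array.from({ length: 12 }, (_, i) => i);
console.log(sliceReference(sliceInput, [3, 4], [1, 1], [2, 2]));
```

The output has the same rank as the input, with shape equal to sizes.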

7.6.33. The softmax() method

Compute the softmax values of the 2-D input tensor along axis 1.

partial interface MLGraphBuilder {
  MLOperand softmax(MLOperand input);
  MLActivation softmax();
};

// This sample deploys a well-known implementation trick [1] to compute the
// exponentials of the distances to the max value, instead of the exponentials
// of the input values itself, in order to increase the numerical stability of
// the result.
// [1]: https://cs231n.github.io/linear-classify/#softmax
const max_x = builder.reduceMax(x, { axes: [1], keepDimensions: true });
const exp_x = builder.exp(builder.sub(x, max_x));
return builder.div(exp_x, builder.reduceSum(exp_x, { axes: [1], keepDimensions: true }));
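
The same max-subtraction trick can be tried on ordinary nested arrays, which makes the numerical benefit easy to observe. The sketch below is illustrative only: softmaxReference is a hypothetical helper name, and the 2-D tensor is assumed to be an array of rows so that axis 1 corresponds to the elements within each row.

```javascript
// Hypothetical reference helper (not part of the WebNN API).
// Computes softmax along axis 1 of a 2-D tensor given as an array of rows.
function softmaxReference(rows) {
  return rows.map((row) => {
    // Subtracting the row maximum keeps every exponent at or below 0,
    // so Math.exp cannot overflow.
    const max = Math.max(...row);
    const exps = row.map((v) => Math.exp(v - max));
    const sum = exps.reduce((a, b) => a + b, 0);
    return exps.map((e) => e / sum);
  });
}

console.log(softmaxReference([[1, 2, 3], [1000, 1000, 1000]]));
```

Without the subtraction, the second row would evaluate exp(1000), which overflows to Infinity in double precision and yields NaN after the division.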

7.6.33.1. The `softmax(input)` method

Arguments:

input : an MLOperand. The input 2-D tensor.

Returns:

an MLOperand. The output 2-D tensor that contains the softmax results, of the same shape as the input tensor.

The softmax(input) steps are:

If input is not an instance of MLOperand, then throw a " TypeError " DOMException and stop.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Let output be the result of invoking the copy MLOperand steps given input.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the softmax operation.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect input.[[operand]] as input to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.

7.6.33.2. The `softmax()` method

Arguments:

None.

Returns:

an MLActivation. The activation function representing the softmax operation.

The softmax() method steps are:

Let op be the result of invoking the create MLActivation steps with "softmax".
1. If that throws an error, re-throw the error and abort these steps.
Return op.

7.6.34. The softplus() method

Compute the softplus function of the input tensor. The calculation follows the expression ln(1 + exp(steepness * x)) / steepness.

dictionary MLSoftplusOptions {
  float steepness = 1;
};

partial interface MLGraphBuilder {
  MLOperand softplus(MLOperand input, optional MLSoftplusOptions options = {});
  MLActivation softplus(optional MLSoftplusOptions options = {});
};

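The behavior of this operation can be generically emulated from the usage of other operations as follows:

return builder.div(
          builder.log(
            builder.add(
              builder.exp(builder.mul(x, builder.constant(options.steepness))),
              builder.constant(1))),
          builder.constant(options.steepness));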

MLSoftplusOptions has the following members:

steepness, of type float, defaulting to 1: A float scalar parameter. The default value is 1.

To check softplus options given options, run the following steps:

If options is not an object, then return false.
If options.steepness is undefined, set options.steepness to 1.
Else if options.steepness is not a numeric type, then return false.
Return true.
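
The effect of the steepness default and of the option check can be sketched on a plain JavaScript array. This is an illustration under stated assumptions, not the normative behavior: softplusReference is a hypothetical helper name, and the expression is evaluated directly, so it overflows for large values of steepness * x.

```javascript
// Hypothetical reference helper (not part of the WebNN API).
// Evaluates ln(1 + exp(steepness * x)) / steepness element-wise.
function softplusReference(input, options = {}) {
  // A missing steepness falls back to the default value of 1.
  const steepness = options.steepness ?? 1;
  if (typeof steepness !== 'number') {
    throw new TypeError('steepness must be a number');
  }
  return input.map((x) => Math.log(1 + Math.exp(steepness * x)) / steepness);
}

console.log(softplusReference([-1, 0, 1]));
console.log(softplusReference([-1, 0, 1], { steepness: 2 }));
```

At x = 0 the result is ln(2) / steepness, so a larger steepness pulls the curve closer to the rectified linear function.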

7.6.34.1. The `softplus(input, options)` method

Arguments:

input : an MLOperand. The input tensor.
options : an optional MLSoftplusOptions. The optional parameters of the operation.

Returns:

an MLOperand. The output tensor of the same shape as input.

The softplus(input, options) method steps are:

Let input be the first argument.
Let options be the second argument.
1. If running the check softplus options steps with options returns false, then throw a " TypeError " DOMException and abort these steps.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Let output be the result of invoking the copy MLOperand steps given input.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the softplus operation, given options.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect input.[[operand]] as input to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.

7.6.34.2. The `softplus(options)` method

Arguments:

options : an optional MLSoftplusOptions. The optional parameters of the operation.

Returns:

an MLActivation. The activation function representing the softplus operation.

The softplus(options) method steps are:

Let options be the first argument.
1. If running the check softplus options steps with options returns false, then throw a " TypeError " DOMException and abort these steps.
Let op be the result of invoking the create MLActivation steps with "softplus" and options.
1. If that throws an error, re-throw the error and abort these steps.
Return op.

7.6.35. The softsign() method

Compute the softsign function of the input tensor. The calculation follows the expression x / (1 + |x|).

partial interface MLGraphBuilder {
  MLOperand softsign(MLOperand input);
  MLActivation softsign();
};

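The behavior of this operation can be generically emulated from the usage of other operations as follows:

return builder.div(x, builder.add(builder.constant(1), builder.abs(x)));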

7.6.35.1. The `softsign(input)` method

Arguments:

input : an MLOperand. The input tensor.

Returns:

an MLOperand. The output tensor of the same shape as input.

The softsign(input) steps are:

If input is not an instance of MLOperand, then throw a " TypeError " DOMException and stop.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Let output be the result of invoking the copy MLOperand steps given input.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the softsign operation.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect input.[[operand]] as input to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.

7.6.35.2. The `softsign()` method

Arguments:

None.

Returns:

an MLActivation. The activation function representing the softsign operation.

The softsign() method steps are:

Let op be the result of invoking the create MLActivation steps with "softsign".
1. If that throws an error, re-throw the error and abort these steps.
Return op.

7.6.36. The split() method

Split the input tensor into a number of sub tensors along the given axis.

dictionary MLSplitOptions {
  unsigned long axis = 0;
};

partial interface MLGraphBuilder {
  sequence<MLOperand> split(MLOperand input,
                          (unsigned long or sequence<unsigned long>) splits,
                          optional MLSplitOptions options = {});
};

Arguments:

input : an MLOperand. The input tensor.
splits : an unsigned long or a sequence of unsigned long. If an unsigned long, it specifies the number of output tensors along the axis. The number must evenly divide the dimension size of input along options.axis. If a sequence of unsigned long, it specifies the sizes of each output tensor along options.axis. The sum of sizes must be equal to the dimension size of input along options.axis.
options : an optional MLSplitOptions. The optional parameters of the operation.

Returns: a sequence of MLOperand. The split output tensors. If splits is an unsigned long, the length of the output sequence is equal to splits. The shape of each output tensor is the same as input, except that the dimension size along axis is equal to the quotient of dividing the dimension size of input along axis by splits. If splits is a sequence of unsigned long, the length of the output sequence is equal to the length of splits. The shape of the i-th output tensor is the same as input except along axis, where the dimension size is splits[i].

MLSplitOptions has the following members:

axis, of type unsigned long, defaulting to 0: An unsigned long scalar. The dimension along which to split. Its value must be in the range [0, N-1] where N is the rank of input tensor. The default value is 0.

The split(input, splits, options) steps are:

If input is not an instance of MLOperand, then throw a " TypeError " DOMException and stop.
If options is undefined, let options be an empty object.
If options.axis is undefined, let options.axis be 0.
If splits is not unsigned long or a sequence of unsigned long, then throw a " TypeError " DOMException and stop.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Let output be the result of invoking the copy MLOperand steps given input.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the split operation, given splits and options.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect input.[[operand]] as input to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.

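The behavior of this operation can be generically emulated from the usage of other operations as follows:

// This sample shows the case that the splits parameter is an array.
const outputs = [];
let starts = Array(input_rank).fill(0);
// Copy the shape so that the input shape itself is not modified.
let sizes = input_shape.slice();
let start = 0;
for (const size of splits) {
  starts[options.axis] = start;
  sizes[options.axis] = size;
  outputs.push(builder.slice(input, starts, sizes));
  start += size;
}
return outputs;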
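
The two forms of the splits argument lead to different output shapes, which can be sketched with a plain JavaScript shape calculation. The helper name splitOutputShapes is hypothetical, and the validation reflects one reading of the constraints stated in the argument descriptions.

```javascript
// Hypothetical helper (not part of the WebNN API): computes the shape of
// every output tensor produced by a split.
function splitOutputShapes(inputShape, splits, options = {}) {
  const axis = options.axis ?? 0;
  const dim = inputShape[axis];
  let sizes;
  if (typeof splits === 'number') {
    // A count must evenly divide the dimension being split.
    if (splits === 0 || dim % splits !== 0) {
      throw new Error('splits must evenly divide the dimension size');
    }
    sizes = new Array(splits).fill(dim / splits);
  } else {
    // Explicit sizes must add up to the dimension being split.
    if (splits.reduce((a, b) => a + b, 0) !== dim) {
      throw new Error('sum of splits must equal the dimension size');
    }
    sizes = splits;
  }
  return sizes.map((size) => {
    const shape = inputShape.slice();
    shape[axis] = size;
    return shape;
  });
}

console.log(splitOutputShapes([6, 4], 3));
console.log(splitOutputShapes([6, 4], [1, 2, 3]));
```

Only the dimension named by axis changes; every other dimension is carried over from the input unchanged.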

7.6.37. The squeeze() method

Reduce the rank of a tensor by eliminating dimensions with size 1 of the tensor shape. Squeeze only affects the tensor’s logical dimensions. It does not copy or change the content in the tensor.

dictionary MLSqueezeOptions {
  sequence<unsigned long> axes;
};

partial interface MLGraphBuilder {
  MLOperand squeeze(MLOperand input, optional MLSqueezeOptions options = {});
};

Arguments:

input : an MLOperand. The input tensor.
options : an optional MLSqueezeOptions. The optional parameters of the operation.

Returns: an MLOperand. The output tensor of the same or reduced rank with the shape dimensions of size 1 eliminated.

MLSqueezeOptions has the following members:

axes, of type sequence<unsigned long>: A sequence of unsigned long. Specifies the indices to the shape dimensions of size 1 to eliminate. The values in the sequence must be in the range [0, N-1] where N is the rank of input tensor. When not specified, every shape dimension of size 1 in the tensor is eliminated.

The squeeze(input, options) steps are:

If input is not an instance of MLOperand, then throw a " TypeError " DOMException and stop.
If options is undefined, let options be an empty object.
If options.axes exists, then:
1. Let dimensions be input.[[descriptor]].dimensions.
2. For index between 0 and the size of options.axes:
  1. Let oneDimIndex be options.axes [ index ].
  2. If dimensions [ oneDimIndex ] is not 1, then throw a " TypeError " DOMException and stop.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Let output be the result of invoking the copy MLOperand steps given input.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the squeeze operation, given options.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect input.[[operand]] as input to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.
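
The validation and shape computation described by the steps above can be sketched in plain JavaScript, without calling WebNN. The helper name squeezeShape is hypothetical, and it operates on a plain array of dimensions rather than on an MLOperand. The range check reflects the [0, N-1] requirement stated for axes; tolerating repeated axis values is an assumption of this sketch.

```javascript
// Hypothetical helper: computes the shape that squeeze() would produce.
// `dimensions` is a plain array standing in for input.[[descriptor]].dimensions.
function squeezeShape(dimensions, axes) {
  if (axes === undefined) {
    // No axes given: eliminate every dimension of size 1.
    return dimensions.filter((size) => size !== 1);
  }
  for (const axis of axes) {
    // Each axis must be in the range [0, N-1], where N is the input rank.
    if (!Number.isInteger(axis) || axis < 0 || axis >= dimensions.length) {
      throw new TypeError(`axis ${axis} is out of range`);
    }
    // Only dimensions of size 1 can be eliminated.
    if (dimensions[axis] !== 1) {
      throw new TypeError(`dimension ${axis} has size ${dimensions[axis]}, not 1`);
    }
  }
  const removed = new Set(axes);
  return dimensions.filter((_, index) => !removed.has(index));
}

console.log(squeezeShape([1, 3, 1, 2]));      // [ 3, 2 ]
console.log(squeezeShape([1, 3, 1, 2], [0])); // [ 3, 1, 2 ]
```

Because squeeze only affects the logical dimensions, the sketch never touches tensor data; only the shape changes.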

7.6.38. The tanh() method

Compute the hyperbolic tangent function of the input tensor. The calculation follows the expression (exp(2 * x) - 1) / (exp(2 * x) + 1).

partial interface MLGraphBuilder {
  MLOperand tanh(MLOperand input);
  MLActivation tanh();
};

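The behavior of this operation can be generically emulated from the usage of other operations as follows. However, user agents typically have a more efficient implementation for it, therefore its usage is encouraged from the performance standpoint.

return builder.div(
          builder.sub(builder.exp(builder.mul(builder.constant(2), x)), builder.constant(1)),
          builder.add(builder.exp(builder.mul(builder.constant(2), x)), builder.constant(1)));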
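
As a quick numeric sanity check of the expression (exp(2 * x) - 1) / (exp(2 * x) + 1), the following plain JavaScript sketch evaluates it on scalars and compares the result against the built-in Math.tanh. The helper name tanhFromExp is hypothetical and the sketch does not use WebNN.

```javascript
// Hypothetical scalar version of the tanh expression.
function tanhFromExp(x) {
  const e = Math.exp(2 * x);
  return (e - 1) / (e + 1);
}

// Compare against the built-in implementation on a few moderate values.
for (const x of [-2, -0.5, 0, 0.5, 2]) {
  const difference = Math.abs(tanhFromExp(x) - Math.tanh(x));
  console.log(x, difference < 1e-12);
}
```

Note that this direct form is only a sketch: for very large positive x, exp(2 * x) overflows to Infinity and the quotient evaluates to NaN, which is one reason a dedicated implementation may be preferable.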

7.6.38.1. The `tanh(input)` method

Arguments:

input : an MLOperand. The input tensor.

Returns:

an MLOperand. The output tensor of the same shape as input.

The tanh(input) steps are:

If input is not an instance of MLOperand, then throw a " TypeError " DOMException and stop.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Let output be the result of invoking the copy MLOperand steps given input.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the hyperbolic tangent operation.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect input.[[operand]] as input to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.

7.6.38.2. The `tanh()` method

Arguments:

None.

Returns:

an MLActivation. The activation function representing the tanh operation.

The tanh() method steps are:

Let op be the result of invoking the create MLActivation steps with "tanh".
1. If that throws an error, re-throw the error and abort these steps.
Return op.

7.6.39. The transpose() method

Permute the dimensions of the input tensor according to the permutation argument.

dictionary MLTransposeOptions {
  sequence<unsigned long> permutation;
};

partial interface MLGraphBuilder {
  MLOperand transpose(MLOperand input, optional MLTransposeOptions options = {});
};

Arguments:

input : an MLOperand. The input N-D tensor.
options : an optional MLTransposeOptions. The optional parameters of the operation.

Returns: an MLOperand. The permuted or transposed N-D tensor.

MLTransposeOptions has the following members:

permutation, of type sequence<unsigned long>: A sequence of unsigned long values. Specifies the values used to permute the output shape. The default value is [N-1, ..., 0], where N is the rank of the input tensor, e.g. [2,1,0] for a 3-D tensor. These default values cause the output to become a transposed tensor of the input. When specified, the number of values in the sequence must be the same as the rank of the input tensor, and the values in the sequence must be within the range from 0 to N-1 with no duplicate values.

The transpose(input, options) steps are:

If input is not an instance of MLOperand, then throw a " TypeError " DOMException and stop.
If options is undefined, let options be an empty object .
If options.permutation is undefined, let options.permutation be the reversed sequence of all indices for input.[[descriptor]].dimensions.
Otherwise, if options.permutation exists :
1. If options.permutation is not a sequence of unsigned long, then throw a " TypeError " DOMException and stop.
2. If the length of options.permutation is not the same as the rank of input.[[descriptor]].dimensions, then throw a " TypeError " DOMException and stop.
3. If any value in options.permutation is not between 0 and the rank of input.[[descriptor]].dimensions minus 1, then throw a " TypeError " DOMException and stop.
4. If options.permutation contains duplicate values, then throw a " TypeError " DOMException and stop.
If any of the following sub-steps fail, throw an " OperationError " DOMException and stop.
1. Let output be the result of invoking the copy MLOperand steps given input.
2. Make a request to the underlying platform to:
  1. Let opImpl be an implementation-defined platform operator for the transpose operation, given options.
  2. Store a reference of opImpl in output.[[operator]].
  3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
  4. Store a reference to outputImpl in output.[[operand]].
3. Connect input.[[operand]] as input to opImpl.
4. Connect output.[[operand]] as output to opImpl.
Return output.
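
The permutation checks in the steps above, together with the effect of transpose on data, can be sketched in plain JavaScript without WebNN. The helper names resolvePermutation and transposeFlat are hypothetical. The sketch assumes a flat, row-major array and the common convention that output axis i takes its size and data from input axis permutation[i].

```javascript
// Hypothetical helper: applies the default and validates a permutation.
function resolvePermutation(rank, permutation) {
  if (permutation === undefined) {
    // Default: the reversed sequence of indices, [N-1, ..., 0].
    return Array.from({length: rank}, (_, i) => rank - 1 - i);
  }
  if (permutation.length !== rank) {
    throw new TypeError('permutation length must equal the input rank');
  }
  const seen = new Set();
  for (const axis of permutation) {
    if (!Number.isInteger(axis) || axis < 0 || axis >= rank) {
      throw new TypeError(`axis ${axis} is out of range`);
    }
    if (seen.has(axis)) {
      throw new TypeError(`axis ${axis} is duplicated`);
    }
    seen.add(axis);
  }
  return permutation;
}

// Hypothetical reference transpose over a flat, row-major array.
function transposeFlat(data, shape, permutation) {
  const perm = resolvePermutation(shape.length, permutation);
  const outShape = perm.map((axis) => shape[axis]);
  // Row-major strides of the input tensor.
  const strides = new Array(shape.length).fill(1);
  for (let i = shape.length - 2; i >= 0; --i) {
    strides[i] = strides[i + 1] * shape[i + 1];
  }
  const out = new Array(data.length);
  const index = new Array(shape.length).fill(0); // current output coordinate
  for (let n = 0; n < data.length; ++n) {
    // Map the output coordinate back to an input offset.
    let offset = 0;
    for (let i = 0; i < perm.length; ++i) {
      offset += index[i] * strides[perm[i]];
    }
    out[n] = data[offset];
    // Advance the output coordinate in row-major order.
    for (let i = outShape.length - 1; i >= 0; --i) {
      if (++index[i] < outShape[i]) break;
      index[i] = 0;
    }
  }
  return {shape: outShape, data: out};
}

const transposed = transposeFlat([1, 2, 3, 4, 5, 6], [2, 3]);
console.log(transposed.shape); // [ 3, 2 ]
console.log(transposed.data);  // [ 1, 4, 2, 5, 3, 6 ]
```

With no permutation given, the 2-by-3 input becomes its 3-by-2 matrix transpose, matching the default [N-1, ..., 0] behavior.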

8. Examples

The following code gets the MLContext object.

const context = await navigator.ml.createContext({powerPreference: 'low-power'});

Given the following build graph:

constant1 ---+
             +--- Add ---> intermediateOutput1 ---+
input1    ---+                                    |
                                                  +--- Mul---> output
constant2 ---+                                    |
             +--- Add ---> intermediateOutput2 ---+
input2    ---+

The following code implements the graph:

// Use tensors in 4 dimensions.
const TENSOR_DIMS = [1, 2, 2, 2];
const TENSOR_SIZE = 8;
const builder = new MLGraphBuilder(context);
// Create MLOperandDescriptor object.
const desc = {type: 'float32', dimensions: TENSOR_DIMS};
// constant1 is a constant MLOperand with the value 0.5.
const constantBuffer1 = new Float32Array(TENSOR_SIZE).fill(0.5);
const constant1 = builder.constant(desc, constantBuffer1);
// input1 is one of the input MLOperands. Its value will be set before execution.
const input1 = builder.input('input1', desc);
// constant2 is another constant MLOperand with the value 0.5.
const constantBuffer2 = new Float32Array(TENSOR_SIZE).fill(0.5);
const constant2 = builder.constant(desc, constantBuffer2);
// input2 is another input MLOperand. Its value will be set before execution.
const input2 = builder.input('input2', desc);
// intermediateOutput1 is the output of the first Add operation.
const intermediateOutput1 = builder.add(constant1, input1);
// intermediateOutput2 is the output of the second Add operation.
const intermediateOutput2 = builder.add(constant2, input2);
// output is the output MLOperand of the Mul operation.
const output = builder.mul(intermediateOutput1, intermediateOutput2);

Compile the graph up to the output operand.

// Compile the constructed graph.
const graph = await builder.build({'output': output});

The following code executes the compiled graph.

// Setup the input buffers with value 1.
const inputBuffer1 = new Float32Array(TENSOR_SIZE).fill(1);
const inputBuffer2 = new Float32Array(TENSOR_SIZE).fill(1);
const outputBuffer = new Float32Array(TENSOR_SIZE);
// Execute the compiled graph with the specified inputs.
const inputs = {
  'input1': inputBuffer1,
  'input2': inputBuffer2,
};
const outputs = {'output': outputBuffer};
const result = await context.compute(graph, inputs, outputs);
console.log('Output value: ' + result.outputs.output);
// Output value: 2.25,2.25,2.25,2.25,2.25,2.25,2.25,2.25
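
The expected output can be cross-checked without WebNN by evaluating the same graph element-wise on typed arrays. The following plain JavaScript sketch uses a hypothetical helper, referenceCompute, that mirrors the two Add operations followed by the Mul operation.

```javascript
const TENSOR_SIZE = 8;

// Hypothetical reference: (constant1 + input1) * (constant2 + input2), element-wise.
function referenceCompute(constant1, input1, constant2, input2) {
  const output = new Float32Array(input1.length);
  for (let i = 0; i < output.length; ++i) {
    output[i] = (constant1[i] + input1[i]) * (constant2[i] + input2[i]);
  }
  return output;
}

const expected = referenceCompute(
    new Float32Array(TENSOR_SIZE).fill(0.5), new Float32Array(TENSOR_SIZE).fill(1),
    new Float32Array(TENSOR_SIZE).fill(0.5), new Float32Array(TENSOR_SIZE).fill(1));
console.log('Expected value: ' + expected);
// Expected value: 2.25,2.25,2.25,2.25,2.25,2.25,2.25,2.25
```

Each element is (0.5 + 1) * (0.5 + 1) = 2.25, which agrees with the output logged by the example.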

9. Appendices

9.1. `MLOperandType` and `ArrayBufferView` compatibility

`MLOperandType`	`ArrayBufferView`
`float32`	`Float32Array`
`float16`	`Float16Array`
`int32`	`Int32Array`
`uint32`	`Uint32Array`
`int8`	`Int8Array`
`uint8`	`Uint8Array`

Float16Array is at ECMA Stage 3, signaling that its design is finished. Implementers wanting to enable this type ahead of native implementations can emulate it by passing raw bits via Uint16Array. [Issue webnn#373]
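
One way to produce such raw bits is to convert each number to its IEEE 754 half-precision encoding and store the result in a Uint16Array. The following plain JavaScript sketch shows only that conversion, with round-to-nearest-even; the helper name float32ToFloat16Bits is hypothetical, and how an implementation accepts the resulting Uint16Array is outside the scope of the sketch.

```javascript
const f32 = new Float32Array(1);
const u32 = new Uint32Array(f32.buffer);

// Hypothetical helper: returns the 16-bit half-precision encoding of a number.
function float32ToFloat16Bits(value) {
  f32[0] = value; // round to float32 first, then reinterpret the bits
  const bits = u32[0];
  const sign = (bits >>> 16) & 0x8000;
  const exponent = (bits >>> 23) & 0xff;
  const mantissa = bits & 0x7fffff;

  if (exponent === 0xff) {
    // Infinity, or NaN when the mantissa is non-zero.
    return sign | 0x7c00 | (mantissa ? 0x200 : 0);
  }
  const e = exponent - 127 + 15; // re-bias the exponent from 127 to 15
  if (e >= 0x1f) {
    return sign | 0x7c00; // too large for float16: becomes infinity
  }
  if (e <= 0) {
    if (e < -10) return sign; // too small for float16: becomes signed zero
    // Subnormal result: restore the implicit leading 1 and shift it down.
    const m = mantissa | 0x800000;
    const shift = 14 - e;
    let half = m >>> shift;
    const remainder = m & ((1 << shift) - 1);
    const halfway = 1 << (shift - 1);
    if (remainder > halfway || (remainder === halfway && (half & 1))) half++;
    return sign | half;
  }
  // Normal result: keep the top 10 mantissa bits, rounding to nearest even.
  let half = (e << 10) | (mantissa >>> 13);
  const remainder = mantissa & 0x1fff;
  if (remainder > 0x1000 || (remainder === 0x1000 && (half & 1))) half++;
  return sign | half;
}

// Pack a few values into a Uint16Array of raw float16 bits.
const packed = Uint16Array.from([1, -2, 0.5], (v) => float32ToFloat16Bits(v));
console.log(Array.from(packed, (b) => '0x' + b.toString(16)));
// [ '0x3c00', '0xc000', '0x3800' ]
```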

10. Acknowledgements

This specification follows the concepts of the Android Neural Networks API C API.

Thanks to Tomoyuki Shimizu, Ningxin Hu, Zhiqiang Yu and Belem Zhang for the use cases.

Thanks to Nikhil Thorat, Daniel Smilkov, Ganesan Ramalingam, Rafael Cintron and Benjamin Poulain for their contributions to the API specification.

Thanks to Sangwhan Moon and the W3C Technical Architecture Group for review of this specification for web architecture fit, design consistency and developer ergonomics.

Thanks to W3C Privacy Interest Group for privacy and security review and feedback.

Thanks to Alex Gough and the Chrome Security team for security review and questions.

Thanks to Michal Karzynski for sharing practical guidelines and learnings from ONNX.

Thanks to Kaustubha Govind and Chrome privacy reviewers for feedback and privacy considerations.

Thanks to Jiewei Qian for Chromium implementation review and feedback.
