# Neural Network proposal for WASI

A proposed WebAssembly System Interface API for machine learning (ML).

### Current Phase

`wasi-nn` is currently in Phase 2.

### Champions

### Phase 4 Advancement Criteria

`wasi-nn` must have at least two complete independent implementations.
## Table of Contents

- [Introduction](#introduction)
- [Goals](#goals)
- [Non-goals](#non-goals)
- [API walk-through](#api-walk-through)
- [Detailed design discussion](#detailed-design-discussion)
- [Considered alternatives](#considered-alternatives)
- [Stakeholder Interest & Feedback](#stakeholder-interest--feedback)
- [References & acknowledgements](#references--acknowledgements)
## Introduction

`wasi-nn` is a WASI API for performing ML inference. Its name derives from the fact that ML models are also known as neural networks (nn). ML models are typically trained using a large data set, resulting in one or more files that describe the model's weights. The model is then used to compute an "inference," e.g., the probabilities of classifying an image as a set of tags. This API is initially concerned with inference, not training.
Why expose ML inference as a WASI API? Though the functionality of inference can be encoded into WebAssembly, there are two primary motivations for `wasi-nn`:

1. ease of use: `wasi-nn` is designed to make it easy to use existing model formats as-is
2. performance: `wasi-nn` gives WebAssembly programs access to the host's optimized, possibly hardware-accelerated ML implementations

WebAssembly programs that want to use a host's ML capabilities can access these capabilities through `wasi-nn`'s core abstractions: backends, graphs, and tensors. A user selects a backend for inference and loads a model, instantiated as a graph, to use in the backend. Then, the user passes tensor inputs to the graph, computes the inference, and retrieves the tensor outputs.
`wasi-nn` backends correspond to existing ML frameworks, e.g., TensorFlow, ONNX, OpenVINO, etc. `wasi-nn` places no requirements on hosts to support specific backends; the API is purposefully designed to allow the largest number of ML frameworks to implement it. `wasi-nn` graphs can be passed as opaque byte sequences to support any number of model formats. This makes the API framework- and format-agnostic, since we expect device vendors to provide the ML backend and support for their particular graph format.
Users can find language bindings for `wasi-nn` at the wasi-nn bindings repository; request additional language support there. More information about `wasi-nn` can be found at:
## Goals

The primary goal of `wasi-nn` is to allow users to perform ML inference from WebAssembly using existing models (i.e., ease of use) and with maximum performance. Though the primary focus is inference, we plan to leave open the possibility of performing ML training in the future (request training in an issue!).
Another design goal is to make the API framework- and model-agnostic; this allows the API to be implemented with multiple ML frameworks and model formats. The `load` method will return an error when an unsupported model encoding scheme is passed in. This approach is similar to how a browser deals with image or video encodings.
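In the same pseudo-Rust style as the walk-through below (the error variant and the `"model.onnx"` path are illustrative assumptions, not part of the API), a caller might handle an unsupported encoding like this:

```rust
// Sketch only: exact error values are backend- and binding-specific.
let bytes = std::fs::read("model.onnx").unwrap();
match wasi_nn::load(
    &[&bytes],
    wasi_nn::GRAPH_ENCODING_ONNX,
    wasi_nn::EXECUTION_TARGET_CPU,
) {
    Ok(graph) => {
        // Proceed: init_execution_context(graph), set_input, compute, ...
    }
    // A host without an ONNX backend rejects the bytes at load time,
    // much like a browser refusing an image format it cannot decode.
    Err(e) => eprintln!("unsupported model encoding: {e:?}"),
}
```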
Non-goals
wasi-nn
is not designed to provide support for individual ML operations (a "model builder" API).
The ML field is still evolving rapidly, with new operations and network topologies emerging
continuously. It would be a challenge to define an evolving set of operations to support in the API.
Instead, our approach is to start with a "model loader" API, inspired by WebNN’s model loader
proposal.
## API walk-through

The following example describes how a user would use `wasi-nn`:

```rust
// Load the model.
let encoding = wasi_nn::GRAPH_ENCODING_...;
let target = wasi_nn::EXECUTION_TARGET_CPU;
let graph = wasi_nn::load(&[bytes, more_bytes], encoding, target);

// Configure the execution context.
let context = wasi_nn::init_execution_context(graph);
let tensor = wasi_nn::Tensor { ... };
wasi_nn::set_input(context, 0, tensor);

// Compute the inference.
wasi_nn::compute(context);
wasi_nn::get_output(context, 0, &mut output_buffer, output_buffer.len());
```

Note that the details above will depend on the model and backend used; the pseudo-Rust simply illustrates the general idea, minus any error checking. Consult the AssemblyScript and Rust bindings for more detailed examples.
## Detailed design discussion

For the details of the API, see wasi-nn.wit.

### Should `wasi-nn` support training models?

Ideally, yes. In the near term, however, exposing (and implementing) the inference-focused API is sufficiently complex to postpone a training-capable API until later. Also, models are typically trained offline, prior to deployment, and it is unclear why training models using WASI would be an advantage over training them natively. (Conversely, the inference API does make sense: performing ML inference in a Wasm deployment is a known use case.) See the associated discussion here and feel free to open pull requests or issues related to this that fit within the goals above.
Should wasi-nn
support inspecting models?
Ideally, yes. The ability to inspect models would allow users to determine, at runtime, the tensor shapes of the inputs and outputs of a model. As with ML training (above), this can be added in the future.
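As a purely hypothetical sketch of what such inspection could enable (`describe_input` and `TensorDescription` are invented names; the current API defines neither), a caller could size its buffers at runtime instead of hard-coding per-model shapes:

```rust
// Hypothetical only: not part of the current wasi-nn API.
struct TensorDescription {
    dimensions: Vec<u32>,            // e.g., [1, 3, 224, 224] for an image model
    tensor_type: wasi_nn::TensorType,
}

// A runtime-discovered shape would replace hard-coded constants:
// let desc: TensorDescription = wasi_nn::describe_input(context, 0);
// let element_count: u32 = desc.dimensions.iter().product();
```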
## Considered alternatives

There are other ways to perform ML inference from a WebAssembly program:
## Stakeholder Interest & Feedback

TODO before entering Phase 3.
## References & acknowledgements

Many thanks for valuable feedback and advice from: