WebAssembly Module - Sections

sendilkumarn

Sendil Kumar

Posted on January 14, 2020

WebAssembly Module

The simplest WebAssembly module is just eight bytes:

00 61 73 6d 01 00 00 00

The first four bytes, 00 61 73 6d, are the magic number. In ASCII they read \0asm, which marks the file as a WebAssembly binary. The asm recalls asm.js, the predecessor of WebAssembly.

The next four bytes, 01 00 00 00, represent the version. Currently, the WebAssembly binary format is at version 1.
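To make the header concrete, here is a minimal Python sketch that checks the magic number and reads the version. The name parse_header is my own, and the sketch assumes the four version bytes are read as a little-endian 32-bit integer.

```python
import struct

# The eight-byte header: magic number first, then the version.
HEADER = bytes([0x00, 0x61, 0x73, 0x6D, 0x01, 0x00, 0x00, 0x00])


def parse_header(module_bytes):
    """Return the version number, or raise ValueError for a bad header.

    parse_header is a hypothetical helper written for this post.
    """
    if len(module_bytes) < 8:
        raise ValueError("a WebAssembly module needs at least 8 bytes")
    magic = module_bytes[0:4]
    if magic != b"\x00asm":
        raise ValueError(f"bad magic number: {magic!r}")
    # "<I" reads an unsigned 32-bit integer in little-endian byte order.
    (version,) = struct.unpack("<I", module_bytes[4:8])
    return version


print(parse_header(HEADER))  # prints 1
```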

Every WebAssembly module begins with this mandatory header. The header can be followed by sections such as:

  1. Function
  2. Code
  3. Start
  4. Table
  5. Memory
  6. Global
  7. Import
  8. Export
  9. Data

All of the sections listed above are optional. Only the magic header and the version are mandatory.

Upon receiving a WebAssembly module, the JavaScript engine decodes and validates it.




The validated module is then compiled and instantiated. During the instantiation phase, the JavaScript engine produces an instance. The instance is a record that holds all the accessible state of the module, organised as a tuple of sections and their contents.


How WebAssembly module is constructed

The WebAssembly module is split into sections. Each section contains a sequence of instructions or statements.

Header Information (MagicHeader Version)
    - function [function definitions]
    - import [import functions]
    - export [export functions]

Each section has a unique ID. The WebAssembly module uses this ID to refer to the respective section.

Header Information (MagicHeader Version)
    - (function section id)  [function definitions]
    - (import section id)  [import functions]
    - (export section id)  [export functions]

For example, the function section consists of a list of function definitions.

Header Information
    - function [add, subtract, multiply, divide]

Inside the module, a function is called by its list index. To call the add function, the module refers to the function at index 0 of the function section.

Section format

A WebAssembly module contains a set of sections. In the binary format, each section has the following structure:

<section id> <u32 section size> <Actual content of the section>

The first byte of every section is its unique section id.

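The section id is followed by an unsigned 32-bit integer (u32) that gives the size of the section's content in bytes. In the binary format this integer uses the variable-length LEB128 encoding, so small sizes take a single byte. Because the size is a u32, no section can exceed 2^32 - 1 bytes, roughly 4.29 gigabytes.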
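As a rough illustration of how a section size ends up in the file, here is a small Python sketch of unsigned LEB128 encoding. The function name encode_u32 is my own. Each output byte carries seven bits of the value, and the high bit signals that more bytes follow.

```python
def encode_u32(value):
    """Encode an unsigned 32-bit integer as unsigned LEB128.

    encode_u32 is a hypothetical helper written for this post.
    """
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value does not fit in a u32")
    out = bytearray()
    while True:
        byte = value & 0x7F          # take the low 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow: set the high bit
        else:
            out.append(byte)         # last byte: high bit stays clear
            return bytes(out)


print(encode_u32(7).hex())          # 07
print(encode_u32(300).hex())        # ac02
print(encode_u32(2**32 - 1).hex())  # ffffffff0f
```

A size below 128 costs one byte, while the largest possible size costs five.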

The remaining bytes are the content of the section. For most of the sections, the <Actual content of the section> is a vector.
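The id, size, content layout can be walked with a few lines of Python. This is only a sketch: decode_u32, iter_sections and SECTION_NAMES are names I made up, the id table is my reading of the version 1 binary format, and the sample bytes are a hand-assembled module with a type, a function and a code section.

```python
# Section ids as I understand them from the version 1 binary format.
SECTION_NAMES = {
    0: "custom", 1: "type", 2: "import", 3: "function", 4: "table",
    5: "memory", 6: "global", 7: "export", 8: "start", 9: "element",
    10: "code", 11: "data",
}


def decode_u32(data, offset):
    """Decode an unsigned LEB128 integer; return (value, next_offset)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:   # high bit clear: this was the last byte
            return result, offset
        shift += 7


def iter_sections(module_bytes):
    """Yield (section_id, content) for every section after the header."""
    offset = 8  # skip the magic number and the version
    while offset < len(module_bytes):
        section_id = module_bytes[offset]
        size, offset = decode_u32(module_bytes, offset + 1)
        yield section_id, module_bytes[offset:offset + size]
        offset += size


MODULE = bytes.fromhex(
    "0061736d01000000"        # header
    "01070160027f7f017f"      # type section
    "03020100"                # function section
    "0a09010700200020016a0b"  # code section
)

for section_id, content in iter_sections(MODULE):
    print(SECTION_NAMES[section_id], len(content), content.hex())
# type 7 0160027f7f017f
# function 2 0100
# code 9 010700200020016a0b
```

The sketch does no validation. A real decoder would also reject truncated sections and ids it does not know.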


Function

The function section holds a list of functions. The function section has the following format:

0x03 <section size> vector<function>[]

The unique section id of the function section is 0x03. It is followed by a u32 integer that gives the size of the function section. Vector<function>[] holds the list of functions.

Instead of using function names, the WebAssembly module uses the index of a function to call it. This keeps the binary small.

Every function in the Vector<function> is defined as follows:

<type signature> <locals> <body>

The <type signature> specifies the function signature, i.e., the types of the parameters and of the return value.

WebAssembly is size optimised. All the type signatures used in the module are defined in the type section (see the Type section below). The function only stores an index into the type section here.

The <locals> is a vector of values that are scoped inside the function. The locals are appended to the parameters that we pass to the function.

The <body> is a list of expressions. When evaluated, the expressions must produce the function's return type.

Note that the expressions here are not always pure. The globals of a WebAssembly module can be mutable, and the shared memory is mutable too.

To call a function, use call <function index> (the call instruction is represented by an opcode). The arguments are validated against the parameter types in the type signature. The types of the locals are read from the function's declaration, and the arguments of the function are then concatenated with the locals.

The function's body expression is then checked against the result type given in its type definition, that is, against the signature defined in the type section.

The spec encodes the locals and body fields separately, in the code section. The function section itself then only lists type indices, and each entry in the code section is matched to its function by index.
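The index-based pairing can be sketched in Python as below. Everything here is illustrative: the data is a hypothetical, already decoded module with two functions, the body strings are plain text rather than real bytecode, and describe_functions is a helper name I made up.

```python
# Hypothetical decoded contents of three sections (not real binary data).
types = [
    (("i32", "i32"), ("i32",)),  # type 0: (i32, i32) -> i32
    ((), ()),                    # type 1: () -> ()
]
function_section = [0, 1]        # function i uses types[function_section[i]]
code_section = [
    {"locals": [], "body": "local.get 0, local.get 1, i32.add"},
    {"locals": ["i32"], "body": "nop"},
]


def describe_functions(types, function_section, code_section):
    """Join the function and code sections entry by entry, by index."""
    if len(function_section) != len(code_section):
        raise ValueError("function and code sections must have equal length")
    functions = []
    for index, (type_index, code) in enumerate(zip(function_section, code_section)):
        params, results = types[type_index]
        functions.append({
            "index": index,
            "params": params,
            "results": results,
            "locals": code["locals"],
            "body": code["body"],
        })
    return functions


for function in describe_functions(types, function_section, code_section):
    print(function["index"], function["params"], "->", function["results"])
# 0 ('i32', 'i32') -> ('i32',)
# 1 () -> ()
```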

The order of the type and function sections matters. When hacking on the raw bytecode, take care to preserve this order. See the code section below.


Type

A WebAssembly module with one or more functions starts with a type section.

Everything is strictly typed in WebAssembly. Every function must have a type signature attached to it.

To keep the binary small, a WebAssembly module stores the type signatures in a vector and refers to them by index in the function section.

The type section is of the following format:

0x01 <section size> vector<type>[]

The unique section id of the type section is 0x01. As with every section, the id is followed by the section size and then by the content, here Vector<type>[], which holds the list of types.

Every type in the Vector<type> is defined as follows:

0x60 [vec-for-parameter-type] [vec-for-return-type]

The 0x60 marks the entry as a function type. It is followed by the vector of parameter types and the vector of return types.

The binary format also defines encodings for value, result, memory, table and global types. These are written inline where they are used rather than in the type section. In version 1, the type section holds only function types.

A value type is one of f64, f32, i64 or i32, that is, a number type. Inside the WebAssembly binary, they are represented by 0x7C, 0x7D, 0x7E and 0x7F respectively.
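Here is a minimal Python sketch that encodes one function type using the byte values above. The name encode_func_type is my own, and to keep things short the sketch assumes fewer than 128 parameters or results, so that each vector length fits in one byte.

```python
VALUE_TYPES = {"i32": 0x7F, "i64": 0x7E, "f32": 0x7D, "f64": 0x7C}
FUNC_TYPE = 0x60


def encode_func_type(params, results):
    """Encode one type entry: 0x60, parameter vector, result vector.

    encode_func_type is a hypothetical helper written for this post.
    """
    out = bytearray([FUNC_TYPE])
    for types in (params, results):
        if len(types) >= 128:
            raise ValueError("sketch only handles vectors shorter than 128")
        out.append(len(types))                     # vector length
        out.extend(VALUE_TYPES[t] for t in types)  # one byte per value type
    return bytes(out)


# The signature (i32, i32) -> i32, as used by an add function.
print(encode_func_type(["i32", "i32"], ["i32"]).hex())  # 60027f7f017f
```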

Note: The type information might change in the future when WebAssembly starts to support other types.


Code

The code section holds a list of code entries. Each code entry pairs a list of locals (value types) with a Vector<expressions>[].

The code section is of the following format:

0x0A <section size> Vector<code>[]

Every code in the Vector<code> is defined as follows:

<code size> <actual code>

The <actual code> is of the following format:

vector<locals>[] <expressions>

The vector<locals>[] here declares the locals scoped inside the function. At run time these are indexed after the parameters, which gives the concatenated list of parameters and locals. The <expressions> evaluate to the return type.
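As a hedged illustration, the Python sketch below encodes a single code entry for a function that adds its two parameters. The helper name encode_code_entry is my own. The opcodes for local.get, i32.add and end, and the grouping of locals as (count, type) pairs, are taken from my reading of the version 1 binary format. The sketch assumes every count and size stays below 128 so that each fits in one byte.

```python
LOCAL_GET = 0x20
I32_ADD = 0x6A
END = 0x0B  # every function body is terminated by the end opcode


def encode_code_entry(local_groups, body):
    """Encode one code entry: size, locals vector, body, end.

    local_groups is a list of (count, value_type_byte) pairs.
    encode_code_entry is a hypothetical helper written for this post.
    """
    func = bytearray([len(local_groups)])
    for count, type_byte in local_groups:
        func += bytes([count, type_byte])
    func += body
    func.append(END)
    if len(func) >= 128 or len(local_groups) >= 128:
        raise ValueError("sketch only handles sizes below 128")
    return bytes([len(func)]) + bytes(func)


# local.get 0, local.get 1, i32.add
add_body = bytes([LOCAL_GET, 0, LOCAL_GET, 1, I32_ADD])
print(encode_code_entry([], add_body).hex())  # 0700200020016a0b
```

The add function declares no locals of its own, so the locals vector is empty and the body reads its two parameters through indices 0 and 1.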


Start

The start section names a function that is called as soon as the WebAssembly module is instantiated.

The start function is similar to other functions, except that it takes no parameters and returns no value. It runs during instantiation, after the module's tables and memory have been initialised.

The start section of a WebAssembly module points to a function index (the index of the location of the function inside the function section).

The section id of the start section is 8. When decoded, the start section becomes the start component of the module.
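Since the start section holds only a function index, its encoding is tiny. Here is a minimal Python sketch. The name encode_start_section is my own, and the sketch assumes the function index is below 128 so that it fits in one byte.

```python
START_SECTION_ID = 8


def encode_start_section(function_index):
    """Encode a start section: id, size, function index.

    encode_start_section is a hypothetical helper written for this post.
    """
    if not 0 <= function_index < 128:
        raise ValueError("sketch only handles indices below 128")
    content = bytes([function_index])
    return bytes([START_SECTION_ID, len(content)]) + content


print(encode_start_section(0).hex())  # 080100
```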

At the time of writing, Webpack does not support the start section. The bundler rewrites the start section into a normal function call, which runs when the JavaScript is initialised.


Import section - contains the vector of imported functions.

Export section - contains the vector of exported functions.



If you have enjoyed the post, then you might like my book on Rust and WebAssembly. Check them out here


Discussions 🐦 Twitter // 💻 GitHub // ✍️ Blog // 🔸 HackerNews

If you like this article, please leave a like or a comment. ❤️

