- The sqlite3.wasm Namespace
- Memory Management
- Bridging JS/WASM Functions
- Generic Utility Functions
- WASM-to-JS Peculiarities
The sqlite3.wasm namespace, abbreviated as wasm for the remainder of this page, holds a number of routines for working with WASM-side constructs. They include APIs for such tasks as...
- Memory management.
- Allocating and freeing memory.
- Helpers for working with WASM heap memory, e.g. getting and setting primitive values from/to the WASM heap.
- Configurable result value and argument type conversion for WASM-exported functions.
- JS/C String conversions.
- Binding JS functions into the WASM runtime, so that they may be called from WASM code (i.e. from C).
In short, if a WASM-specific feature has been needed during the development of the sqlite3 JS API, it's been added to this namespace. For the most part, high-level client code will rarely need to make use of more than a few of these, whereas clients using the C-style APIs may make heavy use of them.
The sqlite3.wasm.exports namespace
The sqlite3.wasm.exports namespace object is a WASM-standard part of the WASM module file and contains all "exported" C functions which are built into the WASM module, as well as certain non-function values which are part of the WASM module. The functions which live in this object are as low-level as it gets, in terms of JS/C bindings. They perform no automatic type conversions on their arguments or result values and many, perhaps most, are cumbersome to use from JS because of that. This level of the API is not generally recommended for client use but is available for those who want to make use of it. The functions in this object which are intended for client-side use are re-exported into the sqlite3.capi namespace and have automatic type conversions applied to them (where applicable). A small handful of the functions get re-exported into the sqlite3.wasm namespace.
The only symbols in exports which are part of this project's APIs are:
- Functions named sqlite3_...(), with the following exceptions:
  - All functions with two underscores after sqlite3, e.g. sqlite3__wasm_...(), are internal-use APIs, subject to change or removal at any time.
  - Functions named sqlite3_wasm_...() are not part of the client-side API unless they are re-exported into the sqlite3.wasm namespace. The remainder are for internal use by the JS bindings and do not have stable APIs. Similarly...
  - Functions named sqlite3_wasm_test_...() are solely for use in this project's own tests and may be elided from any given build.
- Memory allocation functions: semantically free() and malloc(), but the canonical builds use sqlite3_free() and sqlite3_malloc(). These are exposed to clients as described below, but the C-level form sqlite3_malloc() is useful if clients need a variant which does not throw an exception on out-of-memory conditions.
- WASM Memory object (which may be optional, depending on build options): memory
- __indirect_function_table is the de facto standard name for the exported functions table. This API exposes it via sqlite3.wasm.functionTable().
The build process will include other functions and objects in the exports namespace which are not part of this project's public interface and should not be used by client code. They may differ in any given build of the WASM file and will certainly differ across build environments.
Memory Management
Just like in C, WASM offers a memory "heap," and transferring values between JS and WASM often requires manipulation of that memory, including low-level allocation and deallocation of it. The following subsections describe the various memory management APIs.
Low-level Management
The lowest-level memory management works like C's standard malloc(), realloc(), and free(), the one difference being that exceptions are used for reporting out-of-memory conditions. In order to avoid certain API misuses caused by mixing different allocators, the canonical sqlite3.js builds wrap sqlite3_malloc(), sqlite3_realloc(), and sqlite3_free() instead of malloc(), realloc(), and free(), but the semantics of both sets are effectively identical.
Listed in alphabetical order...
alloc()
pointer alloc(n)
pointer alloc.impl(n) (non-throwing)
Allocates n bytes of memory from the WASM heap and returns the address of the first byte in the block. alloc() throws a WasmAllocError if allocation fails. If non-throwing allocation is required, use alloc.impl(n), which returns a WASM NULL pointer (the integer 0) if allocation fails.
Note that memory allocated this way is not automatically zeroed out. In practice that has not proven to be a problem (in JS, at least) because memory is only explicitly allocated when it has a specific use and will be populated by the code which allocates it.
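As a rough sketch of how alloc() and dealloc() are typically paired, the hypothetical helper below allocates a block, zeroes it explicitly (because alloc() does not), hands it to a callback, and always frees it afterwards. The name withZeroedBlock() is an illustration only, and its wasm parameter is assumed to be sqlite3.wasm or any object offering the same alloc(), dealloc(), and heap8u() methods.

```javascript
// withZeroedBlock() is a hypothetical helper, not part of the sqlite3 API.
// `wasm` is assumed to be sqlite3.wasm (or an object with the same methods).
function withZeroedBlock(wasm, n, callback) {
  const ptr = wasm.alloc(n); // throws WasmAllocError on allocation failure
  try {
    // Fetch the heap view only after allocating, since allocation may
    // grow the heap and invalidate older views.
    wasm.heap8u().fill(0, ptr, ptr + n);
    return callback(ptr);
  } finally {
    wasm.dealloc(ptr); // release the block even if the callback throws
  }
}
```

The try/finally keeps the block from leaking when the callback throws, which mirrors the usage pattern this page shows for the scoped allocator.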
allocCString()
pointer allocCString(jsString, returnWithLength=false)
Uses alloc() to allocate enough memory for the byte-length of the given JS string, plus 1 (for a NUL terminator), copies the given JS string to that memory using jstrcpy(), NUL-terminates it, and returns the pointer to that C-string. Ownership of the pointer is transferred to the caller, who must eventually pass the pointer to dealloc() to free it.
If passed a truthy 2nd argument then its return semantics change: it returns [ptr,n], where ptr is the C-string's pointer and n is its cstrlen().
allocMainArgv()
pointer allocMainArgv(list)
Creates a C-style array, using alloc(), suitable for passing to a C-level main() routine. The input is a collection with a length property and a forEach() method. A block of memory list.length entries long is allocated and each pointer-sized block of that memory is populated with the allocCString() conversion of the (''+value) of each element. Returns a pointer to the start of the list, suitable for passing as the 2nd argument to a C-style main() function.
Throws if list.length is falsy.
Note that the returned value is troublesome to deallocate but it is intended for use with calling a C-level main() function, where the strings must live as long as the application. See scopedAllocMainArgv() for a variant which is trivial to deallocate.
allocPtr()
pointer allocPtr(howMany=1, safePtrSize=true)
Allocates one or more pointers as a single chunk of memory and zeroes them out.
The first argument is the number of pointers to allocate. The second specifies whether they should use a "safe" pointer size (8 bytes) or whether they may use the default pointer size (typically 4 but also possibly 8).
How the result is returned depends on its first argument: if passed 1, it returns the allocated memory address. If passed more than one then an array of pointer addresses is returned, which can optionally be used with "destructuring assignment" like this:
const [p1, p2, p3] = allocPtr(3);
ACHTUNG: when freeing the memory, pass only the first result value to dealloc(). The others are part of the same memory chunk and must not be freed separately.
The reason for the 2nd argument is...
When one of the returned pointers will refer to a 64-bit value, e.g. a double or int64, and that value must be written or fetched, e.g. using poke() or peek(), it is important that the pointer in question be aligned to an 8-byte boundary or else it will not be fetched or written properly and will corrupt or read neighboring memory. It is only safe to pass false when the client code is certain that it will only set/fetch 4-byte values (or smaller).
dealloc()
void dealloc(pointer)
Frees memory returned by alloc(). Results are undefined if it is passed any value other than a value returned by alloc() or null/undefined/0 (all of which are no-ops).
alloc() is not named malloc().
realloc()
pointer realloc(ptr,size)
pointer realloc.impl(ptr,size) (non-throwing)
Semantically equivalent to realloc(3) or sqlite3_realloc(), this routine reallocates memory allocated via this routine or alloc(). Its first argument is either 0 or a pointer returned by this routine or alloc(). Its second argument is the number of bytes to (re)allocate, or 0 to free the memory specified in the first argument. On allocation error, realloc() throws a WasmAllocError, whereas realloc.impl() will return 0 on allocation error.
Beware that assigning the return value of realloc.impl() back to the variable which holds the original pointer is poor practice and can lead to leaks of heap memory:
let m = wasm.realloc(0, 10); // allocate 10 bytes
m = wasm.realloc.impl(m, 20); // grow m to 20 bytes
If that reallocation fails, it will return 0, overwriting m and effectively leaking the first allocation.
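One way to avoid that leak is to hold the result in a separate variable and replace the original pointer only on success. The sketch below assumes that approach; growBlock() is a hypothetical helper, its wasm parameter is assumed to be sqlite3.wasm, and newSize is assumed to be greater than 0 (a size of 0 frees the block instead).

```javascript
// growBlock() is a hypothetical helper, not part of the sqlite3 API.
// `wasm` is assumed to be sqlite3.wasm; newSize is assumed to be > 0.
function growBlock(wasm, ptr, newSize) {
  const grown = wasm.realloc.impl(ptr, newSize); // 0 on allocation failure
  if (0 === grown) {
    // ptr is untouched and still owned by the caller, so nothing leaks.
    return { ptr, ok: false };
  }
  // On success the old pointer must no longer be used.
  return { ptr: grown, ok: true };
}
```

A caller can then decide how to react to a failed resize while still being able to pass the original pointer to dealloc().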
sizeofIR()
int sizeofIR(string)
For the given IR-like string in the set ('i8', 'i16', 'i32', 'f32', 'float', 'i64', 'f64', 'double', '*'), or any string value ending in '*', returns the sizeof for that value (wasm.ptrSizeof in the latter case). For any other value, it returns the undefined value.
Some allocation routines use this to enable callers to pass them an IR value instead of an integer.
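The mapping described above can be pictured with a small stand-alone model. This is only a sketch of the documented behavior, not the library's code: toySizeofIR() is a hypothetical name, and its ptrSizeof parameter stands in for wasm.ptrSizeof (assumed to be 4 here, as in 32-bit WASM builds).

```javascript
// toySizeofIR() mirrors the documented results of sizeofIR(); it is an
// illustration only. ptrSizeof stands in for wasm.ptrSizeof.
function toySizeofIR(ir, ptrSizeof = 4) {
  switch (ir) {
    case 'i8': return 1;
    case 'i16': return 2;
    case 'i32': case 'f32': case 'float': return 4;
    case 'i64': case 'f64': case 'double': return 8;
    default:
      // Any string ending in '*' is pointer-sized; everything else is unknown.
      return (typeof ir === 'string' && ir.endsWith('*')) ? ptrSizeof : undefined;
  }
}
```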
"Scoped" Allocation Management
It is often convenient to manage allocations in such a way that all allocations made in a particular block are "automatically" cleaned up when that block exits. This API provides "scoped" allocation routines which work this way.
Listed below in the typical order of their use...
scopedAllocPush()
opaque scopedAllocPush()
Opens a new "scope" for allocations. All allocations made via the scopedAllocXyz() APIs will store their results into the current (most recently pushed) allocation scope for later cleanup. The returned value must be retained for passing to scopedAllocPop().
Any number of scopes may be active at once, but they must be popped in reverse order of their creation, i.e. they must nest in a manner equivalent to C-style scopes.
Warnings:
- All of the other scopedAllocXyz() routines will throw if no scope is active.
- It is never legal to pass the result of a scoped allocation to dealloc(), and doing so will cause a double-free when the scope is closed with scopedAllocPop().
This function and its relatives have only a single intended usage pattern:
const scope = wasm.scopedAllocPush();
try {
... use scopedAllocXyz() routines ...
// It is perfectly legal to use non-scoped allocations here,
// they just won't be cleaned up when...
}finally{
wasm.scopedAllocPop(scope);
}
scopedAlloc()
pointer scopedAlloc(n)
Works just like alloc(n) but stores the result of the allocation in the current scope.
This function's read-only level property resolves to the current allocation scope depth.
scopedAllocMainArgv()
pointer scopedAllocMainArgv(array)
This function works exactly like allocMainArgv() but is scoped to the current allocation scope and its contents will be freed when the current allocation scope is popped.
scopedAllocCall()
any scopedAllocCall(callback)
Calls scopedAllocPush(), calls the given callback, and then calls scopedAllocPop(), propagating any exception from the callback or returning its result. This is essentially a convenience form of:
const scope = wasm.scopedAllocPush();
try { return callback() }
finally{ wasm.scopedAllocPop(scope) }
scopedAllocCString()
pointer scopedAllocCString(jsString, returnWithLength=false)
Works just like allocCString() but stores the result of the allocation in the current scope.
scopedAllocMainArgv()
pointer scopedAllocMainArgv(list)
Works just like allocMainArgv() but stores the various allocations in the current scope.
scopedAllocPtr()
pointer scopedAllocPtr(howMany=1, safePtrSize=true)
Works just like allocPtr() but stores the result of the allocation in the current scope.
scopedAllocPop()
void scopedAllocPop(opaque)
Given a value returned from scopedAllocPush(), this "pops" that allocation scope and frees all memory allocated in that scope by the scopedAllocXyz() family of APIs.
It is technically legal to call this without any argument, but passing an argument allows the allocator to perform sanity checking to ensure that scopes are pushed and popped in the proper order (it throws if they are not). Failing to pass an argument is not illegal but will make that sanity check impossible.
Trivia: in some regions of the U.S. this function might be better known as scopedAllocSoda() or scopedAllocCola().
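The push/pop rules described in this section (scopes nest, pops must happen in reverse order, and popping frees everything recorded in that scope) can be modelled in a few lines of plain JS. The ToyScopes class below is purely illustrative and is not how the library is implemented; freeFn is a stand-in for wasm.dealloc(), and track() is a hypothetical stand-in for what the scopedAllocXyz() routines do with their results.

```javascript
// ToyScopes is an illustration of the scoping rules only.
// freeFn stands in for wasm.dealloc().
class ToyScopes {
  constructor(freeFn) {
    this.freeFn = freeFn;
    this.stack = []; // each entry is an array of pointers owned by that scope
  }
  push() {
    const scope = [];
    this.stack.push(scope);
    return scope; // opaque handle for pop()
  }
  track(ptr) {
    if (0 === this.stack.length) throw new Error("No allocation scope is active.");
    this.stack[this.stack.length - 1].push(ptr);
    return ptr;
  }
  pop(scope) {
    if (0 === this.stack.length) throw new Error("No allocation scope is active.");
    const current = this.stack[this.stack.length - 1];
    // The sanity check is only possible when the handle is passed in.
    if (undefined !== scope && scope !== current) {
      throw new Error("Scopes popped out of order.");
    }
    this.stack.pop();
    for (const ptr of current) this.freeFn(ptr);
  }
}
```

Passing the handle to pop() is what makes the out-of-order check possible, which is why retaining the value returned from the push is recommended.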
"PStack" Allocation
The "pstack" (pseudo-stack) API is a special-purpose allocator intended solely for use with allocating small amounts of memory such as that needed for output pointers. It is more efficient than the scoped allocation API, and covers many of the use cases for that API, but it has a tiny static memory limit (with an unspecified total size no less than 4kb).
The pstack API is typically used like:
const pstack = sqlite3.wasm.pstack;
const stackPtr = pstack.pointer;
try {
const ptr = pstack.alloc(8);
// ==> pstack.pointer === ptr
const otherPtr = pstack.alloc(8);
// ==> pstack.pointer === otherPtr
...
}finally{
pstack.restore(stackPtr);
// ==> pstack.pointer === stackPtr
}
The pstack methods and properties are listed below in alphabetical order.
alloc()
pointer alloc(n)
Attempts to allocate the given number of bytes from the pstack. On success, it zeroes out a block of memory of the given size, adjusts the pstack pointer, and returns a pointer to the memory. On error, it throws a WasmAllocError. The memory must eventually be released using pstack.restore().
The n argument may be a string accepted by wasm.sizeofIR(), and any string value not accepted by that function will trigger a WasmAllocError exception.
This method always adjusts the given value to be a multiple of 8 bytes because failing to do so can lead to incorrect results when reading and writing 64-bit values from/to the WASM heap. Similarly, the returned address is always 8-byte aligned.
allocChunks()
array allocChunks(n, sz)
Allocates n chunks, each sz bytes, as a single memory block and returns the addresses as an array of n elements, each holding the address of one chunk.
The sz argument may be a string value accepted by wasm.sizeofIR(), and any string value not accepted by that function will trigger a WasmAllocError exception.
Throws a WasmAllocError if allocation fails.
Example:
const [p1, p2, p3] = pstack.allocChunks(3,4);
allocPtr()
mixed allocPtr(n=1,safePtrSize=true)
A convenience wrapper for allocChunks() which sizes each chunk as either 8 bytes (safePtrSize is truthy) or wasm.ptrSizeof (if safePtrSize is falsy).
How it returns its result differs depending on its first argument: if it's 1, it returns a single pointer value. If it's more than 1, it returns the same as allocChunks().
When any returned pointers will refer to a 64-bit value, e.g. a double or int64, and that value must be written or fetched, e.g. using wasm.poke() or wasm.peek(), it is important that the pointer in question be aligned to an 8-byte boundary or else it will not be fetched or written properly and will corrupt or read neighboring memory.
However, when all pointers involved point to "small" data, it is safe to pass a falsy value to save a tiny bit of memory.
pointer
This property resolves to the current pstack position pointer. This value is intended only to be saved for passing to restore(). Writing to this memory without first reserving it via pstack.alloc() (or equivalent) leads to undefined results.
quota
This property resolves to the total number of bytes available in the pstack, including any space which is currently allocated. This value is a compile-time constant.
remaining
This property resolves to the amount of space remaining in the pstack.
restore()
void restore(pstackPtr)
Sets the current pstack position to the given pointer. Results are undefined if the passed-in value did not come from pstack.pointer or if memory allocated in the space before the given pointer is used after this call.
Getting/Setting Memory Values
The WASM memory heap is exposed to JS as a byte array of memory which is made to appear contiguous (though it's really allocated in chunks). Given a byte-oriented view of the heap, it is possible to read and write individual bytes of the heap, just like in C:
const X = wasm.heap8u(); // a uint8-oriented view of the heap
X[someAddress] = 0x2a;
console.log( X[someAddress] ); // ==> 42
Obviously, writing arbitrary addresses can corrupt the WASM heap, just like in C, so one has to be careful with the memory addresses they work with (just like in C!).
Tip: it is important never to hold on to objects returned from methods like heap8u() long-term, as they may be invalidated if the heap grows. It is acceptable to hold the reference for a brief series of calls, so long as those calls are guaranteed not to allocate memory on the WASM heap, but it should never be cached for later use.
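The invalidation mentioned in that tip can be reproduced without sqlite3 at all. The following stand-alone demonstration uses a bare WebAssembly.Memory as a stand-in for the WASM heap (an assumption made for illustration; sqlite3's own heap is managed by the module): growing the memory detaches the old ArrayBuffer, so a view created before the growth becomes empty.

```javascript
// Stand-alone demonstration with a bare WebAssembly.Memory, not sqlite3's heap.
const demoMemory = new WebAssembly.Memory({ initial: 1 }); // 1 page = 64 KiB
const staleView = new Uint8Array(demoMemory.buffer);
staleView[0] = 0x2a;

demoMemory.grow(1); // growing detaches the previous ArrayBuffer

const freshView = new Uint8Array(demoMemory.buffer); // re-fetch after growth
console.log(staleView.length); // 0: the stale view no longer sees the heap
console.log(freshView[0]);     // 42: the data itself survived the growth
```

This is why the heap accessors should be called again after any operation which might allocate, rather than caching their results.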
Before describing the routines for manipulating the heap, we first need to look at data type descriptors, sometimes referred to as "IR" (internal representation). These are short strings which identify the specific data types supported by WASM and/or the JS/WASM glue code:
- i8: 8-bit signed integer
- i16: 16-bit signed integer
- i32: 32-bit signed integer. Aliases: int, *, ** (noting that * and ** may be remapped dynamically to i64 when WASM environments gain 64-bit pointer capabilities).
- i64: 64-bit signed integer. APIs which use this require that the application has been built with BigInt support, and will throw if that is not the case.
- f32: 32-bit floating point value. Alias: float
- f64: 64-bit floating point value. Alias: double
These are used extensively by the memory accessor APIs and need to be committed to memory.
TODO: explain how the alignment of values within the heap affects how they are accessed. In practice it's generally not an issue until/unless one allocates memory in chunks and divvies it up into sub-chunks themselves. In short: when reading or writing values of a given size, it must normally be done at a heap address which is precisely an even multiple of that size.
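To make the alignment rule concrete, here is a small stand-alone sketch. It assumes that heap access goes through typed-array views indexed by element (so a byte address has to be divided by the element size), and it uses a 16-byte ArrayBuffer as a stand-in for the WASM heap. The names toyHeap32, alignUp(), toyPoke32(), and toyPeek32() are all hypothetical.

```javascript
// Illustration only: a 16-byte ArrayBuffer stands in for the WASM heap.
const toyBuffer = new ArrayBuffer(16);
const toyHeap32 = new Int32Array(toyBuffer); // 4-byte elements

// Rounds n up to the next multiple of size (size must be a power of 2).
const alignUp = (n, size) => (n + size - 1) & ~(size - 1);

// A byte address maps to element index address/4, which is only an
// integer when the address is a multiple of 4.
function toyPoke32(address, value) {
  if (address % 4 !== 0) throw new RangeError("Unaligned i32 address: " + address);
  toyHeap32[address / 4] = value;
}

function toyPeek32(address) {
  if (address % 4 !== 0) throw new RangeError("Unaligned i32 address: " + address);
  return toyHeap32[address / 4];
}
```

When dividing one allocation into sub-chunks, passing each sub-chunk's offset through something like alignUp(offset, 8) keeps every chunk usable for 64-bit values.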
The following routines are available for accessing memory addresses in various ways...
peek() and variants
number peek(address [,representation='i8'])
array peek(array-of-addresses [,representation='i8'])
The first form fetches a single value from memory. The second form fetches the value from each pointer in the given array and returns the array of values. The heap view used for reading the memory is specified by the second argument, defaulting to a byte-oriented view.
If the 2nd argument ends with "*" then the pointer-sized representation is always used (currently always 32 bits).
Example:
let i32 = wasm.peek(myPtr, 'i32');
Several convenience forms of peek() are available which simply forward to peek() with a specific 2nd argument:
- peekPtr() (deprecated alias: getPtrValue()): equivalent to peek(X,'*'). Most frequently used for fetching output pointer values.
- peek8(): equivalent to peek(X,'i8')
- peek16(): equivalent to peek(X,'i16')
- peek32(): equivalent to peek(X,'i32')
- peek64(): equivalent to peek(X,'i64'). Will throw if the environment is not configured with BigInt support.
- peek32f(): equivalent to peek(X,'f32')
- peek64f(): equivalent to peek(X,'f64')
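Since peekPtr() is most often used for output pointers, a sketch of that pattern may help. It combines the scoped allocator documented earlier on this page with peekPtr(). The helper readOutputPointer() is hypothetical, its wasm parameter is assumed to be sqlite3.wasm, and fillOutPtr stands in for any call which accepts an output-pointer argument and writes a pointer value into it.

```javascript
// readOutputPointer() is a hypothetical helper, not part of the sqlite3 API.
// `wasm` is assumed to be sqlite3.wasm; fillOutPtr(ppOut) stands in for any
// call which writes a pointer into the slot at address ppOut.
function readOutputPointer(wasm, fillOutPtr) {
  const scope = wasm.scopedAllocPush();
  try {
    const ppOut = wasm.scopedAllocPtr(); // one zeroed, pointer-sized slot
    fillOutPtr(ppOut);
    return wasm.peekPtr(ppOut);          // fetch the value written into the slot
  } finally {
    wasm.scopedAllocPop(scope);          // frees the slot, not what it points to
  }
}
```

Only the slot itself is released when the scope is popped. Ownership of whatever the returned pointer refers to depends on the function which filled it in.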
heapForSize() and Friends
TypedArray heapForSize(n [,unsigned=true])
Requires n to be one of:
- integer 8, 16, or 32.
- An integer-type TypedArray constructor: Int8Array, Int16Array, Int32Array, or their Uint counterparts.
If BigInt support is enabled, it also accepts the value 64 or a BigInt64Array/BigUint64Array, else it throws if passed 64 or one of those constructors.
Returns an integer-based TypedArray view of the WASM heap memory buffer associated with the given block size. If passed an integer as the first argument and unsigned is truthy then the "U" (unsigned) variant of that view is returned, else the signed variant is returned. If passed a TypedArray value, the 2nd argument is ignored. Note that Float32Array and Float64Array views are not supported by this function.
Be aware that growth of the heap may invalidate any references to this heap, so do not hold a reference longer than needed and do not use a reference after any operation which may allocate. Instead, re-fetch the reference by calling this function again, which automatically refreshes the view if needed.
Throws if passed an invalid n.
Use of this function in client code is very rare. In practice, one of the (faster) convenience forms is used:
- heap8() → Int8Array
- heap8u() → Uint8Array
- heap16() → Int16Array
- heap16u() → Uint16Array
- heap32() → Int32Array
- heap32u() → Uint32Array
poke()
object poke(address, number [,representation='i8'])
object poke(array-of-addresses, number [,representation='i8'])
Fetches the heapForSize() for the given representation then writes the given numeric value to it. Only numbers may be written this way, and passing a non-number might trigger an exception. If passed an array of pointers, it writes the given value to all of them.
Returns this.
Several convenience forms of poke() exist which simply forward to that method with a specific 3rd argument:
- pokePtr() (deprecated alias: setPtrValue()): equivalent to poke(X,Y,'*'). Most frequently used for clearing output pointer values.
- poke8(): equivalent to poke(X,Y,'i8')
- poke16(): equivalent to poke(X,Y,'i16')
- poke32(): equivalent to poke(X,Y,'i32')
- poke64(): equivalent to poke(X,Y,'i64'). Will throw if this environment is not configured with BigInt support.
- poke32f(): equivalent to poke(X,Y,'f32')
- poke64f(): equivalent to poke(X,Y,'f64')
String Conversion and Utilities
Passing strings into and out of WASM is frequently required, but how JS and C code represent strings varies significantly. The following routines are available for conversion of strings and related algorithms.
Listed below in alphabetical order...
cArgvToJs()
array cArgvToJs(int argc, pointer-to-pointer pArgv)
Expects to be given a C-style string array and its length. It returns a JS array of strings and/or null values: any entry in the pArgv array which is NULL results in a null entry in the result array. If argc is 0 then an empty array is returned.
Results are undefined if any entry in the first argc entries of pArgv is neither 0 (NULL) nor a legal UTF-8 C string.
To be clear, the expected C-style arguments to be passed to this function are (int, char **) (optionally const-qualified).
cstrToJs()
string cstrToJs(ptr)
Expects its argument to be a pointer into the WASM heap memory which refers to a NUL-terminated C-style string encoded as UTF-8. This function counts its byte length using cstrlen() then returns a JS-format string representing its contents. As a special case, if the argument is falsy, null is returned.
cstrlen()
int cstrlen(ptr)
Expects its argument to be a pointer into the WASM heap memory which refers to a NUL-terminated C-style string encoded as UTF-8. Returns the length, in bytes, of the string, as for strlen(3). As a special case, if the argument is falsy then it returns null. Throws if the argument is out of range for wasm.heap8u().
cstrncpy()
int cstrncpy(tgtPtr, srcPtr, n)
Works similarly to C's strncpy(3), copying, at most, n bytes (not characters) from srcPtr to tgtPtr. It copies until n bytes have been copied or a 0 byte is reached in srcPtr. Unlike strncpy(), it returns the number of bytes it assigns in tgtPtr, including the NUL byte (if any). If n is reached before a NUL byte in srcPtr, tgtPtr will not be NUL-terminated. If a NUL byte is reached before n bytes are copied, tgtPtr will be NUL-terminated.
If n is negative, cstrlen(srcPtr)+1 is used to calculate it, the +1 being for the NUL byte.
Throws if tgtPtr or srcPtr are falsy. Results are undefined if:
- Either is not a pointer into the WASM heap or
- srcPtr is not NUL-terminated AND n is less than srcPtr's logical length.
ACHTUNG: when passing in a non-negative n value, it is possible to copy partial multi-byte characters this way, and converting such strings back to JS strings will have undefined results.
jstrcpy()
int jstrcpy(jsString, TypedArray tgt, offset = 0, maxBytes = -1, addNul = true)
Forewarning: this API is somewhat complicated and is, in practice, never needed from client code.
Encodes the given JS string as UTF-8 into the given TypedArray tgt (which must be an Int8Array or Uint8Array), starting at the given offset and writing, at most, maxBytes bytes (including the NUL terminator if addNul is true, else no NUL is added). If it writes any bytes at all and addNul is true, it always NUL-terminates the output, even if doing so means that the NUL byte is all that it writes.
If maxBytes is negative (the default) then it is treated as the remaining length of tgt, starting at the given offset.
If writing the last character would surpass the maxBytes count because the character is multi-byte, that character will not be written (as opposed to writing a truncated multi-byte character). This can lead to it writing as many as 3 fewer bytes than maxBytes specifies.
Returns the number of bytes written to the target, including the NUL terminator (if any). If it returns 0, it wrote nothing at all, which can happen if:
- jsString is empty and addNul is false.
- offset < 0.
- maxBytes === 0.
- maxBytes is less than the byte length of a multi-byte jsString[0].
Throws if tgt is not an Int8Array or Uint8Array.
In C's strcpy(), the destination pointer is the first argument. That is not the case here primarily because the 3rd+ arguments are all referring to the destination, so it seems to make sense to have them grouped with it.
Emscripten's counterpart of this function, stringToUTF8Array(), returns the number of bytes written sans NUL terminator. That is, however, ambiguous: str.length===0 or maxBytes===(0 or 1) all cause 0 to be returned.
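The "never write a truncated multi-byte character" behavior described for jstrcpy() has a close analogue in the standard TextEncoder.encodeInto() method, which likewise stops before a character that does not fit. The sketch below uses that standard API rather than jstrcpy() itself, so it illustrates the idea only; copyWholeChars() is a hypothetical name, and no NUL terminator is involved.

```javascript
// copyWholeChars() is a hypothetical stand-in built on the standard
// TextEncoder API; it is not jstrcpy() and adds no NUL terminator.
function copyWholeChars(jsString, maxBytes) {
  const target = new Uint8Array(maxBytes);
  // encodeInto() never splits a character across the end of the target.
  const { read, written } = new TextEncoder().encodeInto(jsString, target);
  return { target, read, written };
}

// "a€b" encodes to 1 + 3 + 1 bytes of UTF-8.
const euroSample = "a\u20acb";
console.log(copyWholeChars(euroSample, 3).written); // 1: the 3-byte "€" does not fit after "a"
console.log(copyWholeChars(euroSample, 4).written); // 4: "a" and "€"
console.log(copyWholeChars(euroSample, 5).written); // 5: the whole string
```

In the first call two of the three available bytes go unused, which is the same effect as jstrcpy() writing fewer bytes than maxBytes allows.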
jstrlen()
int jstrlen(jsString)
Given a JS string, this function returns its UTF-8 length in bytes. Returns null if its argument is not a string. This is a relatively expensive calculation and should be avoided when not necessary.
jstrToUintArray()
Uint8Array jstrToUintArray(jsString, addNul=false)
For the given JS string, returns a Uint8Array of its contents encoded as UTF-8. If addNul is true, the returned array will have a trailing 0 entry, else it will not.
Trivia: this was written before JS's TextEncoder was known to this code's author. The same functionality, sans the trailing NUL option, can be achieved with new TextEncoder().encode(str).
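For comparison, the TextEncoder approach mentioned in the trivia note can be extended with an optional trailing NUL in a few lines. This is a sketch of an equivalent result using only standard APIs, not the library's implementation; encodeUtf8() is a hypothetical name.

```javascript
// encodeUtf8() is a hypothetical helper built on the standard TextEncoder.
function encodeUtf8(jsString, addNul = false) {
  const bytes = new TextEncoder().encode(jsString);
  if (!addNul) return bytes;
  // A new Uint8Array is zero-filled, so the extra final byte is the NUL.
  const withNul = new Uint8Array(bytes.length + 1);
  withNul.set(bytes);
  return withNul;
}

// "é" is 1 UTF-16 code unit but 2 UTF-8 bytes, so byte length and
// String.length differ for non-ASCII text.
console.log("h\u00e9llo".length);             // 5
console.log(encodeUtf8("h\u00e9llo").length); // 6
```

The difference between the two lengths is why C-string allocation sizes have to be computed from the UTF-8 byte length rather than from the JS string's length property.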
Misc. Allocation Routines
allocFromByteArray()
pointer allocFromByteArray(srcTypedArray)
Allocates srcTypedArray.byteLength bytes using wasm.alloc(), populates them with the values from the source TypedArray, and returns the pointer to that memory. The returned pointer must eventually be passed to wasm.dealloc() to clean it up.
The argument may be a Uint8Array, Int8Array, or ArrayBuffer, and it throws if passed any other type.
As a special case, to avoid further special cases where this routine is used, if srcTypedArray.byteLength is 0, it allocates a single byte and sets it to the value 0. Even in such cases, callers must behave as if the allocated memory has exactly srcTypedArray.byteLength usable bytes.
Bridging JS/WASM Functions
This section documents the helper APIs related to bridging the gap between JavaScript and WebAssembly functions.
A WASM module exposes all exported functions to the user, but they are in "raw" form. That is, they perform no argument or result type conversion and only support data types supported by WASM (i.e. only numeric types). That's fine for functions which only accept and return numbers, but is generally less helpful for functions which take or return strings or have output pointers. For usability reasons, it's desirable to reduce the JS/C friction by automatically performing mundane tasks such as the allocation and deallocation of memory needed for converting strings between JS and WASM.
Additionally, it's often useful to add new functions to the WASM runtime from JS, which requires compiling binary WASM code on the fly. A common example of this is creating user-defined SQL functions. For the most part, the JS bindings of the sqlite3 API take care of such conversions for the user, but there are cases where client code will need to, or want to, perform such conversions itself.
WASM Function Table
WASM-exported functions, as well as JavaScript functions which have been bound to WASM at runtime, are exposed to clients via a WebAssembly.Table instance. The following APIs are available for working with that.
functionEntry()
mixed functionEntry(ptr)
Given a function pointer, returns the WASM function table entry if found, else returns a falsy value.
functionTable()
WebAssembly.Table functionTable()
Returns the WASM module's indirect function table.
Calling and Wrapping Functions
xCall()
any xCall(functionName, ...args)
any xCall(functionName, [args...])
Calls a WASM-exported function by name, passing on all supplied arguments (which may optionally be supplied as an array). It throws if the function is not exported or if the argument count does not match. This routine does no type conversion and is essentially equivalent to:
const rc = wasm.exports.some_func(...args)
with the exception that xCall() throws if the argument count does not match that of the WASM-exported function.
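The argument-count check can be approximated with a function's length property, which for exported WASM functions reports the number of declared parameters. The sketch below assumes that approach and is not the library's code: callChecked() is a hypothetical name and its exportsObj parameter stands in for wasm.exports.

```javascript
// callChecked() is a hypothetical sketch of an arity-checked call.
// exportsObj stands in for wasm.exports.
function callChecked(exportsObj, functionName, ...args) {
  const func = exportsObj[functionName];
  if (!(func instanceof Function)) {
    throw new Error("Cannot find exported function: " + functionName);
  }
  if (func.length !== args.length) {
    throw new Error(functionName + "() requires " + func.length + " argument(s).");
  }
  return func(...args); // no type conversion is applied
}
```

Failing early on an argument-count mismatch is useful here because a raw exported function would otherwise silently treat missing arguments as 0.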
xCallWrapped()
any xCallWrapped(functionName, resultType, argTypes, ...args)
any xCallWrapped(functionName, resultType, argTypes, [args array...])
Functions like xCall() but performs argument and result type conversions as for xWrap().
The first argument is the name of the exported function to call. The 2nd is the name of its result type, as documented for xWrap(). The 3rd is an array of argument type names, as documented for xWrap(). The 4th+ arguments are arguments for the call, with the special case that if the 4th argument is an array, it is used as the arguments for the call.
Returns the converted result of the call.
This is just a thin wrapper around xWrap(). If the given function is to be called more than once, it's more efficient to use xWrap() to create a wrapper, then to call that wrapper as many times as needed. For one-shot calls, however, this variant is arguably more efficient because it will hypothetically free the wrapper function quickly.
xGet()
Function xGet(functionName)
Returns a WASM-exported function by name, or throws if the function is not found.
xWrap()
Function xWrap(functionName, resultType=undefined, ...argTypes)
Function xWrap(functionName, resultType=undefined, [argTypes...])
xWrap() creates a JS function which calls a WASM-exported function, as described for xCall().
Creates a wrapper for the WASM-exported function functionName. It uses xGet() to fetch the exported function (which throws on error) and returns either that function or a wrapper for that function which converts the JS-side argument types into WASM-side types and converts the result type. If the function takes no arguments and resultType is null then the function is returned as-is, else a wrapper is created for it to adapt its arguments and result value, as described below.
This function's arguments are:
- functionName: the exported function's name. xGet() is used to fetch this, so will throw if no exported function is found with that name.
- resultType: the name of the result type. A literal null means to return the original function's value as-is (mnemonic: there is "null" conversion going on). Literal undefined or the string "void" mean to ignore the function's result and return undefined. Aside from those two special cases, it may be one of the values described below or any mapping installed by the client using xWrap.resultAdapter().
If passed 3 arguments and the final one is an array, that array must contain a list of type names (see below) for adapting the arguments from JS to WASM. If passed 2 arguments, more than 3, or the 3rd is not an array, all arguments after the 2nd (if any) are treated as type names. In other words, the following usages are equivalent:
xWrap('funcname', 'i32', 'string', 'f64');
xWrap('funcname', 'i32', ['string', 'f64']);
As are:
xWrap('funcname', 'i32'); // no arguments
xWrap('funcname', 'i32', []);
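The equivalence of those call forms comes down to one normalization rule, which can be sketched as a small stand-alone function. This models only the rule stated above and is not the library's code; toyArgTypes() is a hypothetical name.

```javascript
// toyArgTypes() models how the two xWrap() call forms collapse into one
// list of argument type names. Illustration only.
function toyArgTypes(functionName, resultType, ...argTypes) {
  // Exactly 3 arguments with an array as the last one: the array is the list.
  if (1 === argTypes.length && Array.isArray(argTypes[0])) {
    return argTypes[0];
  }
  // Otherwise every argument after the 2nd is itself a type name.
  return argTypes;
}

console.log(toyArgTypes('funcname', 'i32', 'string', 'f64'));   // [ 'string', 'f64' ]
console.log(toyArgTypes('funcname', 'i32', ['string', 'f64'])); // [ 'string', 'f64' ]
console.log(toyArgTypes('funcname', 'i32'));                    // []
```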
Type names are symbolic names which map the function's result and arguments to an adapter function to convert, if needed, the value before passing it on to WASM or to convert a return result from WASM. The following lists describe each of the built-in names, noting that some apply only to arguments or return results, the two often having different semantics:
i8
,i16
,i32
(args and results): all integer conversions which convert their argument to an integer and truncate it to the given bit length.N*
(args): a type name in the formN*
, where N is a numeric type name, is treated the same as WASM pointer.*
andpointer
(args): are assumed to be opaque WASM pointers and are treated like the current WASM pointer numeric type. Non-numbers will coerce to a value of 0 and out-of-range numbers will have undefined results (as with any pointer misuse).*
andpointer
(results): are aliases for the current WASM pointer numeric type.**
(args): is simply a descriptive alias for'*'
. It's primarily intended to mark output-pointer arguments.i64
(args and results): passes the value toBigInt()
to convert it to an int64. Only available if BigInt support is enabled.f32
(float
),f64
(double
) (args and results): pass their argument toNumber()
. i.e. the adapter does not currently distinguish between the two types of floating-point numbers.number
(results): converts the result to a JS Number usingNumber(theValue).valueOf()
. Note that this is for result conversions only, as it's not possible to generically know which type of number to convert arguments to.
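As a rough illustration of what such numeric conversions can look like, the following sketch maps type names to plain JS functions. It assumes that "truncate to the given bit length" means masking off the higher bits, which is only one plausible reading; the table name numericAdapterSketch and its contents are illustrative and are not the library's own adapters.

```javascript
// Illustrative stand-ins keyed by type name. These are assumptions made
// for this sketch, not the adapter functions shipped with sqlite3.wasm.
const numericAdapterSketch = {
  i8:  (v) => (v | 0) & 0xff,    // to int32, then keep the low 8 bits
  i16: (v) => (v | 0) & 0xffff,  // to int32, then keep the low 16 bits
  i32: (v) => v | 0,             // to int32 (drops any fraction)
  i64: (v) => BigInt(v),         // requires BigInt support
  f32: (v) => Number(v),         // f32 and f64 are not distinguished
  f64: (v) => Number(v),
};

const low8 = numericAdapterSketch.i8(257);       // 257 masked to 8 bits
const low16 = numericAdapterSketch.i16(65537);   // 65537 masked to 16 bits
const whole = numericAdapterSketch.i32(3.7);     // fraction dropped
const big = numericAdapterSketch.i64(5);         // BigInt value
const dbl = numericAdapterSketch.f64('1.5');     // string coerced to a Number
```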
Non-numeric conversions include:
- string or utf8 (args): has two different semantics in order to accommodate various uses of certain C APIs...
  - If the arg is a JS string, a temporary C-string, UTF-8 encoded, is created to pass to the exported function, which gets cleaned up before the wrapper returns. If a long-lived C-string pointer is required, client-side code is required to create the string, then pass its pointer to the function.
  - Else the arg is assumed to be a pointer to a string the client has already allocated and it's passed on as a WASM pointer.
- string or utf8 (results): treats the result value as a const C-string, encoded as UTF-8, copies it to a JS string, and returns that JS string.
- string:dealloc or utf8:dealloc (results): treats the result value as a non-const C-string, encoded as UTF-8, ownership of which has just been transferred to the caller. It copies the C-string to a JS string, frees the C-string using dealloc(), and returns the JS string. If such a result value is NULL, the JS result is null.
  Achtung: when using an API which returns results from a specific allocator, this conversion is not legal. Instead, an equivalent conversion which uses the appropriate deallocator is required. An example of such is provided in the next section.
- string:flexible (args): are an expanded version of string described in the C-style API docs. These are widely used for SQL string inputs in the library.
- string:static (args): if passed a pointer, returns it as is. Anything else gets coerced to a JS string for use as a map key. If a matching entry is found (as described next), it is returned, else wasm.allocCString() is used to create a new string, map its pointer to (''+v) for the remainder of the application's life, and return that pointer value for this call and all future calls which are passed a string-equivalent argument. This conversion is intended for cases which require static/long-lived string arguments, e.g. sqlite3_bind_pointer() and sqlite3_result_pointer().
- json (results): treats the result as a const C-string and returns the result of passing the converted-to-JS string to JSON.parse(). Returns null if the C-string is a NULL pointer. Propagates any exception from JSON.parse().
- json:dealloc (results): works exactly like string:dealloc but returns the same thing as the json adapter. Note the warning in string:dealloc about the allocator and deallocator.
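The caching behaviour described for string:static can be sketched as follows. This is a minimal illustration under stated assumptions: makeStaticStringAdapter, fakeAllocCString and fakeIsPtr are hypothetical names, and the fake allocator merely hands out increasing numbers in place of real wasm.allocCString() pointers.

```javascript
// Hypothetical factory. allocCString stands in for wasm.allocCString() and
// isPtr for wasm.isPtr(); both are injected so the sketch runs on its own.
function makeStaticStringAdapter(allocCString, isPtr) {
  const cache = new Map(); // JS string -> C-string pointer, never freed
  return function (v) {
    if (isPtr(v)) return v;      // pointers pass through untouched
    const key = '' + v;          // anything else is coerced to a string key
    let ptr = cache.get(key);
    if (ptr === undefined) {
      ptr = allocCString(key);   // allocated once per distinct string
      cache.set(key, ptr);
    }
    return ptr;
  };
}

// Fake allocator handing out increasing "addresses", for demonstration only.
let nextAddress = 1024;
const allocations = [];
const fakeAllocCString = (s) => {
  const ptr = nextAddress;
  nextAddress += s.length + 1;
  allocations.push(s);
  return ptr;
};
const fakeIsPtr = (v) => Number.isInteger(v) && v >= 0;

const staticString = makeStaticStringAdapter(fakeAllocCString, fakeIsPtr);
const first = staticString('my-type-name');
const second = staticString('my-type-name'); // served from the cache
const passthrough = staticString(4096);      // already a pointer
```

Because cached strings are never freed, a scheme like this only makes sense for a small, fixed set of strings.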
The type names for results and arguments are validated when
xWrap()
is called and any unknown names will trigger an
exception.
Clients may map their own result and argument adapters using
xWrap.resultAdapter()
and xWrap.argAdapter()
, noting that not all
type conversions are valid for both arguments and result types
as they often have different memory ownership requirements. That topic
is covered in the next section...
Argument and Result Value Type Conversions
See also: api-c-style.md#type-conversion
When xWrap()
is called and evaluates function call signatures, it
looks up the argument and result type adapters for a match. It is
possible to install custom adapters for arguments and result values
using the methods listed below.
xWrap()
has two methods with identical signatures:
xWrap.argAdapter(string, function)
xWrap.resultAdapter(string, function)
Each one expects a type name string, such as the ones described for
xWrap()
, and a function which is passed a single value and must
return that value, a conversion of that value, or throw an exception.
Each of those functions returns itself so that calls may be chained.
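The shape of such a registry can be sketched in a few lines of standalone code. The names makeAdapterRegistry and resultAdapterSketch are hypothetical. The one-argument lookup form is an assumption modelled on how an existing adapter is fetched in the MyType example later in this section.

```javascript
// Hypothetical stand-in for xWrap.argAdapter() / xWrap.resultAdapter().
function makeAdapterRegistry() {
  const adapters = new Map();
  const registry = function (typeName, adapter) {
    // Assumed lookup form: one argument returns the installed adapter.
    if (arguments.length === 1) return adapters.get(typeName);
    if (typeof adapter !== 'function') {
      throw new TypeError('Adapter must be a function.');
    }
    adapters.set(typeName, adapter);
    return registry; // returning itself is what permits chaining
  };
  return registry;
}

const resultAdapterSketch = makeAdapterRegistry();
// Chained installation of two made-up adapters.
resultAdapterSketch('upper', (v) => String(v).toUpperCase())('negate', (v) => -v);

const upper = resultAdapterSketch('upper');
const negate = resultAdapterSketch('negate');
```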
For example's sake, let's assume we have a C-bound function which
returns a C-style string allocated using a non-default allocator,
my_str_alloc()
. The returned memory is owned by the caller and must
be freed, but needs to be freed using the allocator's deallocation
counterpart, my_str_free()
. We can create such a result value
adapter with:
wasm.xWrap.resultAdapter('my_str_alloc*', (v)=>{
  try { return v ? wasm.cstrToJs(v) : null }
  finally{ wasm.exports.my_str_free(v) }
});
With that in place, we can make calls like:
const f = wasm.xWrap('my_function', 'my_str_alloc*', ['i32', 'string']);
const str = f(17, "hello, world");
// ^^^ the memory allocated for the result using my_str_alloc()
// is freed using my_str_free() before f() returns.
Similarly, let's assume that we have a custom JS class which has
a member property named pointer
which refers to C-side memory
of a struct which this JS class represents2. We can then make it
legal to pass such objects on to the C APIs with something like:
const argPointer = wasm.xWrap.argAdapter('*'); // default pointer-type adapter
wasm.xWrap.argAdapter('MyType',(v)=>{
  if(v instanceof MyType) v = v.pointer;
  if(wasm.isPtr(v)) return argPointer(v);
  throw new Error("Invalid value for MyType argument.");
});
With that in place we can wrap one of our functions like:
const f = wasm.xWrap('MyType_method', undefined, ['MyType', 'i32']);
const my = new MyType(...);
// ^^^ assume this allocates WASM memory referenced via my.pointer.
f( my /* will use my.pointer */, 17 );
Similar conversions can be done for result values, though how to do so for result values depends entirely on client-side semantics of memory management.
(Un)Installing WASM Functions
When using C APIs which take callback function pointers, one cannot simply pass JS functions to them. Instead, the JS function has to be proxied into the WASM environment and that proxy has to be passed to C. That is done by compiling, on the fly, a small amount of binary WASM code which describes the function's signature in WASM terms, forwards its arguments to the provided JS function, and returns the result of that JS function. The details are ugly, but usage is simple...
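To give a feel for the on-the-fly compilation, here is a minimal standalone sketch of the general technique: a tiny module which imports a JS function and re-exports it with a fixed WASM signature. The helper jsToWasmSketch is hypothetical and is not the library's implementation. It accepts only Emscripten-style signatures and assumes 32-bit pointers.

```javascript
// Hypothetical helper, not the library's jsFuncToWasm(). Accepts an
// Emscripten-style signature such as "iii" (result first, then arguments).
function jsToWasmSketch(func, signature) {
  const codes = { i: 0x7f, p: 0x7f, j: 0x7e, f: 0x7d, d: 0x7c };
  const toCode = (letter) => {
    const code = codes[letter];
    if (code === undefined) throw new Error('Invalid signature letter: ' + letter);
    return code;
  };
  const result = signature[0];
  const params = [...signature.slice(1)].map(toCode);
  // One function type: 0x60, then the parameter and result vectors.
  const type = [0x01, 0x60, params.length, ...params,
                ...(result === 'v' ? [0x00] : [0x01, toCode(result)])];
  const bytes = new Uint8Array([
    0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // "\0asm", version 1
    0x01, type.length, ...type,                           // type section
    0x02, 0x07, 0x01, 0x01, 0x65, 0x01, 0x66, 0x00, 0x00, // import "e"."f"
    0x07, 0x05, 0x01, 0x01, 0x66, 0x00, 0x00,             // export "f" = func 0
  ]);
  const instance = new WebAssembly.Instance(
    new WebAssembly.Module(bytes), { e: { f: func } });
  return instance.exports.f; // a WASM function which forwards to func
}

const add = jsToWasmSketch((a, b) => a + b, 'iii');
const sum = add(2, 3);
const noop = jsToWasmSketch(() => {}, 'v');
```

The returned function behaves like the original JS function, but its arguments and result pass through WASM's type coercion for the declared signature.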
installFunction()
pointer installFunction(funcSignature, function)
pointer installFunction(function, funcSignature)
Expects a JS function and signature, exactly as for
wasm.jsFuncToWasm()
. It uses that function to create a WASM-exported
function, installs that function to the next available slot of
wasm.functionTable()
, and returns the function's index in that table
(which acts as a pointer to that function). The returned pointer can
be passed to wasm.uninstallFunction()
to uninstall it and free up
the table slot for reuse.
As a special case, if the passed-in function is a WASM-exported function then the signature argument is ignored and func is installed as-is, without requiring re-compilation/re-wrapping.
This function will propagate an exception if
WebAssembly.Table.grow()
throws or wasm.jsFuncToWasm()
throws.
The former case can happen in an Emscripten-compiled environment
when building without Emscripten's -sALLOW_TABLE_GROWTH
flag.
jsFuncToWasm()
function jsFuncToWasm(function, signature)
function jsFuncToWasm(signature, function)
Creates a WASM function which wraps the given JS function and returns
the JS binding of that WASM function. The function signature string
must be in the form used by jaccwabyt or Emscripten's
addFunction()
. In short: it may have one of the following formats:
- Emscripten: "x...", where the first x is a letter representing the result type and subsequent letters represent the argument types. See below. Functions with no arguments have only a single letter.
- Jaccwabyt: "x(...)", where x is the letter representing the result type and letters in the parens (if any) represent the argument types. Functions with no arguments use x(). See below.
Supported letters:
- i = int32
- p = int32 ("pointer")
- j = int64
- f = float32
- d = float64
- v = void, only legal for use as the result type
It throws if an invalid signature letter is used.
Jaccwabyt-format signatures3 support some additional letters which have no special meaning here but (in this context) act as aliases for other letters:
- s, P: same as p
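The two signature formats can be reduced to one normalized form with a small parser. The following is a sketch only: parseSignature is a hypothetical name, and it assumes that the s and P aliases are accepted only in the Jaccwabyt format, as the text above suggests.

```javascript
// Hypothetical parser, not part of the sqlite3 API. Returns the result
// letter and the list of argument letters for either signature format.
function parseSignature(signature) {
  const m = /^(.)\((.*)\)$/.exec(signature); // Jaccwabyt form: "x(...)"
  const letters = m ? [m[1], ...m[2]] : [...signature];
  if (letters.length === 0) throw new Error('Empty signature.');
  const normalized = letters.map((c, i) => {
    if (m && (c === 's' || c === 'P')) c = 'p'; // Jaccwabyt-only aliases
    // "v" is legal only as the result type, i.e. in the first position.
    if (!'ipjfdv'.includes(c) || (c === 'v' && i > 0)) {
      throw new Error('Invalid signature letter: ' + c);
    }
    return c;
  });
  return { result: normalized[0], args: normalized.slice(1) };
}

const emscriptenStyle = parseSignature('iip');
const jaccwabytStyle = parseSignature('i(is)');
const voidNoArgs = parseSignature('v()');
```

Here emscriptenStyle and jaccwabytStyle normalize to the same shape: result i with arguments i and p.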
scopedInstallFunction()
pointer scopedInstallFunction(funcSignature, function)
pointer scopedInstallFunction(function, funcSignature)
This works exactly like installFunction()
except that the
installation is scoped to the current allocation
scope and is uninstalled when the current
allocation scope is popped. It will throw if no allocation scope is
active.
uninstallFunction()
Function uninstallFunction(pointer)
Requires a pointer value previously returned from
wasm.installFunction()
. Removes that function from the WASM function
table, marks its table slot as free for re-use, and returns that
function. It is illegal to call this before installFunction()
has
been called and results are undefined if the argument was not returned
by that function. The returned function may be passed back to
installFunction()
to reinstall it.
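The install/uninstall cycle and the slot reuse it enables can be sketched against a plain WebAssembly.Table. The helpers installSketch and uninstallSketch are hypothetical stand-ins, the free-slot list is an assumption about how reuse might be tracked, and the byte array is a tiny hand-assembled module whose only export, f, returns 42.

```javascript
// Hypothetical stand-ins for installFunction()/uninstallFunction().
// Slot 0 is left empty here so that 0 can act like a NULL function pointer.
const table = new WebAssembly.Table({ element: 'anyfunc', initial: 1 });
const freeSlots = [];

function installSketch(wasmFunc) {
  // Reuse a freed slot if one exists, else grow the table by one entry.
  // Table.grow() returns the previous length, i.e. the new slot's index.
  const index = freeSlots.length ? freeSlots.pop() : table.grow(1);
  table.set(index, wasmFunc);
  return index; // acts as the function's "pointer"
}

function uninstallSketch(index) {
  const func = table.get(index);
  table.set(index, null); // clear the slot...
  freeSlots.push(index);  // ...and mark it as reusable
  return func;
}

// Minimal module: one function of type () -> i32 which returns 42.
const moduleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm", version 1
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f,       // type: () -> i32
  0x03, 0x02, 0x01, 0x00,                         // one function, type 0
  0x07, 0x05, 0x01, 0x01, 0x66, 0x00, 0x00,       // export "f" = func 0
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x41, 0x2a, 0x0b, // body: i32.const 42, end
]);
const f42 = new WebAssembly.Instance(new WebAssembly.Module(moduleBytes)).exports.f;

const fp = installSketch(f42);
const removed = uninstallSketch(fp);
const fpAgain = installSketch(removed); // reinstall: the freed slot is reused
```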
Generic Utility Functions
isPtr()
boolean isPtr(value)
Returns true if its argument is a WASM pointer type. That is, it's a 32-bit integer greater than or equal to zero.
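A check along those lines might look like the following sketch. The name isPtrSketch is hypothetical, and the sketch assumes that "32-bit integer" means a value which survives a round trip through a signed 32-bit conversion. The library's own test may differ in detail.

```javascript
// Hypothetical equivalent of the described check, for illustration only.
const isPtrSketch = (v) =>
  typeof v === 'number' && v === (v | 0) && v >= 0;

const checks = [0, 1024, -1, 1.5, '16', null].map(isPtrSketch);
// checks lines up with the inputs: only 0 and 1024 qualify.
```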
WASM-specific Peculiarities wrt. Mixing JS and C Code
See also: Gotchas
The transition from WASM to C is a relatively transparent one. With a small bit of glue code, the transition from C to JS is also relatively transparent for the most part. This chapter covers aspects which are not quite transparent.
Using Output-Pointer Arguments from JS
Output-pointer arguments are commonplace in C. By contrast, they do not exist at all in JavaScript. In the sqlite3 API, one example of this is:
int sqlite3_open_v2(const char *zDbFile, sqlite3** pDb, int flags, const char *zVfs);
The two pointer qualifiers on the 2nd parameter denote that it is a so-called output parameter: the function can report a value to the caller by assigning that pointer a new value.
Using output pointers in JavaScript requires several steps:
1. Allocate WASM memory to hold a pointer value.
2. Use wasm.pokePtr(), or equivalent, to assign it an initial value (typically 0).
3. Call a WASM function and pass the pointer to it.
4. Fetch the output pointer's new value using wasm.peekPtr() or equivalent. This is semantically equivalent to dereferencing the pointer in C.
5. Free the pointer allocated in step (1).
In its barest form, that looks something like:
const wasm = sqlite3.wasm;
const ppOut = wasm.alloc(wasm.ptrSizeof); // allocate space for a pointer
wasm.pokePtr(ppOut, 0); // zero out the memory
const rc = some_c_function( ..., ppOut ); // pass ppOut to a C function
const pOut = wasm.peekPtr(ppOut); // fetch the pointed-to value
wasm.dealloc(ppOut); // free space for the pointed-to value
// pOut now holds the output result value.
if(0===rc) { ... success ... }
else { ... error ... }
Obviously, that's a significant amount of code and has weaknesses such
as leaking the ppOut
memory if some_c_function()
throws a JS
exception. That can be cleaned up significantly with some help from
sqlite3.wasm
and a try
/finally
block:
const scope = wasm.scopedAllocPush();
try {
  const ppOut = wasm.scopedAllocPtr(); // alloc and zero pointer
  const rc = some_c_function( ..., ppOut );
  const pOut = wasm.peekPtr(ppOut);
  if(0===rc) { ... success ... }
  else { ... error ... }
}finally{
  // free all "scoped allocs" made in the context of `scope`,
  // in our case ppOut.
  wasm.scopedAllocPop(scope);
}
Or without the "scoped allocation" mechanism:
let pOut, ppOut;
try {
  ppOut = wasm.allocPtr(); // alloc and zero pointer
  const rc = some_c_function( ..., ppOut );
  pOut = wasm.peekPtr(ppOut);
  if(0===rc) { ... success ... }
  else { ... error ... }
}finally{
  wasm.dealloc(ppOut);
}
Or with the "pstack" allocator, which was added as a more efficient (faster) option for exactly this type of case:
const stack = wasm.pstack.pointer;
try {
  const ppOut = wasm.pstack.allocPtr(); // "alloc" and zero memory
  const rc = some_c_function( ..., ppOut );
  const pOut = wasm.peekPtr(ppOut);
  if(0===rc) { ... success ... }
  else { ... error ... }
}finally{
  wasm.pstack.restore(stack);
}
Note that pstack
has a small, static memory buffer, so it cannot be
used for general-purpose allocations. Despite it being static memory,
it is accessed as if it were part of the WASM heap, and can thus
be accessed via the same memory accessor routines as heap memory
is.
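The save-and-restore pattern behind a stack-style allocator can be sketched with a plain byte array standing in for the heap. Everything here is an assumption made for illustration: makeStackAllocator is a hypothetical name, and the downward growth direction, the 8-byte rounding and the region size are choices of the sketch rather than documented properties of pstack.

```javascript
// Hypothetical stand-in for a pstack-style allocator: a fixed-size region
// of the heap which is carved up by moving a single "stack pointer".
function makeStackAllocator(heap, base, size) {
  let top = base + size; // grows downward toward base (a sketch choice)
  return {
    get pointer() { return top; },
    alloc(n) {
      n = (n + 7) & ~7; // round up to a multiple of 8 bytes
      if (top - n < base) throw new RangeError('Stack allocator is out of space.');
      top -= n;
      heap.fill(0, top, top + n); // hand out zeroed memory
      return top;
    },
    restore(p) { top = p; }, // "frees" everything allocated since p was read
  };
}

const heapBytes = new Uint8Array(1024);
const stackSketch = makeStackAllocator(heapBytes, 512, 64);
const mark = stackSketch.pointer;
let p1, p2;
try {
  p1 = stackSketch.alloc(4);
  p2 = stackSketch.alloc(8);
  heapBytes[p1] = 1; // the region is addressed like any other heap memory
} finally {
  stackSketch.restore(mark); // one call releases both allocations
}
```

The small, fixed capacity is the trade-off: a request which does not fit fails at once instead of falling back to the general-purpose allocator.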
Use of a try
/finally
block is a common idiom in the sqlite3 JS
code, used extensively for managing memory and object lifetimes. The
finally
block will be executed regardless of how the try
block is
exited: via return
, continue
, break
, throw
, or running to
completion. When it does, any "scoped" allocations made in the try
block will be freed. In this example we have only one such allocation,
but multiples are not uncommon. The scoped allocation API simplifies
freeing of memory in many common use cases over using the lower-level
alloc()
and dealloc()
routines (which are the counterparts of the
C-level sqlite3_malloc()
and sqlite3_free()
). The peekPtr()
and
pokePtr()
helpers are thin wrappers around peek()
and poke()
which eliminate the need to remember to pass some value other than the
default for the final argument of the latter functions (the default
being the wrong value for cases like the one demonstrated
above).
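Why the pointer-specific wrappers matter can be shown with a standalone simulation over a DataView. The functions below are local sketches which borrow the names used in the text. The 'i8' default type and the 32-bit pointer size are assumptions of this sketch, and fakeOpen is a made-up stand-in for a C function with an output-pointer argument.

```javascript
// Stand-in heap and accessors; these are local sketches, not the real API.
const heapView = new DataView(new ArrayBuffer(256));
const ptrType = 'i32'; // assumption: 32-bit WASM pointers

function poke(address, value, type = 'i8') { // assumption: default is 'i8'
  switch (type) {
    case 'i8':  heapView.setInt8(address, value); break;
    case 'i32': heapView.setInt32(address, value, true); break; // little-endian
    default: throw new Error('Unhandled type: ' + type);
  }
}
function peek(address, type = 'i8') {
  switch (type) {
    case 'i8':  return heapView.getInt8(address);
    case 'i32': return heapView.getInt32(address, true);
    default: throw new Error('Unhandled type: ' + type);
  }
}
// Thin wrappers which always supply the pointer type.
const pokePtr = (address, value = 0) => poke(address, value, ptrType);
const peekPtr = (address) => peek(address, ptrType);

// Made-up "C function" which reports a result via its output pointer.
function fakeOpen(ppOut) { pokePtr(ppOut, 0x1234); return 0; }

const ppOut = 16;                    // pretend this came from an allocator
pokePtr(ppOut, 0);                   // zero the slot
const rc = fakeOpen(ppOut);
const pOut = peekPtr(ppOut);         // reads the full pointer-sized value
const lowByteOnly = peek(ppOut);     // default type reads a single byte
```

With the default type, peek() sees only the lowest byte of the stored pointer (0x34 here), which is the kind of mistake the wrappers are meant to prevent.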
- ^ Not to be confused with the sqlite3.wasm file, which the sqlite3.wasm namespace effectively wraps.
- ^ Like those described in ./c-structs.md
- ^ This code is developed together with jaccwabyt, thus the support for its signature format.