See also: WASM-JS peculiarities
Gotchas:
In programming, a gotcha is a valid construct in a system, program or programming language that works as documented but is counter-intuitive and almost invites mistakes because it is both easy to invoke and unexpected or unreasonable in its outcome.
Source: Wikipedia
Though every effort is made to shield users from unfortunate gotchas, some cannot be avoided...
Relative URI Resolution when Loading sqlite3.js from a Worker
Two problems compound to create a significant gotcha for loading sqlite3 via a Worker:
- Resolution of relative URIs differs depending on how the script is loaded.
- It is impossible, from JS, to determine the currently-loading
script's URI when it is loaded via importScripts().
This becomes a problem when sqlite3.js resides in a directory other
than the one in which the client application lives and the client
wants to load it using a Worker. A perfectly sensible usage pattern is
something like:
- Client code loads new Worker('foo.js')
- foo.js uses importScripts('path/to/sqlite3.js')
However, that will fail: because of how relative URIs are resolved,
sqlite3.js will not be able to find sqlite3.wasm (and related files).
The workaround is to tell the library where it lives by passing a URI
argument when instantiating the Worker:
new Worker('foo.js?sqlite3.dir=path/to');
Then, from foo.js:
importScripts('path/to/sqlite3.js');
When sqlite3.js is loaded via importScripts(), the only URL the JS
environment exposes to it is the one from which the containing Worker
is loaded, which leads to sqlite3.js being unable to resolve
sqlite3.wasm.
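The mismatch can be reproduced outside of a Worker with the standard URL constructor, which resolves a relative reference against a base URL. The following is a minimal sketch: the example.com URLs are hypothetical and stand in for wherever the application is actually served from.

```javascript
// Hypothetical URLs, for illustration only.
const workerUrl = 'https://example.com/app/foo.js';

// Where sqlite3.js really lives, relative to the Worker script.
const libraryUrl = new URL('path/to/sqlite3.js', workerUrl).href;

// What happens when the only visible base URL is the Worker's:
const wasmSeenFromWorker = new URL('sqlite3.wasm', workerUrl).href;

// What the library would need, had it known its own URL:
const wasmNextToLibrary = new URL('sqlite3.wasm', libraryUrl).href;

console.log(wasmSeenFromWorker); // https://example.com/app/sqlite3.wasm
console.log(wasmNextToLibrary);  // https://example.com/app/path/to/sqlite3.wasm
```

The first result points at the application's directory, where no sqlite3.wasm exists in this layout, which is the failure described above.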
As a workaround, sqlite3.js inspects the URL arguments for the one
URL it can see (the one passed to the Worker's constructor). If it
finds sqlite3.dir, it will attempt to load other sqlite3-related
files from that directory. If it does not find that, it will do its
best to figure out the correct path, falling back to the current
directory (which only works if it is in the same directory as the
client application).
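The lookup described above can be approximated as follows. This is only a sketch of the idea, not the library's actual code: the function name resolveSqlite3Dir and the example.com URLs are hypothetical, and the real library may apply further heuristics before falling back.

```javascript
// Hypothetical helper: given the URL of the Worker script (the one URL
// visible to a script loaded via importScripts()), return the directory
// from which sqlite3-related files would be loaded.
function resolveSqlite3Dir(workerHref) {
  const url = new URL(workerHref);
  const dir = url.searchParams.get('sqlite3.dir');
  if (dir !== null) {
    // Ensure a trailing slash so the result is treated as a directory.
    const asDir = dir.endsWith('/') ? dir : dir + '/';
    return new URL(asDir, url).href;
  }
  // Fallback: the directory containing the Worker script itself.
  return new URL('./', url).href;
}

const withArg = resolveSqlite3Dir('https://example.com/app/foo.js?sqlite3.dir=path/to');
const withoutArg = resolveSqlite3Dir('https://example.com/app/foo.js');

console.log(new URL('sqlite3.wasm', withArg).href);
// https://example.com/app/path/to/sqlite3.wasm
console.log(new URL('sqlite3.wasm', withoutArg).href);
// https://example.com/app/sqlite3.wasm
```

The second case only finds the file when sqlite3.wasm sits next to the Worker script, which matches the fallback's limitation noted above.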
Unfortunately, the sqlite3.dir path must be duplicated in the URI to
foo.js and in the importScripts() call, and eliminating that
duplication in the latter requires a good deal more code than the
duplication itself. For example:
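// Default: sqlite3.js lives next to this Worker script.
let sqlite3Js = 'sqlite3.js';
// self.location is the URL passed to the Worker constructor.
const urlParams = new URL(self.location.href).searchParams;
if(urlParams.has('sqlite3.dir')){
  sqlite3Js = urlParams.get('sqlite3.dir') + '/' + sqlite3Js;
}
importScripts(sqlite3Js);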
Also unfortunately, URL arguments passed along with importScripts()
arguments are simply ignored, as the URI provided to importScripts()
is not available to the script being loaded that way. Its current URL
will resolve to the Worker script which loads it (in the above
example, foo.js), so the sqlite3.dir URL argument needs to be
applied when loading that script.
Note that the above is not an issue when loading sqlite3.js, or
one of its supplemental JS files, via a <script> tag, as an ES6
module, or directly via the Worker constructor:
const w = new Worker('path/to/sqlite3-worker1.js');
That will do the right thing because it has enough state to figure out
which directory it needs to load sqlite3.js and sqlite3.wasm from.
For more details about how relative URIs are resolved in different contexts, see:
https://zzz.buzz/2017/03/14/relative-uris-in-web-development/
WASM Heap Corruption is Easy!
WASM's view of memory is a simple flat byte array. With only a small handful of exceptions, that view is completely devoid of data type safety: unlike C compilers, which offer compile-time warnings when attempting to apply data type X to memory which has been declared as type Y, WASM keeps no type information about what its memory holds.
What does this mean? It means that it's absolutely trivial to corrupt the WASM heap without intending to do so:
sqlite3.wasm.poke( 42, 0x1234, 'i32' );
We've just overwritten 4 bytes of the WASM heap, at address 42, with the value 0x1234. That might or might not result in misbehavior at some indeterminate point later on. Such corruption, just like heap corruption in C, will have entirely unpredictable effects with entirely unpredictable timing. Unlike C, however, we do not have heap-analysis tools like the indispensable valgrind to help us in WASM.
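The effect can be illustrated without the sqlite3 API at all. The sketch below uses a standalone WebAssembly.Memory with a DataView as a stand-in for the WASM heap (an assumption made for illustration; it says nothing about how sqlite3.wasm.poke() is implemented). A double is stored at address 40, then a 4-byte integer write at address 42 lands in the middle of it, and nothing complains.

```javascript
// A standalone one-page (64KiB) memory standing in for the WASM heap.
const memory = new WebAssembly.Memory({ initial: 1 });
const heap = new DataView(memory.buffer);

// Some "owner" stores an 8-byte double at addresses 40..47.
heap.setFloat64(40, 1.5, true);
const before = heap.getFloat64(40, true);

// An unrelated 4-byte integer write at address 42 overlaps bytes 42..45.
heap.setInt32(42, 0x1234, true);
const after = heap.getFloat64(40, true);

// No error is raised; the double has silently changed.
console.log(before, after, before === after);
```

The write succeeds and the owner of the double only finds out, if ever, when it next reads the value.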
To be clear: heap corruption in WASM is limited to the memory inside the WASM environment's sandbox. It is impossible, barring serious bugs in the host WASM engine, to corrupt memory outside of the WASM environment from within the WASM environment.
In short, if a JS application starts throwing completely inexplicable errors, such as throwing an exception here:
myDb.exec("SELECT 1");
claiming that there's an SQL syntax error, the culprit is undoubtedly memory corruption. (Yes, that particular symptom of memory corruption has in fact happened before. Another symptom seen more than once is an exception from WASM claiming that a called function has an invalid signature.)
Unfortunately, there is no good formula for tracking down such corruption, and it might not even show up until a month later. The best one can do, in terms of finding the cause, is to backrev to a version of the app which does not exhibit the problem, then "bisect" (in the SCM sense of the term), or step one version at a time, until a version is found which exhibits the problem, and then look for differences which might account for it. Anything which writes to the WASM heap is a potential suspect.
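While narrowing down suspects, one option is to temporarily route an application's own heap writes through a wrapper which rejects writes outside of regions the application knows it has allocated. The following is a hypothetical debugging aid, not part of the sqlite3 API: the GuardedHeap class and its methods are invented for this sketch, and it operates on a plain WebAssembly.Memory.

```javascript
// Hypothetical debugging aid: refuses 32-bit writes which fall outside
// every region registered via track().
class GuardedHeap {
  constructor(memory) {
    this.memory = memory;
    this.regions = [];
  }
  // Record a region (start address and size in bytes) as writable.
  track(addr, size) {
    this.regions.push({ addr, size });
  }
  pokeI32(addr, value) {
    const ok = this.regions.some(
      (r) => addr >= r.addr && addr + 4 <= r.addr + r.size
    );
    if (!ok) {
      throw new RangeError(`4-byte write at ${addr} is outside every tracked region`);
    }
    // Create the view per write: memory.buffer is replaced if the memory grows.
    new DataView(this.memory.buffer).setInt32(addr, value, true);
  }
}

const guarded = new GuardedHeap(new WebAssembly.Memory({ initial: 1 }));
guarded.track(1024, 16);   // pretend 16 bytes were allocated at 1024
guarded.pokeI32(1024, 7);  // inside the tracked region: allowed
let rejected = false;
try {
  guarded.pokeI32(42, 0x1234); // the stray write from the earlier example
} catch (e) {
  rejected = e instanceof RangeError;
}
console.log(rejected); // true
```

Such a wrapper only covers writes which go through it, so it can rule suspects out but cannot prove that the heap is intact.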
Good luck!