SQLite User Forum

Emscripten 4.0.0 incompatibility with the 3.48 release

(1.3) By Stephan Beal (stephan) on 2025-01-15 18:38:49 edited from 1.2 [source]

This message is relevant only for the WASM users:

  1. An unforeseen misinteraction between Emscripten 4.0.0 and the SQLite build process introduces a JavaScript syntax error in the generated JS/WASM glue code, so the 3.48 wasm download was just regenerated with an older Emscripten.

  2. Anyone trying to do a "dist" wasm build of any SQLite version prior to this one using Emscripten 4.0.0 will encounter this problem. Builds using Emscripten versions 3.1.7x are unaffected by this problem.

Upstream ticket: https://github.com/emscripten-core/emscripten/issues/23412

Thank you to Thomas Steiner, maintainer of the npm package, for quickly catching this.

Edit: it turns out that this is caused by a combination of our comment stripper (applied only for "dist" builds) and a regex in emsdk 4.0.0's generated code which contains /*. Our comment-stripper then sees that as a C-style comment and strips out the regex halfway through. The comment-stripper has been disabled in the current trunk and 3.48 branches, which means that the JS code in future releases will be almost twice as large as before.

Edit: it also turns out that SQLite builds created with Emscripten 4.0.0 are broken due to as-yet-undetermined Emscripten API changes which affect our bootstrapping of the module. Tracking those down is an immediate-priority TODO.

(2) By anonymous on 2025-01-15 16:30:04 in reply to 1.2 [link] [source]

there is likely something i am missing, but comment strip prior to compile would eliminate emscript regexp from being seen by the comment stripper. there must be some reason for stripping after emscript compile , yes? what am i missing?

(3) By Stephan Beal (stephan) on 2025-01-15 17:03:52 in reply to 2 [link] [source]

> there is likely something i am missing, but comment strip prior to compile would eliminate emscript regexp from being seen by the comment stripper.

When the comment stripper sees a regex like:

   var foo = /\/*/;

it will recognize that as the start of a /* comment and happily strip until a matching */ is found. The stripper is aware of string literals in all of the forms JS supports, so will not mangle string literals which contain /*, but it's not aware of JS regexes (and shouldn't be because this tool is also used for stripping C, C++, and whatever else i have which needs comment-stripping).
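To make the failure concrete, here is a minimal sketch of a stripper that understands string literals but not regex literals. It is an illustration only, not the project's actual tool: the function name `stripBlockComments` and its structure are assumptions, and it handles only `/* ... */` comments.

```javascript
// Hypothetical sketch: string-aware, regex-unaware block-comment stripper.
// This is NOT the tool used by the SQLite build; names are made up.
function stripBlockComments(src) {
  let out = "";
  let i = 0;
  while (i < src.length) {
    const c = src[i];
    if (c === '"' || c === "'" || c === "`") {
      // Copy a string literal verbatim, honoring backslash escapes.
      let j = i + 1;
      while (j < src.length && src[j] !== c) {
        if (src[j] === "\\") j++;
        j++;
      }
      out += src.slice(i, j + 1);
      i = j + 1;
    } else if (c === "/" && src[i + 1] === "*") {
      // Looks like a comment opener: skip through the next "*/".
      const end = src.indexOf("*/", i + 2);
      i = end < 0 ? src.length : end + 2;
    } else {
      out += c;
      i++;
    }
  }
  return out;
}

const input = "var foo = /\\/*/;\nvar bar = 1; /* real comment */\nvar baz = 2;";
console.log(stripBlockComments(input));
// The regex is cut off after "/\" and the whole "var bar" line is lost,
// because the "/*" inside the regex is paired with the "*/" of the real comment.
```

A string literal such as `"/* kept */"` passes through this sketch untouched, which matches the behavior described above for strings.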

We've found a workaround for the above case but it's not yet been applied: we'll simply ignore / characters which are preceded by a backslash.
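A rough sketch of that workaround, under the same assumptions as any toy stripper (the name `stripBlockCommentsEscaped` is hypothetical and this is not the project's real code): a `/*` is only treated as a comment opener when the `/` is not immediately preceded by a backslash.

```javascript
// Hypothetical sketch of the backslash workaround; not the actual SQLite tool.
function stripBlockCommentsEscaped(src) {
  let out = "";
  let i = 0;
  while (i < src.length) {
    const c = src[i];
    if (c === '"' || c === "'" || c === "`") {
      // Copy a string literal verbatim, honoring backslash escapes.
      let j = i + 1;
      while (j < src.length && src[j] !== c) {
        if (src[j] === "\\") j++;
        j++;
      }
      out += src.slice(i, j + 1);
      i = j + 1;
    } else if (c === "/" && src[i + 1] === "*" && src[i - 1] !== "\\") {
      // Comment opener, but only if the slash is not backslash-escaped.
      const end = src.indexOf("*/", i + 2);
      i = end < 0 ? src.length : end + 2;
    } else {
      out += c;
      i++;
    }
  }
  return out;
}

const input = "var foo = /\\/*/; /* gone */ var x = 1;";
console.log(stripBlockCommentsEscaped(input));
// The regex /\/*/ survives intact and only the real comment is removed.
```

This only covers regexes in which the slash before the `*` is escaped. It is a heuristic, not a JavaScript tokenizer.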

> there must be some reason for stripping after emscript compile , yes?

We need to retain certain code comments and we use a custom tool which does that for us. We very specifically avoid depending on Emscripten-specific behaviors, insofar as possible, including code minification, so that we can remain portable for use with hypothetical future non-Emscripten toolchains, should one ever actually appear.

> what am i missing?

For more info, please see the comments on the Emscripten ticket linked to above.

(4) By Bo Lindbergh (_blgl_) on 2025-01-15 18:52:38 in reply to 3 [link] [source]

Checking for \ is insufficient. Contrived example:

   var foo = /foo/*2;
   var bar = /bar//2;

Doing arithmetic on a regex gives you NaN, but it's still syntactically valid.
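The point can be checked with a short sketch. The `opener` pattern below is an assumed stand-in for a "comment opener unless preceded by a backslash" heuristic, not anyone's real implementation. It shows that both statements are valid JavaScript and that the heuristic still flags comment openers in them.

```javascript
// The two contrived statements, held as source text.
const src = "var foo = /foo/*2;\nvar bar = /bar//2;";

// Both parse and run: a regex multiplied or divided by 2 is NaN.
const [foo, bar] = new Function(src + "\nreturn [foo, bar];")();
console.log(Number.isNaN(foo), Number.isNaN(bar)); // true true

// Assumed heuristic: a slash followed by "*" or "/" opens a comment
// unless a backslash comes right before it.
const opener = /(?<!\\)\/[*\/]/g;
const hits = src.match(opener);
console.log(hits.length); // 2: one false block-comment opener, one false line-comment opener
```

Neither false opener is preceded by a backslash, so the backslash check does not help with this input.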

(5) By Stephan Beal (stephan) on 2025-01-15 21:21:12 in reply to 4 [link] [source]

> Checking for \ is insufficient. Contrived example:

That's absolutely true but we typically have complete control over the inputs for this process (with one exception - the one which Emscripten generates, which we tripped over today), so we can live with a little bit of "it might break in this or that corner case." This particular comment stripper is 10 or 12 years old and this is the first time i've run into this oddity with it, so i'm hoping it will be another 10 years before it breaks again. (A memorable oddity it did trip over was when i learned that /*/ is legal and some people actually use that: (1 + /*/2+/*/ 3) is 4.)
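The `/*/` case is easy to verify. The sketch below (the variable names are arbitrary) evaluates the expression and shows why a stripper has to begin its search for the closing `*/` two characters past the opener rather than one.

```javascript
// "/*/2+/*/" is a single block comment, so this is 1 + 3.
const value = (1 + /*/2+/*/ 3);
console.log(value); // 4

// Stripping the same text by hand: search for the closer starting
// after both characters of the opener.
const text = "(1 + /*/2+/*/ 3)";
const open = text.indexOf("/*");
const close = text.indexOf("*/", open + 2);
const stripped = text.slice(0, open) + text.slice(close + 2);
console.log(stripped); // "(1 +  3)"

// Starting the search at open + 1 would instead treat "/*/" as a
// complete comment and leave "2+/*/ 3)" behind.
```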

Worst case, we'll just have to remove comment-stripping again. We need the licensing-related headers left in place (which our custom stripper does).