SQLite Forum

WASM OPFS-sahpool VFS filename bug
Login

WASM OPFS-sahpool VFS filename bug

(1) By franztesca on 2023-07-23 21:14:25 [source]

I was playing out with the brand new OPFS-sahpool VFS in SQLite/WASM and I found a bug: When an OPFS file is put back in the SahPool and is later reused for another file, the new name of the associated path is encoded and written to the file. However, when the path name is encoded, it doesn't include the 0x0 termination char in the encoded path name, potentially becoming cluttered by past associated names.

For example, let's say we have an OPFS pool file, first associated with someexamplename.sqlite we then delete the path in the VFS, meaning the OPFS file is returned to the pool. If that OPFS is bound to a shorter name, let's say hello.sqlite, then its encoded name in the OPFS file will be hello.sqliteame.sqlite, where ame.sqlite is from the previous name ending.

This has a lot of side effects, such as the database not being found after the page is refreshed, the sah pool getting exhausted, or even creating two OFPS files with the same encoded name, breaking assumptions and causing crashes.

I know 3.43 is not released, just reporting early feedback.

(2) By Stephan Beal (stephan) on 2023-07-23 21:36:53 in reply to 1 [link] [source]

However, when the path name is encoded, it doesn't include the 0x0 termination char in the encoded path name, potentially becoming cluttered by past associated names.

... just reporting early feedback.

Better now than after the release, so thank you for the report (and please keep them coming!). Keep an eye on the trunk and it will be patched right after the dogs are walked.

(3) By Stephan Beal (stephan) on 2023-07-23 22:18:48 in reply to 1 [link] [source]

This has a lot of side effects...

That's now fixed in the trunk. Thank you again for reporting it and please continue to report any issues.

Sidebar: Roy's original impl doesn't have that problem because he uses scope-local buffers, which start out zeroed out, whereas i chose to reuse a buffer for the life of the object. Side effects like that may well have been (just speculating) the reason he chose not to reuse the buffer.

(4) By franztesca on 2023-07-24 14:57:17 in reply to 3 [link] [source]

Thanks for the prompt fix :wow:!

We are considering using opfs-sahpool in production as soon as it is released, or even before through a custom patch. I'll definitely post anything we find!

I'm currently using the npm package @sqlite.org/sqlite-wasm as the scaffold to test opfs-sahpool changes. Currently, I have a problem where if I copy the output of the sqlite build directly in there, I get the following webpack error:

ERROR in ../../../node_modules/@sqlite.org/sqlite-wasm/sqlite-wasm/jswasm/sqlite3-bundler-friendly.mjs 18517:27
Module parse failed: Duplicate export 'default' (18517:27)
You may need an appropriate loader to handle this file type, currently no loaders are configured to process this file. See https://webpack.js.org/concepts#loaders
|   return globalThis.sqlite3InitModule /* required for ESM */;
| })();
> export { toExportForESM as default, toExportForESM as sqlite3InitModule }
| 
Effectively in the produced sqlite3-bundler-friendly.mjs there are two exports for default, by removing the first one it works :/ I'm not familiar with JS modules systems. Is this an intended behavior?

Thanks

(5) By Stephan Beal (stephan) on 2023-07-24 15:26:47 in reply to 4 [link] [source]

Effectively in the produced sqlite3-bundler-friendly.mjs there are two exports for default, by removing the first one it works :/ I'm not familiar with JS modules systems. Is this an intended behavior?

It was intended but apparently misled. My hope was to avoid inadvertently exporting the symbol toExportForESM and the above works in browsers, providing both the explicit sqlite3InitModule and default which resolves to the same thing (which was the goal of it). In any case, i'll roll that change back and rely only on "default".

(6) By Stephan Beal (stephan) on 2023-07-24 15:43:24 in reply to 4 [link] [source]

I get the following webpack error:

Please try the current trunk and let us know if it works for you. Thank you for the report.

(7) By Roy Hashimoto (rhashimoto) on 2023-07-24 21:04:04 in reply to 4 [link] [source]

We are considering using opfs-sahpool in production as soon as it is released, or even before through a custom patch.

As I understand things, the format of this VFS OPFS files is not yet final. You might want to get some guidance on that before deploying it to production.

(8) By franztesca on 2023-07-24 21:11:26 in reply to 6 [link] [source]

It still doesn't work. If you compile sqlite/wasm for dev (cd ext/wasm && make) and then open the create artifact jswasm/sqlite3-bundler-friendly.mjs you will see that there are two top-level statements export default sqlite3InitModule;, which I guess are the problem (there should be only one default export I guess).

Deleting one of them it fixes the issue. Can you replicate it?

(9) By franztesca on 2023-07-24 21:30:08 in reply to 7 [link] [source]

As I understand things, the format of this VFS OPFS files is not yet final.

Is it because of the file transparency? Or there is something more fundamentally broken?

We are following this effort closely as we would like to have SQLite (with durability) working in the browser ASAP. Are there going to be major changes in the VFS file format before the release (e.g. due to transparency)? From the documentation I read: "No filesystem transparency, i.e. names clients assign their databases are different than the names this VFS stores them under."

Our current plan is to wait for the official SQLite release, but given overlapping timelines we may need to adopt this solution sooner, potentially using a custom patch. It would be nice to have some information about the plans of this effort @stephan Thank you both!

(10) By Stephan Beal (stephan) on 2023-07-25 06:46:23 in reply to 8 [link] [source]

Can you replicate it?

No. My build comes out with the following two lines:

sqlite3InitModule = toExportForESM;
export default sqlite3InitModule;

and grepping for ^export find only that last line.

:-?

(11) By Stephan Beal (stephan) on 2023-07-25 06:55:33 in reply to 9 [link] [source]

As I understand things, the format of this VFS OPFS files is not yet final.

Is it because of the file transparency? Or there is something more fundamentally broken?

We're not aware of anything fundamentally broken but until the release it made, "tweaks may happen" without warning as the mood strikes or the need arises. Once the release it made then we have strong backwards compatibility requirements, so any changes would either involve a transparent/automatic upgrade process or would be opt-in via a configuration option or VFS flag.

We are following this effort closely as we would like to have SQLite (with durability) working in the browser ASAP.

That's available in the current OPFS VFS.

Are there going to be major changes in the VFS file format before the release (e.g. due to transparency)?

That's as yet undecided/unknown, and will remain so until we go into "pencils-down" mode in the week or two preceding the release (at which point only bug-fix changes are permitted).

I read: "No filesystem transparency, i.e. names clients assign their databases are different than the names this VFS stores them under."

It means that the VFS uses random filenames and maps the names the user provides to those filenames. The API-level user uses whatever names they want (they don't see the random names), but internally, in the filesystem itself, they're random names. The randomly-named files are also not valid database files as-is because they have a 4kb header which contains metadata for the VFS itself. That's all under-the-hood details, and won't affect users unless users try to copy a database to OPFS and hope it will work (it currently won't).

The hope was to add a level of transparency to it before the release, but conflicting project priorities are making that unlikely (but also not ruled out).

It would be nice to have some information about the plans of this effort

See above!

(12) By Stephan Beal (stephan) on 2023-07-25 07:38:45 in reply to 8 [link] [source]

Can you replicate it?

Still no but i do have a suspicion: the build process necessarily removes an export which emscripten installs (with no option to tell it not to). My guess is that you're building on a platform where where "sed -i" and/or "grep -q" are not supported.

The makefile snippet in question:

########################################################################
# SQLITE3.xJS.EXPORT-DEFAULT is part of SQLITE3-WASMFS.xJS.RECIPE and
# SETUP_LIB_BUILD_MODE, factored into a separate piece to avoid code
# duplication. $1 is 1 if the build mode needs this workaround (esm,
# bundler-friendly, node) and 0 if not (vanilla). $2 must be empty for
# all builds except sqlite3-wasmfs.mjs, in which case it must be 1.
#
# Reminder for ESM builds: even if we use -sEXPORT_ES6=0, emcc _still_
# adds:
#
#   export default $(sqlite3.js.init-func);
#
# when building *.mjs, which is bad because we need to export an
# overwritten version of that function and cannot "export default"
# twice. Because of this, we have to sed *.mjs to remove the _first_
# instance (only) of /^export default/.
#
# Upstream RFE:
# https://github.com/emscripten-core/emscripten/issues/18237
define SQLITE3.xJS.ESM-EXPORT-DEFAULT
if [ x1 = x$(1) ]; then \
		echo "Fragile workaround for emscripten/issues/18237. See SQLITE3.xJS.RECIPE."; \
		sed -i -e '0,/^export default/{/^export default/d;}' $@ || exit $$?; \
		if [ x != x$(2) ]; then \
			if ! grep -q '^export default' $@; then \
				echo "Cannot find export default." 1>&2; \
				exit 1; \
			fi; \
		fi; \
fi
endef

If the -i is the problem then that can be worked around easily enough. You can confirm that by invoking sed --help and see if it mentions the -i option. (The wasm build is only known to work properly on Linux-like environments running GNU versions of the common Unix tools.)

(13) By franztesca on 2023-07-25 12:15:31 in reply to 12 [link] [source]

I'm building from Mac OS 13.4.1 (22F82). Both sed -i and grep -q seem to be available:

usage: grep [-abcdDEFGHhIiJLlMmnOopqRSsUVvwXxZz] [-A num] [-B num] [-C[num]] [-e pattern] [-f file] [--binary-files=value] [--color=when] [--context[=num]] [--directories=action] [--label] [--line-buffered] [--null] [pattern] [file ...]

usage: sed script [-Ealnru] [-i extension] [file ...] sed [-Ealnu] [-i extension] [-e script] ... [-f script_file] ... [file ...]

The thing is I don't see any error or warning when building...

It's not really important for us, we can patch manually, and I guess it's out of scope since we are not on linux.

(14) By Stephan Beal (stephan) on 2023-07-25 12:20:20 in reply to 13 [link] [source]

-i extension

Your -i flag works differently than GNU's does:

 -i[SUFFIX], --in-place[=SUFFIX]
   edit files in place (makes backup if SUFFIX supplied)

So yours is swallowing the argument which follows the -i.

i'll rework that to not need the -i flag. (It requires writing to a temp file, then overwriting the original.) Keep an eye on the trunk for a fix within the next hour or so.

(15) By franztesca on 2023-07-25 12:30:36 in reply to 11 [link] [source]

Once the release it made then we have strong backwards compatibility requirements, so any changes would either involve a transparent/automatic upgrade process or would be opt-in via a configuration option or VFS flag.

This is definitely interesting to us. But if nothing is broken, depending on other app requirements we may even release before with a patch and eventually deal with the breaking changes later. Thanks!

That's available in the current OPFS VFS.

We investigated that, but cross-origin isolation is a pain for our app. We sometimes load cross-origin third-party resources that may not have CORP, and we definitely use embedded cross-origin iframes, which are completely broken by cross-origin isolation. The current usable solutions (COEP: credentialless and anonymous/credentialless iframes are still WIP and may be experimental on Chromium browsers only). We decided not to make our app cross-origin isolated for now.

It means that the VFS uses random filenames and maps the names the user provides to those filenames. The API-level user uses whatever names they want (they don't see the random names), but internally, in the filesystem itself, they're random names. The randomly-named files are also not valid database files as-is because they have a 4kb header which contains metadata for the VFS itself. That's all under-the-hood details, and won't affect users unless users try to copy a database to OPFS and hope it will work (it currently won't).

We are aware. We actually need to download and import SQLiteDB, but we do that by downloading in memory and importing using the convenient OPFS-sahpool utility function. We are a bit concerned by memory pressure and the long-blocking of the worker during the writing if the file is big (>100Mb). It would be nice to have either transparency (to let us customize our importing function) or an async stream-based importing function for our requirements...

Thanks for the info!

(16) By Stephan Beal (stephan) on 2023-07-25 12:56:50 in reply to 15 [link] [source]

It would be nice to have either transparency (to let us customize our importing function) or an async stream-based importing function for our requirements...

At this point transparency before the release is looking unlikely due to conflicting higher-priority project pressures1, but a streaming API which writes in chunks which are provided to it by the client via a callback is a realistic/reasonable feature. i can't yet promise that it will read from the client asynchronously, but will strive to make it so.


  1. ^ That's not to say that we're guaranteeing that transparency will ever be added to the sahpool VFS. It's a nice idea but not a strictly necessary feature and there is seemingly no end of higher-priority things to do. Admittedly, i was initially convinced that it was a necessary feature until actually trying it out, at which point i just had to shrug my shoulders and wonder what all my earlier fuss about filename transparency had been about.

(17) By Roy Hashimoto (rhashimoto) on 2023-07-25 14:10:24 in reply to 16 [link] [source]

At this point transparency before the release is looking unlikely due to conflicting higher-priority project pressures.

I think that if you are truly considering filesystem transparency as a future upgrade, then you should make metadata storage compatible with that now. Releasing the VFS in its current state will make an upgrade very difficult.

Once the release it made then we have strong backwards compatibility requirements, so any changes would either involve a transparent/automatic upgrade process

An upgrade that migrates the format will be quite difficult to make fully transparent and automatic. The reason is that for large databases the migration process could take minutes (so must happen in the background to be transparent and automatic), and must handle the case where free storage space is smaller than the database currently uses, as well as preserving data in case the migration process is interrupted. Without a migration process, implementing a new format either with a switch or a new separate VFS (i.e. not an upgrade) will leave existing applications and their OPFS stores behind.

I think that if you want to leave the door open for a filesystem transparency upgrade but don't have time even to lay the groundwork now (by moving the metadata, which might take a day of coding), then it makes more sense to put off release until a later SQLite version. I don't understand the pressure to release it as soon as possible even though it is not a development priority - perhaps those pressures do exist - but if that happens then as a practical matter you are effectively guaranteeing a filesystem transparency upgrade will never happen. And that's fine as long as everyone recognizes and understands that.

(18) By Stephan Beal (stephan) on 2023-07-25 14:54:45 in reply to 17 [link] [source]

... but if that happens then as a practical matter you are effectively guaranteeing a filesystem transparency upgrade will never happen. And that's fine as long as everyone recognizes and understands that.

If i had to bet on it right now i'd bet against it ever happening simply on grounds of perceived development cost vs perceived practical benefit, but the truth is that when the muse strikes, there's no ignoring her :). It's undeniable that she's the one who's really in charge of what gets coded and what doesn't ;).

(19) By Roy Hashimoto (rhashimoto) on 2023-07-25 15:09:37 in reply to 18 [link] [source]

I'll just note that the the practical benefit you perceive has fluctuated wildly over the course of this discussion. Are you sure that won't change again, ever? In the meantime, releasing in this state will dramatically raise the development cost, perceived or otherwise. If SQLite manages development by muse, that's okay I guess, but you are tilting the scale of costs and benefits for the future based on what the muse is telling you now.

(20) By Stephan Beal (stephan) on 2023-07-25 15:45:27 in reply to 19 [link] [source]

I'll just note that the the practical benefit you perceive has fluctuated wildly over the course of this discussion.

Absolutely, it's rotated 180 degrees. Two weeks ago i was convinced that it was required for sane operation but now, for this particular VFS, i see little genuine benefit to it compared to the time its development would take from other pending tasks.

Are you sure that won't change again, ever?

Not even remotely :).

In the meantime, releasing in this state will dramatically raise the development cost, perceived or otherwise.

The possibility of adding a separate VFS to address transparency admittedly makes the matter seem less pressing than it otherwise would. Hopefully the pending JSPI-based VFS will give us the best of both worlds, and that's one of the tasks which is preempting current changes to the SAH pool.

... but you are tilting the scale of costs and benefits for the future based on what the muse is telling you now.

That's an inherent down-side of muse-steered development, but coding without direction from a muse is generally dull and tiring :( instead of joyful and fulfilling :). A bored and tired me is an uninspired and unproductive me!

(21) By franztesca on 2023-07-25 21:23:05 in reply to 14 [link] [source]

Thanks! It still doesn't work, apparently, sed posix doesn't support replacement of the first occurrence just like GNU sed does.

This awk solution works on my Mac though (and maybe on linux too):

awk '/export default/ && !f{f=1; next} 1' $@ > $@.tmp && mv $@.tmp $@;

(22) By Stephan Beal (stephan) on 2023-07-25 21:34:15 in reply to 21 [link] [source]

This awk solution works on my Mac though (and maybe on linux too):

Thank you for that. i'm away from the computer for the night but will get that patched first thing tomorrow morning.

(23) By Roy Hashimoto (rhashimoto) on 2023-07-26 00:34:46 in reply to 20 [link] [source]

The possibility of adding a separate VFS to address transparency admittedly makes the matter seem less pressing than it otherwise would. Hopefully the pending JSPI-based VFS will give us the best of both worlds, and that's one of the tasks which is preempting current changes to the SAH pool.

JSPI is going to fix your problems in connecting synchronous calls out from SQLite to asynchronous VFS functions. But...

JSPI is going to create problems for you by returning Promises on your calls in to SQLite. This is something you consider "arguably silly":

Just to be clear: that's only true if someone makes the arguably silly decision to implement step as async. This project's JS interface uses only synchronous APIs for db activity.

Barring another 180 here, I don't see how JSPI will be your panacea.

(24) By Stephan Beal (stephan) on 2023-07-26 08:09:56 in reply to 23 [link] [source]

JSPI is going to create problems for you by returning Promises on your calls in to SQLite.

JSPI effectively turns the promises into blocking/delayed returns. From the jspi docs:

The JSPI is an API that bridges the gap between synchronous applications and asynchronous Web APIs. It does so by suspending the application when it issues a synchronous API call and resuming it when the asynchronous I/O operation is completed.

The result that the sqlite internals get is that of the resolved promise, not the promise itself. That doesn't seem to be made clear in the docs, but that it's what its developers assure me is the case.

(25) By franztesca on 2023-07-26 10:27:39 in reply to 16 [link] [source]

Other feedback: in our app, we have a logic to open (or create if not exist) a database. However, it's hard to decide on a safe minimum capacity for the VFS pool with the current API.

For example, assume we are in the case where we need to open (or create) a db and there are already 6 files in the VFS (getFileCount). Potentially the minimum VFS capacity should be:

  • 6 if the file already exists
  • 8 if the file doesn't exist (2 files for a db because one is for journal or WAL)

However, there is the possibility that the existing files are databases for which sqlite can create an additional file (because of journal/wal). So if there are 6 files currently and we need to open a new database, the worst case scenario that could happen is that SQLite will later create 6 more files for the existing 6 databases (for journal, wal) plus 2 new ones for the new database (6 * 2) + 2 = 14! So we set minimum capacity to 14!

In our app, we use mainly 4 databases (even though they can grow beyond that), which usually end up in 8 files. Thus we use a minimum capacity of 18 files when opening a database after a page refresh (to account for the worst scenario, where those 8 files are databases without journals, different from the 4 we want to open)! We can probably do better by keeping count of the open databases (probably needing only files + open databases + 2 = 12/14) on the app side. However, it would be much more convenient to:

  • have something more efficient than this: e.g. by exposing a function that returns the names of files of the VFS, we can be smarter: there are 6 files, 3 are db and 3 are corresponding journals, then if I need to open a database the minimum capacity is 6 (if db exists) or 8 (if db doesn't exist)
  • have this logic directly in the SahPoolUtil: e.g. ensureSpaceForDB(name?): Promise<Unit>

The former should be super easy to implement AFAIS.

(26.1) By Stephan Beal (stephan) on 2023-07-26 11:12:32 edited from 26.0 in reply to 25 [link] [source]

However, it's hard to decide on a safe minimum capacity for the VFS pool with the current API.

That's an inherent, and unavoidable, property of this VFS.

(2 files for a db because one is for journal or WAL)

Sidebar: WAL isn't supported in wasm because it lacks the required shared memory APIs.

by exposing a function that returns the names of files of the VFS...

i don't see any reason we can't do that.

Edit: that's been added as getFileNames().

have this logic directly in the SahPoolUtil: e.g. ensureSpaceForDB(name?)

A name is completely meaningless for the VFS. It does not, without 100% guessing, know that any given name represents a db.

i'm not convinced that such guesswork belongs in the VFS, as library-level guesswork based on client-specified filenames would be fragile, but feel free to try to convince me.

(27) By franztesca on 2023-07-26 14:08:56 in reply to 26.1 [link] [source]

Thanks!

A name is completely meaningless for the VFS. It does not, without 100% guessing, know that any given name represents a db.

i'm not convinced that such guesswork belongs in the VFS, as library-level guesswork based on client-specified filenames would be fragile, but feel free to try to convince me.

Realistically there are not many other use cases for this VFS except for sqlite. The VFS itself uses flags of SQLite to determine if a file should be persisted or not.

const PERSISTENT_FILE_TYPES =
        capi.SQLITE_OPEN_MAIN_DB |
        capi.SQLITE_OPEN_MAIN_JOURNAL |
        capi.SQLITE_OPEN_SUPER_JOURNAL |
        capi.SQLITE_OPEN_WAL
Using the same flags, it could determine which files are databases and which are journals/wal/db.

That's an inherent, and unavoidable, property of this VFS.

I'm not sure it's unavoidable given a SQLite-aware VFS and an async file opening logic.

I'm not necessarily pushing for this API to be at the VFS level, it could be a WASM-specific API like the ones for kvvfs sqlite3_js_kvvfs_

Maybe sqlite3_js_opfssahpool_open_db_safely(...): Promise<...> where the function uses the underlying VFS to know if the file already exists, if it has already an allocated journal/wal file too (leveraging the SQLite flags mentioned above), and depending on that increases the capacity if needed. Potentially the OpfsSahPool utility could even become an implementation detail of this function and not be exposed to the user (there is no real reason to manually decide the capacity if this can be managed autonomously by sqlite lib). This would make it much more user-friendly IMHO.

I'm happy with getFileNames() for my use case, I can implement the logic app side.

(28) By Roy Hashimoto (rhashimoto) on 2023-07-26 14:31:31 in reply to 24 [link] [source]

The result that the sqlite internals get is that of the resolved promise, not the promise itself.

Please read my post again. You are focusing on what happens when WASM calls back into Javascript. That is the part that will work great for you.

What happens where Javascript calls into WASM? It's this (from the same doc you're quoting, emphasis mine):

The JSPI works by intercepting the Promise returned from an asynchronous API call, suspending the main logic of the WebAssembly application, and returning a Promise from the export that was used to enter the WebAssembly.

This is the part where either you'll say, "of course sqlite3_step() (and other calls) should be asynchronous," or "JSPI isn't a match for SQLite."

JSPI is like a magic trick. You're marveling at the cool stuff happening at the top of the WASM stack, and you're not paying attention to what you don't want to see at the bottom of the WASM stack.

(29) By Stephan Beal (stephan) on 2023-07-26 15:49:00 in reply to 27 [link] [source]

Realistically there are not many other use cases for this VFS except for sqlite.

The vfs will effectively delete any files except for sqlite db/journal files when it starts up, so it's fair to say that db files and journals are the only files it supports.

The VFS itself uses flags of SQLite to determine if a file should be persisted or not.

That's a detail i had not considered. We could, after we open all of the handles, go through and read all of their flags, count the number of db files, and then ensure that the capacity is a minimum of double that number, however...

  • Being able to guess the number of file handles needed only works once your databases are already in storage, i.e. the storage has already been appropriately sized by the client.

  • That ^^^ size is persistent. Having the VFS go through the trouble of calculating what "should" be, at most, that same value seems superfluous.

  • Recall, too, that sqlite may create temporary files for purposes other than journals if its TEMP_STORE setting permits it. A simple (db count times two) heuristic only suffices if the TEMP_STORE is set such that sqlite is prohibited from creating temp files. That's a per-db client-level setting which the VFS has no insight into (the setting can't be set until after the VFS xOpen()s a db). Whether or not it will create extra temp files depends entirely on how the db is actually used, and only the client knows how the db is used. (Even then, the client may not realize they're doing something which might trigger creation of a temp file, as those decisions are sqlite-internal details.)

In short, the VFS could only guess and has less information about how to do so than the client does.

Potentially the OpfsSahPool utility could even become an implementation detail of this function and not be exposed to the user

i'm admittedly not terribly happy about exposing the pool utilities, but it is a necessary evil required for adding an sqlite3.oo1-based wrapper and for storage maintenance, e.g. deleting individual files, importing/exporting dbs, or removing all traces of the VFS from OPFS (the alternative being to strand the files there forever).

The alternative to exposing the pool utils would be to create a bunch of sqlite3_js_opfssah...() or sqlite3.opfsSah.some_func() APIs, which is even less appealing in terms of interface (see below for why).

One of the tradeoffs of this VFS is that the client has to make a decision about how much capacity the VFS will have. There's no fool-proof way that it can guess a best-fit value. To the best of my knowledge there is no fundamental problem with over-allocating 5 or 10 or 20 handles, but doing so is not a decision the library has any business making on the client's behalf.

(there is no real reason to manually decide the capacity if this can be managed autonomously by sqlite lib). This would make it much more user-friendly IMHO.

If acquisition of files was synchronous, it could be managed automatically, but OPFS makes that asynchronous.

Maybe sqlite3_js_opfssahpool_open_db_safely(...)...

During initial development i plugged in new functions and classes in that way for this purpose but they ended up being extremely clumsy to use because the VFS supports multiple distinct instances. For example, we ended up with:

let db = new sqlite3.oo1.OpfsSahPoolDb['opfs-sahpool']();
db = sqlite3.oo1.OpfsSahPoolDb['my-opfs-sahpool']();
db = sqlite3.oo1.OpfsSahPoolDb['another-sahpool']();
sqlite3.capi.sqlite3_js_opfs_sahpool_unlink['yet-another-vfs'](...);
sqlite3.opfsSahPool.unlink['yet-another-vfs'](...);

// alternately, but nearly as ugly:
db = sqlite3.oo1.OpfsSahPoolDb('my-opfs-sahpool', ...);
sqlite3.capi.sqlite3_js_opfs_sahpool_unlink('yet-another-vfs', ...);
sqlite3.opfsSahPool.unlink('yet-another-vfs', ...);

(Sidebar: the reason multiple instances are supported is so that multiple pages within a single HTTP origin can use this VFS concurrently, albeit each with their own OPFS sandbox.)

The far more usable approach turned out to be passing the pool utility wrapper to the client who installed the VFS, giving them a pool instance which is specific to that one VFS instance. They're of course encouraged to squirrel that instance away somewhere "more globally" if needed, e.g. sqlite3.MyVfs.

I'm happy with getFileNames() for my use case, I can implement the logic app side.

Great, i'm glad to hear that :). If you come up with snippets which might be of general use, please post them here and we can either integrate them into the API, where appropriate, or add them to the docs.

(30) By Stephan Beal (stephan) on 2023-07-26 16:37:48 in reply to 28 [link] [source]

What happens where Javascript calls into WASM? It's this (from the same doc you're quoting, emphasis mine):

The JSPI works by intercepting the Promise returned from an asynchronous API call, suspending the main logic of the WebAssembly application, and returning a Promise from the export that was used to enter the WebAssembly.

What little i know of JSPI is based primarily on discussions with one of its developers, and that direction of the equation has never come up (and i've rather naively assumed that that direction is unaffected). i don't currently know enough with certainty to comment on that.

This is the part where either you'll say, "of course sqlite3_step() (and other calls) should be asynchronous," ...

Async is, in the context of binding a 100% synchronous C API, a necessary evil to be used only where 3rd-party integration (OPFS) makes it unavoidable (IMHO, of course).

... or "JSPI isn't a match for SQLite."

JSPI may well turn out to be a no-go for sqlite, for any number of reasons which i currently am not qualified to speculate on. In any case i'm obligated, per a long-standing agreement, to make a best-effort attempt to figuring that out.

JSPI is like a magic trick. You're marveling at the cool stuff happening at the top of the WASM stack, and you're not paying attention to what you don't want to see at the bottom of the WASM stack.

Fair enough. At this point it's very much a "throw it all against the wall and see what sticks" kind of thing, with no commitment to keep it if not enough of it sticks in interesting ways. Work began on it today, but the lib completely fails to start up when compiled with the newly-required emcc flag for as-yet-undetermined reasons.

(31.1) By Roy Hashimoto (rhashimoto) on 2023-07-26 21:30:42 edited from 31.0 in reply to 29 [link] [source]

A simple (db count times two) heuristic only suffices [...]

Even with temp_store set to MEMORY, I don't think 2N (where N is the number of databases) is the right baseline.

First of all, if a transaction writes to multiple databases then you need a super-journal file, so add one if N>1.

Second, I'm not that familiar with statement journals, but the docs say that a statement journal is used under certain conditions, and statement journal location is not controlled by temp_store configuration (emphasis mine):

Whether or not temporary files other than the rollback, super, and statement journals are written to disk or stored only in memory depends on the SQLITE_TEMP_STORE compile-time parameter, the temp_store pragma, and on the size of the temporary file.

So I think the minimum heuristic is:

  • 3N if N=1
  • 3N + 1 if N>1

And a multiplier higher than 3 is needed with a temp_store setting that can put other temporary files on the VFS.

Otherwise I agree with Stephan that pool capacity is best specified by the application and while overallocation is probably not a problem, the library has no reasonable way to guess what the capacity should be.

Update: If you're worried about your pool size, your application can dynamically handle a SQLITE_CANTOPEN error by increasing the pool size and retrying the SQLite operation that produced the error (if it happens in sqlite3_step(), note that you should sqlite3_reset() your statement and restart from there).

(32) By franztesca on 2023-07-31 16:02:41 in reply to 31.1 [link] [source]

I see. My low knowledge of SQLite was making me assume that there can be only 2 files per database. If this is not true, then the solution clearly would not work, as the worst-case scenario would require an unrealistic number of files allocated for nothing.

Update: If you're worried about your pool size, your application can dynamically handle a SQLITE_CANTOPEN error by increasing the pool size and retrying the SQLite operation that produced the error (if it happens in sqlite3_step(), note that you should sqlite3_reset() your statement and restart from there).

That sounds reasonable, but requires us to make methods that are normally synchronous, asynchronous, adding some complexity.

For now, assuming that we are not using any fancy feature of SQLite and thus expect max 2 files per database, this should do the job:

function minimumSAHPoolCapacityFor(filename: string, existingFiles: string[]) {
  let capacityPotentiallyOccupied = 0;
  // By default, we need one slot for the DB and one for its journal.
  let capacityForNewDatabase = 2;
  const existingFilesSet = new Set(existingFiles);
  for (const existingFile of existingFiles) {
    if (!existingFile.endsWith('-journal') && !existingFilesSet.has(`${existingFile}-journal`)) {
      // There is a database file without its corresponding journal, so we should reserve one slot capacity for it.
      capacityPotentiallyOccupied++;
    }
    if (existingFile === filename) {
      // There is already our database, we don't need any more capacity for it. If we need some for the journal, is already accounted for in capacityPotentiallyOccupied.
      capacityForNewDatabase = 0;
    }
  }
  return existingFiles.length + capacityPotentiallyOccupied + capacityForNewDatabase;
}

async function openOrCreateDatabaseWithSAHPool(
  sqlite: SQLite3WithOpfsSAHPoolVFS,
  filename: string,
): Promise<...> {

  await opfsAccessMutex.runInMutex(async () => {
    const existingFiles = sqlite.sahPoolUtil.getFileNames();
    await sqlite.sahPoolUtil.reserveMinimumCapacity(minimumSAHPoolCapacityFor(filename, existingFiles));
    return new sqlite.oo1.DB(`file:${filename}?vfs=opfs-sahpool`, openMode);
  });
}

I'm concerned with the fact that this page mentions other temp files that seem outside of our control, such as "Statement Journal Files" and "Transient Indices". If those use an additional slot, the corresponding operation (or something else), could fail. WDYT? Is it actually risky?

(33) By Stephan Beal (stephan) on 2023-07-31 16:20:52 in reply to 32 [link] [source]

If those use an additional slot, the corresponding operation (or something else), could fail. WDYT? Is it actually risky?

The risk depends on compile-time settings (SQLITE_TEMP_STORE), the TEMP_STORE pragma, and how you use the database. Superjournals, for example, are only generated in conjunction with ATTACHed databases. If you set TEMP_STORE to 2 then it should use memory for all temp data except (per the doc you linked to) "rollback, super, and statement journals." WAL and shared-memory files don't exist in the wasm build. Until recently the canonical wasm build used SQLITE_TEMP_STORE 3 to force all temp stuff into memory but we recently, in conjunction with development of the new VFS, lowered it to 2 to give clients more control over where their temp stuff goes (see section 3 of www:/tempfiles.html).

Unfortunately, the only accurate answer is "it depends!"

Me, i'd over-allocate by a comfortable margin and not lose any sleep over it unless/until over-allocation proves to cause other problems.

(34) By franztesca on 2023-07-31 17:18:10 in reply to 4 [link] [source]

Another issue I found is that SQLite sometimes fails when using Chrome incognito or guest mode. Importing a database fails with: SQLITE_CORRUPT: sqlite3 result code 11: database disk image is malformed (I'm using the OO API). And during normal operations the DB starts failing with SQLITE_IOERR: sqlite3 result code 10: disk I/O error and then sqlite3_step() rc= 11 SQLITE_CORRUPT SQL = CREATE TABLE IF NOT EXISTS ...

The suspicion is that we may be hitting the quota limit for the storage, as we are around 100Mb. However, the quota reported by the web API is much higher {"quota":1873984234,"usage":105274869,"usageDetails":{"fileSystem":105250474,"indexedDB":24395}}.

Also, OPFS is not enabled in mozzilla private browsing.

With mozzilla we are happy as we can fallback to using an in-memory database when in private browsing. With Chrome we are quite concerned as there is no way to reliably detect incognito mode and this makes our application brittle as it can just stop working at some point, but not immediately.

In any case, an additional warning could be added to https://sqlite.org/wasm/doc/tip/persistence.md IMHO.

Do you see ways around this issue?

(35) By Stephan Beal (stephan) on 2023-07-31 17:57:09 in reply to 34 [link] [source]

In any case, an additional warning could be added to... persistence.md IMHO.

A good suggestion. i will add a new section about potential difference in incognito/guest modes.

Do you see ways around this issue?

Nope. Our JS code has little insight as yours regarding incognito mode (which is intentionally difficult to detect). Even if we recognize incognito mode, that info could not be used to change the library's behavior in any sensible way. All we can do is keep trying to write until storage runs out.

Now that you mention it, i recall reading about this long before we looked into OPFS...

https://www.bleepingcomputer.com/news/google/google-chrome-incognito-mode-can-still-be-detected-by-these-methods/

i don't know if that info is still remotely current but it may explain the potentially obfuscated quota info you're seeing.

(36) By franztesca on 2023-08-04 15:22:01 in reply to 35 [link] [source]

Another bug (or unexpected behavior) I found is that the importing database function will always be successful also when the database is not successfully imported. When I try to import a DB in an incognito window, sah.write(data) returns a well-unexpected 4294967288 even though the number of bytes I'm writing is lower (264175616). 4294967288 is actually 0xFFFFFFF8 (-8), it seems an error code for quota exceed, which is not even in the spec of the possible outputs of write. I'll file a bug report to Chromium. Don't know if you want to include some additional checks in the lib (e.g. written bytes == total bytes).

(37.2) By Stephan Beal (stephan) on 2023-08-04 20:32:40 edited from 37.1 in reply to 36 [link] [source]

Another bug (or unexpected behavior) I found is that the importing database function will always be successful also when the database is not successfully imported.

Thank you for the report. That was my fault - i naively expected an exception on write error1.

Can you please try this patch and see if that resolves the silent failure for you?

Index: ext/wasm/api/sqlite3-vfs-opfs-sahpool.c-pp.js
==================================================================
--- ext/wasm/api/sqlite3-vfs-opfs-sahpool.c-pp.js
+++ ext/wasm/api/sqlite3-vfs-opfs-sahpool.c-pp.js
@@ -877,12 +877,16 @@
         }
       }
       const sah = this.#mapFilenameToSAH.get(name)
             || this.nextAvailableSAH()
             || toss("No available handles to import to.");
-      sah.write(bytes, {at: HEADER_OFFSET_DATA});
-      this.setAssociatedPath(sah, name, capi.SQLITE_OPEN_MAIN_DB);
+      const nWrote = sah.write(bytes, {at: HEADER_OFFSET_DATA});
+      if(nWrote != n){
+        toss("Expected to write "+n+" bytes but wrote "+nWrote+".");
+      }else{
+        this.setAssociatedPath(sah, name, capi.SQLITE_OPEN_MAIN_DB);
+      }
     }
 
   }/*class OpfsSAHPool*/;

Edit: an equivalent patch has since been applied to both the import and export routines in the trunk.


  1. ^ Because that's what the docs imply

(38) By franztesca on 2023-08-04 22:01:57 in reply to 37.2 [link] [source]

I think it's a bug in the implementation, the write method should throw. I filed a bug on the chromium bug tracker for that. Thanks for the patch lib side, I'll try it on Monday

(39) By Stephan Beal (stephan) on 2023-08-04 22:54:50 in reply to 38 [link] [source]

I filed a bug on the chromium bug tracker for that. Thanks for the patch lib side,

Thank you for your continued testing and feedback - it's exceedingly helpful to have someone other than myself put it through its paces.

(40) By franztesca on 2023-08-09 09:29:08 in reply to 37.2 [link] [source]

Sorry for late response. I tested it and it works as intended! Thanks

(41.1) By Stephan Beal (stephan) on 2023-08-09 09:43:19 edited from 41.0 in reply to 40 [link] [source]

I tested it and it works as intended!

Fantastic - thank you for the confirmation! The more of these annoyances which we can discover and squash before the 3.43 release, the better :).

Edit: BTW, yesterday or the day before a section was added to the docs about the possibility of different behaviors and limitations in guest/incognito mode, per your suggestion.

(42) By Stephan Beal (stephan) on 2023-08-09 09:49:52 in reply to 23 [link] [source]

JSPI is going to create problems for you by returning Promises on your calls in to SQLite.

Follow-up: after discussions with the JSPI developers, the "long-tail" of changes JSPI imposes on client code makes it incompatible with the sqlite API. We're no longer pursuing that approach.

(It did, however, lead to interesting speculation that perhaps some wildly hypothetical sqlite4 should be event-driven in order to better fit into the async world.)

(43) By Roy Hashimoto (rhashimoto) on 2023-08-09 14:14:47 in reply to 42 [link] [source]

We're no longer pursuing [JSPI].

Perhaps that will motivate rethinking this:

The possibility of adding a separate VFS to address transparency admittedly makes the matter seem less pressing than it otherwise would. Hopefully the pending JSPI-based VFS will give us the best of both worlds, and that's one of the tasks which is preempting current changes to the SAH pool.

Just to be clear, I don't think you need to fully implement transparency for an initial release (I estimate that would take maybe a week of development). You only need to make your VFS compatible with adding transparency so developers will be able to migrate their users' stores if/when that ever happens. I think that should only take a day of effort.

(44) By franztesca on 2023-09-27 14:48:52 in reply to 16 [link] [source]

We started using SQLite/WASM OPFS-sahpool in production and the results are good, thank you for the great job!

However, we found already users that cannot load the full database in memory in order to import it. They are getting NotReadableError on Blob.arrayBuffer (the requested file could not be read, typically due to permission problems that have occurred after a reference to a file was acquired.) This should happen on files larger than 2GB (al least on Chrome), but we know that the size of the file the user is trying to import is around 300Mb (maybe they are low on memory?).

This is to say that an asynchronous API or some way to import the DB in chunks would be very useful, maybe for the next release. I think we are going to use the internals of the VFS to write a workaround that allows us to do that for now, but it'd be great to have it in the API.

Also it would be great to have a VFS API to move/rename a file: in our use case we are first downloading the file, then moving it at a given location in the VFS. Currently we download the file as blob, then when we want to move it we load the blob content and call the import database function to save it to the VFS at the desired location. Ideally we could just download the file directly to the VFS and move it (rename it) when needed.

(45) By Stephan Beal (stephan) on 2023-09-27 15:25:00 in reply to 44 [link] [source]

We started using SQLite/WASM OPFS-sahpool in production and the results are good, thank you for the great job!

The credit for that goes entirely to Roy Hashimoto. Without his work, and his blessing to clone it, it wouldn't exist.

However, we found already users that cannot load the full database in memory in order to import it. They are getting NotReadableError on Blob.arrayBuffer (the requested file could not be read, typically due to permission problems that have occurred after a reference to a file was acquired.) This should happen on files larger than 2GB (al least on Chrome), but we know that the size of the file the user is trying to import is around 300Mb (maybe they are low on memory?).

Those fall into the murky realm of browser quirks against which we have no direct insights or defenses :/. Each browser's limits are opaque to us, in the same way storage space and permissions are opaque to sqlite's C API. All we can do is try and hope there's space and permissions.

This is to say that an asynchronous API or some way to import the DB in chunks would be very useful,

i'm on a phone so can't readily confirm, but i thought we added that capability to importDb() right after the 3.43 release? It was implemented but i don't recall for certain whether it was merged (i'll check when i'm back on the computer).

Also it would be great to have a VFS API to move/rename a file: in our use case we are first downloading the file, then moving it at a given location in the VFS. Currently we download the file as blob, then when we want to move it we load the blob content and call the import database function to save it to the VFS at the desired location. Ideally we could just download the file directly to the VFS and move it (rename it) when needed.

Are you saying that you want to upload it into "someplace" in OPFS and then move it into the VFS after doing so? A "move" isn't directly possible in such a case because the VFS adds a 4kb header to each file, but we could copy-then-delete for that purpose as long as the temporary storage requirements (2-times the db size plus 4kb) don't exceed the OPFS limits and the VFS itself has a slot free.

(46) By franztesca on 2023-09-27 18:54:03 in reply to 45 [link] [source]

i'm on a phone so can't readily confirm, but i thought we added that capability to importDb() right after the 3.43 release? It was implemented but i don't recall for certain whether it was merged (i'll check when i'm back on the computer).

I'm checking in 3.43.1 I don't see them, but I see them in the repo, so I guess it's going to be available from the next release?

Are you saying that you want to upload it into "someplace" in OPFS and then move it into the VFS after doing so? A "move" isn't directly possible in such a case because the VFS adds a 4kb header to each file, but we could copy-then-delete for that purpose as long as the temporary storage requirements (2-times the db size plus 4kb) don't exceed the OPFS limits and the VFS itself has a slot free.

No, to write directly the DB to the VFS (e.g. using importDb) and later "move" the imported file to another a different filename (which is basically changing its name in the VFS). In this way, no copy of the file is needed.

Currently, instead, we have to copy the file because we first save it to a temporary location outside the VFS (raw OPFS or blob doesn't make much difference) and when we want to move it we copy it to the VFS using importDb.

(47.1) By Stephan Beal (stephan) on 2023-10-22 10:22:16 edited from 47.0 in reply to 46 [link] [source]

I'm checking in 3.43.1 I don't see them, but I see them in the repo, so I guess it's going to be available from the next release?

We generally consider the trunk to be our latest release ;). It won't be included in any 3.43.X patch releases because patch releases are intended to get only bug fixes, not new features. It will be in the 3.44 release.

No, to write directly the DB to the VFS (e.g. using importDb) and later "move" the imported file to another a different filename (which is basically changing its name in the VFS). In this way, no copy of the file is needed.

What's the justification for renaming it, as opposed to uploading it with the desired name in the first place? That sounds like unnecessary hoop-jumping.

A "move" in that particular VFS would essentially be a no-op: it would only overwrite a piece of the header in the target file (which has a random/opaque/immutable name in OPFS). (Edit: that's not entirely true: the hypothetical move operation would need to deal with the case that the target name is already in use.)

My hesitance (but not outright refusal) to adding move/rename/arbitrary file management APIs is based on none of the C-level VFSes offering such a thing. The C API is completely devoid of any such APIs which would ostensibly be useful from time to time, e.g. deleting or renaming files1. As the library has been devoid of such features since its inception, i'm hesitant to go against that design decision (which predates me by a full generation).

If you can justify to me why the development effort is worth the ability to rename the file (i.e. why is a rename necessary?), i can perhaps get over that reticence.

Thank you for your continued feedback - it is welcomed and valued.


  1. ^ Presumably because any such features would be necessarily VFS-specific (to match the VFS's storage, if any).

(48.1) By franztesca on 2023-09-29 08:32:06 edited from 48.0 in reply to 47.0 [link] [source]

My hesitance (but not outright refusal) to adding move/rename/arbitrary file management APIs is based on none of the C-level VFSes offering such a thing. The C API is completely devoid of any such APIs which would ostensibly be useful from time to time, e.g. deleting or renaming files1. As the library has been devoid of such features since its inception, i'm hesitant to go against that design decision (which predates me by a full generation).

I don't have a strong argument, if not that is cheap to implement and sometimes could be useful. In our use case we download the database and replace the one of the user on the flight when ready. Now, we cannot download it directly to the VFS, at the user location, because the download could stop and the app would remain in a broken state. So we have to first download somewhere (currently we use a blob) and move the file to the VFS the file when ready. If the VFS would support a move operation, we would download the file immediately to the VFS at a tmp location, and then just rename it to the user location, which would make the move operation cheap and safe, compared to a "read the content blob, write the content to the VFS", which is slow and can fail for reasons.

On another note, we got already two users (out of 2k) who started to get SQLITE_CORRUPT: sqlite3 result code 11: database disk image is malformed. Looking at the possible cause of corruption, I don't see any that we do in our web app that could cause that. One user is on Firefox 117, one on Safari 16.6. The users are not doing anything fancy with the DB (we use the same queries/code on SQLite desktop and we don't get corrupted). We just open the database with the default options, never close it, we have only one reference per database... Does any possible reason for corruption come to your mind?

Thank you

(49) By Stephan Beal (stephan) on 2023-09-29 09:52:07 in reply to 48.1 [link] [source]

In our use case we download the database and replace the one of the user on the flight when ready.

That sounds like a very reasonable thing to want to do in a web app.

What if we instead made a slight change to importDb() such that it could support the following workflow:

  1. You call importDb('/myapp.db', func), where func is a callback which returns the binary content of the db in chunks of arbitrary sizes (one chunk per call and it returns undefined at EOF). importDb() would internally use a temporary name during the import to enable the app to keep using /myapp.db.

  2. When func reaches the end of its input, before returning to importDb() it would close your current connection to /myapp.db (if any).

  3. importDb() receives your EOF notification and renames the being-imported temporary file to /myapp.db.

  4. Your app re-opens /myapp.db.

That sounds like it would be generic enough to serve your use case and would not require exposing the file-rename op to the public API. Folks who are importing to a new db, as opposed to replacing an in-use one, would simply skip the 2nd step.

Would that be a suitable solution for you?

On another note, we got already two users (out of 2k) who started to get SQLITE_CORRUPT: sqlite3 result code 11: database disk image is malformed. ... Does any possible reason for corruption come to your mind?

Nothing specific, no :(.

The VFS does not have any known cases where it will cause corruption "on its own," but OPFS is a black box and there are three(?) different implementations. Being black boxes, we unfortunately have zero insight into what happens between the time we call an OPFS API and the time the data reaches or arrives from the underlying storage.

That's not to just hand-wave the problem away, but without a reproducible example which demonstrates a bug on our end (which is of course a possibility!), there's not a whole lot we can do to resolve it.

Abstractly speaking, though, the laws of chance suggest that OPFS-housed data will occasionally fall victim to things like the split-second bad timing of a browser crash or force-close, power outages, or even interference from a virus scanners. There's a long path of electrons between the VFS and the physical storage, and many ways for things to fail (one of which, of course, is as-yet-unknown VFS bugs ;)).

(50) By franztesca on 2023-10-02 09:05:53 in reply to 49 [link] [source]

What if we instead made a slight change to importDb() such that it could support the following workflow:

In our use case we also do some additional stuff after the file is downloaded and before replacing it in-flight, so we end up having two components doing the two operations (download and import) separately. The callback-based API that you propose would require us to make those two components communicate with each other, where one starts the importing (the downloading one) and the other completes it (the importing one). We can probably do that with some promises and indirection, but sounds more cumbersome than a simple. Currently those two components "communicate" only by the fact that the first returns the download filename and the second is invoked with the download filename returned by the first.

The VFS does not have any known cases where it will cause corruption "on its own," but OPFS is a black box and there are three(?) different implementations. Being black boxes, we unfortunately have zero insight into what happens between the time we call an OPFS API and the time the data reaches or arrives from the underlying storage.

I'm all about augmenting the VFS as you can see :D I wonder whether it would be possible to add a way to get the errors generated within the VFS. Currently they are mapped to a generic SQLITE_IOERR and suppressed (logged to the console of the user), which makes it difficult to debug. Something like a "getLastError" or "setOnErrorCallback" or "pastErrors" API on the VFS would make monitoring much easier maybe. Currently we rolled out SQLite/WASM with OPFS-sahpool to around 8k users, and got the following errors:

  • SQLITE_IOERR: sqlite3 result code 10: disk I/O error (10 users)
  • SQLITE_CORRUPT: sqlite3 result code 11: database disk image is malformed (5 users)
  • SQLITE_CANTOPEN: sqlite3 result code 14: unable to open database file (1 user)
  • SQLITE_CORRUPT_INDEX: sqlite3 result code 779: database disk image is malformed (1 user)

While the number of errors is low (we are around 0.2% of users), it usually means that the user cannot use the web app anymore and lost some data potentially, which is very bad for our web app. Some of these errors (namely the corrupted ones) are recurrent (aren't solved by refreshing the page) and the user doesn't seem to get over it. We don't have many options to fix these from our side, but if we could augment the monitoring to be sure of what is going on exactly, we'd be happy to do so. WDYT?

(51) By Stephan Beal (stephan) on 2023-10-02 10:20:42 in reply to 50 [link] [source]

In our use case we also do some additional stuff after the file is downloaded and before replacing it in-flight

Then i'm torn on a solution. Exposing any storage- or VFS-specific filesystem functionality is troubling because it places a long-term maintenance and compatibility burden on us. Doing so for a still-moving-target like OPFS makes that stomach ache worse.

Let's first try the up-thread proposal, as it's client-agnostic and does not expose any such features directly. If that mechanism turns out to be too cumbersome to use then we can look into other options, but my concern right now is that this API needs to stay client-agnostic and it's currently feeling like we are very close to the border of features which are specific to your application.

Sidebar: i say this only half-jokingly, but also not joking at all, as something to keep in mind as an eventuality: one of the quirky advantages of this particular VFS is that it can be installed multiple times. That means, for example, that if a client (cough you cough) were to fork it, modify it for their specific needs, and deploy it alongside a canonical copy of the library, it would not interfere with the main library. i'd be happy to show you how to plug that into a custom build of the library (it would currently require a custom build because the VFS uses internals which are unplugged from the library after it initializes).

I wonder whether it would be possible to add a way to get the errors generated within the VFS.

That's actually what happens when the VFS reports an error. The VFS returns a non-zero code to the library and the library then has the option of fetching the associated message via the VFS's xGetLastError() method. That API, however, is limited to a string, so the VFS only propagates the Error.message value, as opposed to the stack trace and whatever other info there is.

Perhaps we could/should report full exception info out-of-band by firing them off to async event listeners.

While the number of errors is low (we are around 0.2% of users), it usually means that the user cannot use the web app anymore and lost some data potentially, which is very bad for our web app.

Though i'm entirely sympathetic to that, it's important to in mind that browser-side storage is, all wishful thinking and marketing statements to the contrary, demonstrably not as reliable as OS-level filesystem APIs1 and not long-term stable. That's unpleasant, certainly, but it's what we have.

In your app's case my primary suspects include virus scanners and use of OPFS on not-quite-suitable storage (e.g. SD cards or network-mounted corporate workstation storage), but that's of course wild speculation. If the error percentage were, say, 3% or 5%, a VFS bug would sound a lot more likely.

If the databases are not huge, it may be worth switching to a transient DB for users who repeatedly have it fail on their systems.

My TODOs from this thread:

  • Get back to you about the renaming and error reporting.

  • Look into creating an application which does nothing more than simulate VFS-like workloads on OPFS but without the VFS in the middle. This could be used to help rule out (or rule in!) environment-specific issues by demonstrating whether or not they can be reproduced without the VFS.

However i will be off of the computer much of this week so it may be the end of the week before i can get back to you on the first point and have no current estimate for the second.


  1. ^ e.g. I/O rates which exceed the garbage collector's ability to keep up can lead to browser crashes due to OOM.

(52) By franztesca on 2023-10-02 14:33:20 in reply to 51 [link] [source]

Then i'm torn on a solution. Exposing any storage- or VFS-specific filesystem functionality is troubling because it places a long-term maintenance and compatibility burden on us. Doing so for a still-moving-target like OPFS makes that stomach ache worse.

I see the pain, at the same time each VFS is quite unique and independent, having a VFS-specific API (rename) that makes the VFS "complete" and users able to work with it, seems fine. IDK what is your expectation in terms of additional changes on this particular VFS, but the rename API would live/die with the VFS. Hopefully one day this VFS will be deprecated in favour of a non-workaround VFS (even though I appreciate the fanciness and lateral thinking required for this solution :D). Your call obviously, just throwing 2 cents, we can probably work with any solution.

That means, for example, that if a client (cough you cough) were to fork it, modify it for their specific needs, and deploy it alongside a canonical copy of the library, it would not interfere with the main library. i'd be happy to show you how to plug that into a custom build of the library (it would currently require a custom build because the VFS uses internals which are unplugged from the library after it initializes).

Makes sense. We were considering forking (with the additional related costs) as last choice, but we ended up already patching the library to backport the importDbChunked from 3.44. I'll talk with the team.

That's actually what happens when the VFS reports an error. The VFS returns a non-zero code to the library and the library then has the option of fetching the associated message via the VFS's xGetLastError() method. That API, however, is limited to a string, so the VFS only propagates the Error.message value, as opposed to the stack trace and whatever other info there is.

Are we supposed to call xGetLastError ourselves on the VFS or it is a function meant to be used by SQLite? That would be already a great improvement.

Thanks a lot for the continue support and inputs :D

(53) By Stephan Beal (stephan) on 2023-10-02 15:21:07 in reply to 52 [link] [source]

Are we supposed to call xGetLastError ourselves on the VFS or it is a function meant to be used by SQLite? That would be already a great improvement.

My (mis?)understanding has always been that that's the message which clients will get via sqlite3_errmsg(), and internal docs suggest that that's the case:

...
** Not supplying an error message will have no adverse effect
** on SQLite. It is fine to have an implementation that never
** returns an error message:
**
**   int xGetLastError(sqlite3_vfs *pVfs, int nBuf, char *zBuf){
**     assert(zBuf[0]=='\0');
**     return 0;
**   }
**
** However if an error message is supplied, it will be incorporated
** by sqlite into the error message available to the user using
** sqlite3_errmsg(), possibly making IO errors easier to debug.

but a cursory look through the library sources suggest that that's only actually happening for the Windows-default VFS and the in-memory VFS.

Aside from that, after having looked through that code, i'm rather certain that the return semantics assumed by the VFS are not what the internals are looking for (which the public docs do not specify!). That will require a closer look and likely a bug fix Real Soon Now.1

Anyway...

My current thinking is to emit all errors via async event listeners, as that would (A) be relatively unobtrusive in the library, (B) would not block further I/O like an error-handling/reporting callback would, (C) would allow us to propagate all known error state instead of just a message, and (D) and would allow clients to collect or ignore such errors as they wish.

My RSI is acting up so i won't get around to this for at least a few days but it is very near the top of my TODO list, along with a first draft of using a temp name during db import.


  1. ^ It should apparently return the last known error code instead of 0 on success, but that's not what it does: it returns the error message string via the output pointer and returns integer 0 to say that it was successful in doing so.

(54) By franztesca on 2023-10-05 10:57:12 in reply to 53 [link] [source]

In the importDbChunked function, any exception is swallowed here and not re-thrown, meaning that if the import of the db fails we don't know the cause, the path is still associated to the sah, and the content of the file is partially imported. We are getting a bunch of

SQLITE_NOTADB: sqlite3 result code 26: file is not a database

errors since we started using that function, and we suspect that for some reason the importing is failing and we don't see that. The correct behavior seems to rethrow the exception at the end of te catch block, we will try to patch that way. Do you agree?

Thanks!

(55) By Stephan Beal (stephan) on 2023-10-05 11:04:59 in reply to 54 [link] [source]

In the importDbChunked function, any exception is swallowed here and not re-thrown

Indeed, that's horrible bug :/. It's patched now in the trunk:

Index: ext/wasm/api/sqlite3-vfs-opfs-sahpool.c-pp.js
==================================================================
--- ext/wasm/api/sqlite3-vfs-opfs-sahpool.c-pp.js
+++ ext/wasm/api/sqlite3-vfs-opfs-sahpool.c-pp.js
@@ -898,10 +898,11 @@
         sah.write(new Uint8Array([1,1]), {
           at: HEADER_OFFSET_DATA + 18
         }/*force db out of WAL mode*/);
       }catch(e){
         this.setAssociatedPath(sah, '', 0);
+        throw e;
       }
       this.setAssociatedPath(sah, name, capi.SQLITE_OPEN_MAIN_DB);
       return nWrote;
     }

(56) By Randy (randyl) on 2023-10-20 02:31:20 in reply to 50 [link] [source]

franztesca,

Do you have any update on the corruption issues you've seen? I have a Chrome extension using SQLite Wasm (but it uses the opfs vfs, not opfs-sahpool), and I'm experiencing the same thing you've reported: a small percentage of users get a corrupted database, and it keeps recurring even after reinstalling the extension.

Thanks

(57) By Stephan Beal (stephan) on 2023-10-22 10:35:22 in reply to 51 [link] [source]

Then i'm torn on a solution. Exposing any storage- or VFS-specific filesystem functionality is troubling because it places a long-term maintenance and compatibility burden on us.

Good afternoon, franztesca,

i was finally able to sit down and look at the renaming-during-import feature. It introduces new error cases which would exist only to solve a self-induced client-level problem1, as opposed to library-level problem, which makes it very difficult to justify.

For example, renaming/moving file (A) over file (B) has deal with the case that (B) is already in use. If (B) is in active use, the library cannot do anything with it and the imported data will be lost. (We have to delay the check for this error case until the end of the import for your use case to work.) If (B) is not currently opened by a db handle but is in the filesystem, clearing its name and re-assigning its name to the newly-imported (A) handle has a race condition which could leave it susceptible to losing (B) without having the chance to rename (A) to (B), i.e. complete data loss unless we add some sort of journal support to deal with cases like that. There may well be other potential error cases there which i simply have not had enough coffee to recognize.

These problems don't exist in the current, simpler implementation and introducing them seems to be trading client-side difficulties for library-side difficulties.

Until/unless we can find ways around this type of issue, this isn't an as-yet-too-flaky capability we'd want in the library.

That said: since you are familiar with the code in question, if you can put together an implementation which does not expose new error cases, i am perfectly happy to revisit this.


  1. ^ Specifically, the ability to keep using the db while it's being imported. Considering that that's not a feature directly supported by any layer of the sqlite API, it seems to fall squarely into the category of client-level capability.

(58) By Stephan Beal (stephan) on 2023-10-22 10:47:36 in reply to 48.1 [link] [source]

On another note, we got already two users (out of 2k) who started to get SQLITE_CORRUPT: ...

A thought came to mind about this this morning: it is easy to corrupt WASM heap memory, even more-so than in C because C has many compile-time warnings to assist developers in that regard. If some piece of JS code has corrupted the WASM heap, the side effects can range from "nothing at all" to "anything at all," including misbehavior at any level of sqlite. Unfortunately, it can be next to impossible to track down the source of WASM heap corruption.

If (noting that this is a "big if") the db corruption is a side-effect of heap corruption, perhaps (this is just speculation) those clients are using some feature of the db the others are not and are triggering a WASM memory misuse bug, either in your app or this library.

Is that perhaps a possibility (however remote) in your case?