SQLite Forum

Benchmarking - slow selects on empty table

(1) By anonymous on 2023-06-25 15:41:40 [link] [source]

Hello!

I am doing a JavaScript-based side project and have implemented the database with SQLite WASM.

I created an ActiveRecord-like ORM which messages the db worker, which in turn uses the OO API. OPFS is working.

So it's functional, but now I'm trying to maximize performance. It seems like individual queries are quite slow, so I added a bulk insert function to the db worker which makes use of db.transaction, and am quite happy with the performance (20,000/13,000 inserts per second for a small table in Firefox/Chrome).
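
For reference, the bulk insert path looks roughly like this (a minimal sketch against the OO1 API; the table, columns, and row shape are made up):

// Minimal sketch of a bulk insert inside a single transaction via the OO1 API.
// Table/column names and the row shape are placeholders.
function bulkInsert(db, rows) {
  db.transaction(function () {
    const stmt = db.prepare("INSERT INTO items(name, qty) VALUES (?, ?)");
    try {
      for (const row of rows) {
        stmt.bind([row.name, row.qty]);
        stmt.step();
        stmt.reset();
      }
    } finally {
      stmt.finalize();
    }
  });
}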

What remains is select performance, which doesn't seem to get much faster than 10ms/6ms in Firefox/Chrome for a query on an empty table. I added the profiling directly around the call to db.selectObjects, so it's nothing to do with the ORM/message passing.
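
The measurement itself is essentially this (a sketch; the table name is a placeholder):

// Timing taken immediately around the OO1 call, bypassing the ORM and the
// worker message passing.
const t0 = performance.now();
const rows = db.selectObjects("SELECT * FROM items");
console.log(`selectObjects took ${(performance.now() - t0).toFixed(1)} ms (${rows.length} rows)`);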

I've seen benchmarks floating around that show query times like this for 25000 record table full scans on non-web based SQLite, so this seems a bit slow.

Is this normal behavior or have I done something wrong?

(2) By Stephan Beal (stephan) on 2023-06-25 16:04:56 in reply to 1 [link] [source]

Is this normal behavior or have I done something wrong?

OPFS is a black-box API which will have different performance on different browsers in different environments. Much (approximately 30-35%, based on basic testing) of the OPFS driver's runtime is us waiting at JS's cross-thread communication boundaries, and there's nothing we can do to speed that up. The OPFS designers chose to give it an asynchronous API, which requires all sorts of hoop-jumping to get it working with fully-synchronous APIs like sqlite, and those hoops introduce unfortunate performance penalties.
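
For the curious, the general shape of that hoop-jumping is a sync-over-async handshake along these lines (an illustrative sketch only, not the library's actual code):

// Illustrative sketch of the sync-over-async pattern: the synchronous side
// blocks on a SharedArrayBuffer (shared with the async worker ahead of time)
// while that worker performs the asynchronous OPFS operation.
// Note that Atomics.wait() may only block inside a Worker, not the main thread.
const flag = new Int32Array(new SharedArrayBuffer(4));

// Synchronous side (e.g. inside an xRead() implementation):
function syncOp(asyncWorker, msg) {
  Atomics.store(flag, 0, 0);
  asyncWorker.postMessage(msg); // hand the I/O request to the async worker
  Atomics.wait(flag, 0, 0);     // block this thread until the worker signals
}

// Async side, once the OPFS promise resolves:
function signalDone() {
  Atomics.store(flag, 0, 1);
  Atomics.notify(flag, 0);
}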

I've seen benchmarks floating around that show query times like this for 25000 record table full scans on non-web based SQLite, so this seems a bit slow.

Random benchmarks floating around the internet mean absolutely nothing for purposes of comparison. There are high-end machines which can reportedly do 100k inserts a second and there are mobile or embedded devices which might, depending on several factors, struggle to do 15 per second. Similarly, SELECT performance varies wildly depending on the environment.

During the first 6 months of the wasm development we concentrated heavily on performance and got it as close to native speed as we apparently could. All such measurements, however, are highly environment-specific, and it's to be expected that measurements will be all over the place for different environments.

If you're expecting native-app speeds in the browser, you're going to be disappointed. That's unachievable because each layer of abstraction or transformation adds its own penalties.

(3) By anonymous on 2023-06-25 22:33:01 in reply to 2 [link] [source]

Hi Stephan

Thanks for your reply! So it sounds like the issue is to do with web browser implementation (mostly OPFS?). I don't really have any connections in the web browser community, is this something you think will improve in the future?

I have done another prototype with indexeddb and it doesn't have quite as good bulk insert performance, however the single queries have very low latency so hopefully that means OPFS has some room to improve in the future.

For others that come to the thread looking for a comparison, I'm running this on a decent desktop with SSD, hence the good bulk insert performance.

(4) By Stephan Beal (stephan) on 2023-06-25 23:46:34 in reply to 3 [link] [source]

So it sounds like the issue is to do with web browser implementation (mostly OPFS?).

Trying the same tests with an in-memory db will give you a baseline for hypothetical "fastest possible" results. Adding a storage layer will always introduce latency, and the async nature of OPFS imposes its own hit because we have to wait on cross-thread delays inherent in JS engines.
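
For example (an illustrative sketch using the OO1 API; filenames and the table are placeholders, and OpfsDb only exists where the "opfs" VFS could be installed):

// Time the same SELECT against an in-memory db and an OPFS-backed db.
function timeEmptySelect(db, label) {
  db.exec("CREATE TABLE IF NOT EXISTS t(a)"); // empty table, as in the report
  const t0 = performance.now();
  db.selectObjects("SELECT * FROM t");
  console.log(label, (performance.now() - t0).toFixed(2), "ms");
  db.close();
}
timeEmptySelect(new sqlite3.oo1.DB(":memory:"), "memdb");
timeEmptySelect(new sqlite3.oo1.OpfsDb("/bench.sqlite3"), "opfs");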

During development we regularly saw it perform at approximately 3x native speeds, using a wasm build of this project's own native benchmarking tool as our basis for comparison:

https://wasm-testing.sqlite.org/speedtest1-worker.html?vfs=opfs&size=25

The caveat being that the queries in that benchmark are prepared in native code, so will be faster than preparation in JS, but the OPFS I/O goes through the same JS code as any other JS client app. That tool's speeds are well faster than the 6-10ms per SELECT you're reporting, which seems to imply that the OPFS storage, and the communication channel between it and sqlite, is not the bottleneck.

Sidebar: we have seen significant benchmarking penalties when feeding multi-MB SQL strings to wasm from JS, for reasons known only to the browsers' internals.

I don't really have any connections in the web browser community, is this something you think will improve in the future?

i don't dare speculate about what browser developers are currently doing or plan to do :/.

(5) By anonymous on 2023-06-26 10:57:20 in reply to 4 [link] [source]

Oh that link is perfect, thanks! My results:

Registered VFSes: unix-none opfs memdb unix-excl unix-dotfile unix

Running speedtest1. UI controls will be disabled until it completes.

Running speedtest1 with argv = speedtest1.wasm --big-transactions --size 25 --vfs opfs /speedtest1.sqlite3
-- Speedtest1 for SQLite 3.42.1 2023-05-18 23:28:33 0a0b7a2d3178f1aa650acd1d7295
100 - 12500 INSERTs into table with no index...................... 0.107s
110 - 12500 ordered INSERTS with one index/PK..................... 0.156s
120 - 12500 unordered INSERTS with one index/PK................... 0.149s
130 - 25 SELECTS, numeric BETWEEN, unindexed...................... 0.030s
140 - 10 SELECTS, LIKE, unindexed................................. 0.048s
142 - 10 SELECTS w/ORDER BY, unindexed............................ 0.059s
145 - 10 SELECTS w/ORDER BY and LIMIT, unindexed.................. 0.042s
150 - CREATE INDEX five times..................................... 0.177s
160 - 2500 SELECTS, numeric BETWEEN, indexed...................... 0.024s
161 - 2500 SELECTS, numeric BETWEEN, PK........................... 0.023s
170 - 2500 SELECTS, text BETWEEN, indexed......................... 0.034s
180 - 12500 INSERTS with three indexes............................ 0.167s
190 - DELETE and REFILL one table................................. 0.666s
200 - VACUUM...................................................... 29.846s
210 - ALTER TABLE ADD COLUMN, and query........................... 0.113s
230 - 2500 UPDATES, numeric BETWEEN, indexed...................... 2.898s
240 - 12500 UPDATES of individual rows............................ 4.129s
250 - One big UPDATE of the whole 12500-row table................. 4.291s
260 - Query added column after filling............................ 0.012s
270 - 2500 DELETEs, numeric BETWEEN, indexed...................... 8.288s
280 - 12500 DELETEs of individual rows............................ 6.780s
290 - Refill two 12500-row tables using REPLACE................... 17.942s
300 - Refill a 12500-row table using (b&1)==(a&1)................. 6.009s
310 - 2500 four-ways joins........................................ 0.061s
320 - subquery in result set...................................... 0.080s
400 - 17500 REPLACE ops on an IPK................................. 0.232s
410 - 17500 SELECTS on an IPK..................................... 0.030s
500 - 17500 REPLACE on TEXT PK.................................... 0.227s
510 - 17500 SELECTS on a TEXT PK.................................. 0.041s
520 - 17500 SELECT DISTINCT....................................... 0.056s
980 - PRAGMA integrity_check...................................... 0.088s
990 - ANALYZE..................................................... 0.194s
   TOTAL.......................................................   82.999s

Will continue to experiment. Thanks again.

(6) By Stephan Beal (stephan) on 2023-06-26 11:08:05 in reply to 5 [link] [source]

My results:

At a brief glance, that all looks more or less normal to me. Why vacuum takes so long is unclear - it didn't always but it certainly does now on my systems (a decade-old laptop and a low/mid-end Android phone). i'll take a closer look at that as time allows and try to determine whether we can speed that up or whether it's an OPFS bottleneck. (Ah, we recently had a locking-related fix which might account for that...)

Note that by changing the "size=N" parameter you can change the size of the workload. 100 is a "full" workload, i.e. the amount of work the native version of that app does by default. The link i sent uses 25, which is 25% of the "full" workload and represents what i personally believe to be a relatively heavy workload for web apps. Experiment with that number to increase or decrease the workload size. Run-times will increase significantly for any size=N over roughly 50. A value of 5 or 10 will run in no time flat... except for that pesky vacuum which used to run quickly.

Sidebar: the wasm-testing site gets updated whenever any significant changes are made to the JS/wasm bits.

(7.2) By mlaw (tantaman) on 2023-07-05 18:20:17 edited from 7.1 in reply to 2 [link] [source]

...driver's runtime is us waiting at JS's cross-thread communication boundaries, and there's nothing we can do to speed that up

Even if you use Roy's workaround which allocates all needed handles up front? Since, iirc, the only OPFS API that does not have a synchronous equivalent is the one used for opening the file.

(8) By Stephan Beal (stephan) on 2023-07-05 18:43:54 in reply to 7.2 [source]

Even if you use Roy's workaround which allocates all needed handles up front?

That's not a viable library-level workaround, but might be useful for specific applications. From the library we can't sensibly predefine a fixed list of hard-coded filenames for clients, nor maintain a fixed-length map of client-defined filenames to well-defined internal names. That falls into the category of "unsightly workaround."

Since, iirc, the only OPFS API that does not have a synchronous equivalent is the one used for opening the file.

That might be the case now (i'm not certain), but we're stuck supporting year-old browsers where the close() op is async.

Unless i'm mistaken, which is certainly possible, obtaining a sync handle is also asynchronous and we have to be able to obtain and release those handles frequently in order to support any level of concurrency. Until tomorrow i'm limited to a phone for internet so cannot easily go dig through the code and may be misremembering, but my (mis?)recollection is that relinquishing a sync handle is also an async operation (or was in earlier OPFS versions which we are now stuck supporting).

All of that being said: there is nothing, other than development effort and maintenance burden, which prevents us from having multiple OPFS-backed VFSes which take different strategies and/or support different versions of the OPFS API. i.e. we're not stuck with only the current implementation forever.

(9) By Roy Hashimoto (rhashimoto) on 2023-07-05 21:03:50 in reply to 8 [link] [source]

From the library we can't sensibly predefine a fixed list of hard-coded filenames for clients, nor maintain a fixed-length map of client-defined filenames to well-defined internal names. That falls into the category of "unsightly workaround."

For the record, my "unsightly workaround" has no "hard-coded filenames for clients" or "well-defined internal names".

we're stuck supporting year-old browsers where the close() op is async.

Your requirements are your requirements, and don't need to make sense to me or anyone else. But I think your position really means continuing support for Chromium browsers 102 to 107. My understanding is you still don't work on Safari which now has synchronous methods, and when Firefox added OPFS support it came with synchronous methods from the start. Chromium began access handle support at 102 and synchronous methods arrived in 108 (less than 6 months later) so that's the only window of concern.

By my reading of browser market share data, Chrome and Edge 102-107 together have a browser share of 0.59%. That seems small enough to direct effort somewhere else.

(10.1) By Stephan Beal (stephan) on 2023-07-05 23:50:19 edited from 10.0 in reply to 9 [link] [source]

For the record, my "unsightly workaround" has no "hard-coded filenames for clients" or "well-defined internal names".

That reference was regarding one of your articles on the topic of pre-allocating files behind (IIRC) a virtual filesystem, and it's very likely that i was not referring to the intended article/workaround. My apologies for any inadvertent offense.

Edit: i had the first comment on this one in mind.

But I think your position really means continuing support for Chromium browsers 102 to 107.

Those were the versions we developed against and the library hasn't undergone significant development since then because of Real Life logistics which are just now finally being resolved. In the foreseeable future (possibly as little as a month) i'll once again have a proper workstation and can revisit the current state of OPFS in current browser versions.

My understanding is you still don't work on Safari which now has synchronous methods,

Safari's issue is reportedly one of sub-workers misbehaving and it's reportedly resolved for their next release. That's admittedly all hearsay, though.

By my reading of browser market share data, Chrome and Edge 102-107 together have a browser share of 0.59%. That seems small enough to direct effort somewhere else.

And though i fundamentally agree, this project has a long history of strong backwards compatibility guarantees, so any such breakage has to be weighed against that. The more likely approach for us would be an alternative implementation with a different VFS name and supporting the OPFS APIs as they are currently being deployed.

(11) By Roy Hashimoto (rhashimoto) on 2023-07-06 17:25:05 in reply to 10.1 [link] [source]

That reference was regarding one of your articles on the topic of pre-allocating files behind (IIRC) a virtual filesystem, and it's very likely that i was not referring to the intended article/workaround. My apologies for any inadvertent offense.

Edit: i had the first comment on this one in mind.

That is in fact the "unsightly workaround" under discussion, so your reference was not mistaken:

I have an idea for a less general but much simpler implementation (that would still be synchronous to SQLite). The idea is to create a pool of OPFS access handles that always remain open and can be reused as needed. This caps the total number of files in the file system (not just the number of open files) to the number of access handles, which can be configured by the user. This should work for most SQLite applications because while you can create/open an arbitrarily large number of database files in SQLite (e.g. with 1,000 ATTACH statements), users typically don't (edit: actually by default the limit is 10).

Neither the description nor the implementation specifies hard-coded names or mappings, whether above or below the filesystem abstraction.
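
For illustration, the pre-allocation boils down to something like this (a sketch only, not my actual implementation; the pool size and names are arbitrary):

// Run inside a Worker: every access handle is acquired once, up front, via the
// only unavoidably-asynchronous steps, and then reused synchronously.
const pool = [];

async function initPool(poolSize = 6) {
  const root = await navigator.storage.getDirectory();
  const dir = await root.getDirectoryHandle(".pool", { create: true });
  for (let i = 0; i < poolSize; ++i) {
    // Backing files get random OPFS names; the SQLite-visible path is stored
    // in a small header inside each file, not in the OPFS name itself.
    const name = Math.random().toString(36).slice(2);
    const fh = await dir.getFileHandle(name, { create: true });
    pool.push(await fh.createSyncAccessHandle()); // stays open, reused as needed
  }
}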

(12) By Stephan Beal (stephan) on 2023-07-06 18:10:27 in reply to 11 [link] [source]

Neither the description nor the implementation specifies hard-coded names or mappings, ...

"Hard-coded" name mappings was a mischaracterization on my part of this snippet:

All the file system files belonging to the VFS will go into one OPFS directory - remember, we're not exposing the OPFS directory structure to SQLite. Each OPFS file will have a randomly generated name, and the number of these files will be the cap. The cap will be increased by adding more OPFS files and decreased by removing unassigned OPFS files.

SQLite doesn't do anything with directories so file paths can essentially be treated as keys. Any file that SQLite creates will be uniquely associated with one of the OPFS files. That association is made by writing the path at the beginning of the OPFS file (this header will also contain some other fields described later). After the header will be all the data SQLite reads and writes.

and the hard-coded cap was an outright misrecollection on my part.

With the caveat that it's possible that i am sorely misinterpreting something, the part which i characterize as "unsightly" is that the files which live in OPFS do not have the names the client code gives them. If a user creates foo/bar.db1 using the equivalent of sqlite3_open() then it's important to me that a db file with that name be found in OPFS. Similarly, if a user uploads a valid db as abc.db to OPFS using either the OPFS API or a browser extension which lets them manipulate it, it's important to me that the VFS be able to read that file even though the sqlite API didn't put it there.

It was not, and is never, my intent to cause offense by disagreeing with any given code or implementation strategy, and you have my apologies if i've done so. Likewise, you have my apologies for the mislabeling/mischaracterization caused by my having worked from age-old recollection rather than refreshing my memory first.


  1. ^ though the C API does not do so, our OPFS VFS implementation creates directory parts on demand because it seems more webdev-friendly to do that, as opposed to requiring the client to deal directly with the OPFS API themselves to create directories.

(13) By Roy Hashimoto (rhashimoto) on 2023-07-07 14:28:15 in reply to 12 [link] [source]

It was not, and is never, my intent to cause offense by disagreeing with any given code or implementation strategy, and you have my apologies if i've done so.

Reasoned disagreement or criticism is not offensive...unless the reasoning is based on falsehoods. Which in isolation could be an honest mistake...but seems to be a recurring pattern.

With the caveat that it's possible that i am sorely misinterpreting something, the part which i characterize as "unsightly" is that the files which live in OPFS do not have the names the client code gives them.

This is a valid negative aspect of my OPFS approach, along with having to handle concurrency at the application level. I still think it will often (and perhaps usually) be worth the tradeoff for higher performance (which was your stated focus) and easier deployment (no COOP/COEP).

(14) By Stephan Beal (stephan) on 2023-07-07 16:21:46 in reply to 13 [link] [source]

Reasoned disagreement or criticism is not offensive...unless the reasoning is based on falsehoods. Which in isolation could be an honest mistake...but seems to be a recurring pattern.

i'll admit to having had a genuinely embarrassing number of misunderstandings and misrecollections regarding OPFS in general, in the contexts of your works and unrelated info/docs, which have repeatedly led to my posting what may fairly be called falsehoods (albeit never intentional ones). i recognize that and commit to making a better effort of understanding the topics in detail before spouting what amounts to misinformation.

... along with having to handle concurrency at the application level. I still think it will often (and perhaps usually) be worth the tradeoff for higher performance (which was your stated focus) and easier deployment (no COOP/COEP).

As my currently-underway relocation winds down i will soon be able to return to exploring further OPFS implementation options such as that one, WebLocks, and JSPI (JS Promise Integration (with WASM)). The lack of COOP/COEP is a particularly compelling advantage i'd not thoroughly considered.

(15.1) By Roy Hashimoto (rhashimoto) on 2023-07-07 23:48:49 edited from 15.0 in reply to 14 [link] [source]

The lack of COOP/COEP is a particularly compelling advantage i'd not thoroughly considered.

You could also get Safari compatibility back to 16.4+ (March 2023) because you wouldn't have to spawn a Worker from a Worker (I'm assuming the spawning is only done for OPFS). It sounds like the current version won't work until Safari 17 (Fall 2023). That would be notable not so much because of the 6+ month difference, but because the step from 16 to 17 means some older Macs and iPhones will be left behind. I believe that you personally (organizationally?) are somewhat indifferent to Safari support, but I think that a fair number of web developers will care.

(16) By Stephan Beal (stephan) on 2023-07-08 06:42:09 in reply to 15.1 [link] [source]

You could also get Safari compatibility back to 16.4+ (March 2023) because you wouldn't have to spawn a Worker from a Worker (I'm assuming the spawning is only done for OPFS).

Correct - only to "de-async" the OPFS/VFS communication.

It sounds like the current version won't work until Safari 17 (Fall 2023).

That's my understanding, based solely on second-hand reports.

i will definitely explore your approach as soon as my move-in is complete, but that may be another month :(. The advantages of not requiring COOP/COEP are finally settling in and there's no reason we can't offer multiple VFS implementations.

I believe that you personally (organizationally?) are somewhat indifferent to Safari support

It's organizational only in the sense that it applies to the entire single-person JS/WASM sub-team ;). An uncommonly severe case of chronic RSI forced me into perpetual medical leave/early retirement in 2016 and requires me to be particularly choosy about where my on-keyboard time goes. Any development targets/platforms not used in this household quite simply fall out of that scope. Similarly, actively following browser-specific developments and bleeding-edge/not-yet-standard web features are almost invariably left to those with the bandwidth and physical capacity to keep up with them, OPFS being the notable exception.

As this sub-project's membership evolves, so will, undoubtedly, the focus on particular target platforms :).

(18.1) By Stephan Beal (stephan) on 2023-07-13 19:34:22 edited from 18.0 in reply to 15.1 [link] [source]

The lack of COOP/COEP is a particularly compelling advantage i'd not thoroughly considered.

You could also get Safari compatibility back to 16.4+ (March 2023) because you wouldn't have to spawn a Worker from a Worker

@Roy this keeps coming back to itch me. i've been studying your pool VFS today:

https://github.com/rhashimoto/wa-sqlite/blob/master/src/examples/AccessHandlePoolVFS.js

and am very interested in re-implementing it for use in our infrastructure. What i'm not interested in, though, is any licensing headaches for following your implementation too closely ;). If you're explicitly okay with me doing a more or less 1-to-1 port to our infrastructure1, noting that the resulting code would be Public Domain, that's something i'd be very interested in tackling over the next couple of weeks.


  1. ^ Obviously with credit where it is due!

(19) By Roy Hashimoto (rhashimoto) on 2023-07-13 21:14:04 in reply to 18.1 [link] [source]

If you're explicitly okay with me doing a more or less 1-to-1 port to our infrastructure, noting that the resulting code would be Public Domain, that's something i'd be very interested in tackling over the next couple of weeks.

My project is built on SQLite, so I'm happy to allow SQLite to build on this part of my work and release the result into the public domain.

(20) By Stephan Beal (stephan) on 2023-07-15 11:31:49 in reply to 19 [link] [source]

My project is built on SQLite, so I'm happy to allow SQLite to build on this part of my work and release the result into the public domain.

Good afternoon, Roy,

Introducing the opfs-sahpool VFS:

https://wasm-testing.sqlite.org/speedtest1-worker.html?vfs=opfs-sahpool&size=15

It completely lacks concurrency, but it's fast! Change the size=X param to modify the workload size. A value of 100 is the default for the native CLI speedtest1 app but values of 15-25 (percent) seem to me like reasonably high workloads for web apps.

Can you confirm or deny whether that works on Safari 16.4+? It works here in Chrome-likes and Firefox v115.

This isn't yet in trunk but will likely be within the next day or two.

The code is at:

src:/finfo?name=ext/wasm/api/sqlite3-vfs-opfs-sahpool.js

(tap the left-most hash of the top-most entry for the current version)

Aside from stylistic changes related to the different framework, it's a 1-to-1 port of your VFS and implementation was as painless as can be thanks to you having already done the hard parts :).

There's still some minor refinement to do (e.g. a way to wipe the VFS clean, as well as helpers to import/export DBs) and the plan is to get that done before the 3.43 release.

Example output from Firefox on my decade-old laptop (2.4GHz):

Running speedtest1 with argv = speedtest1.wasm --big-transactions --memdb --singlethread --size 25 --vfs opfs-sahpool /speedtest1.sqlite3
-- Speedtest1 for SQLite 3.43.0 2023-07-13 14:49:39 ce6793e954f291b6f5c29175baf7
100 - 12500 INSERTs into table with no index...................... 0.034s
110 - 12500 ordered INSERTS with one index/PK..................... 0.047s
120 - 12500 unordered INSERTS with one index/PK................... 0.074s
130 - 25 SELECTS, numeric BETWEEN, unindexed...................... 0.045s
140 - 10 SELECTS, LIKE, unindexed................................. 0.093s
142 - 10 SELECTS w/ORDER BY, unindexed............................ 0.125s
145 - 10 SELECTS w/ORDER BY and LIMIT, unindexed.................. 0.077s
150 - CREATE INDEX five times..................................... 0.080s
160 - 2500 SELECTS, numeric BETWEEN, indexed...................... 0.031s
161 - 2500 SELECTS, numeric BETWEEN, PK........................... 0.030s
170 - 2500 SELECTS, text BETWEEN, indexed......................... 0.068s
180 - 12500 INSERTS with three indexes............................ 0.099s
190 - DELETE and REFILL one table................................. 0.092s
200 - VACUUM...................................................... 0.068s
210 - ALTER TABLE ADD COLUMN, and query........................... 0.004s
230 - 2500 UPDATES, numeric BETWEEN, indexed...................... 0.032s
240 - 12500 UPDATES of individual rows............................ 0.041s
250 - One big UPDATE of the whole 12500-row table................. 0.015s
260 - Query added column after filling............................ 0.002s
270 - 2500 DELETEs, numeric BETWEEN, indexed...................... 0.073s
280 - 12500 DELETEs of individual rows............................ 0.057s
290 - Refill two 12500-row tables using REPLACE................... 0.224s
300 - Refill a 12500-row table using (b&1)==(a&1)................. 0.105s
310 - 2500 four-ways joins........................................ 0.110s
320 - subquery in result set...................................... 0.162s
400 - 17500 REPLACE ops on an IPK................................. 0.076s
410 - 17500 SELECTS on an IPK..................................... 0.024s
500 - 17500 REPLACE on TEXT PK.................................... 0.100s
510 - 17500 SELECTS on a TEXT PK.................................. 0.058s
520 - 17500 SELECT DISTINCT....................................... 0.056s
980 - PRAGMA integrity_check...................................... 0.161s
990 - ANALYZE..................................................... 0.027s
TOTAL....................................................... 2.290s

Chrome v116 is inconsistently slightly faster or slower than that.

(21) By Roy Hashimoto (rhashimoto) on 2023-07-15 14:15:39 in reply to 20 [link] [source]

Introducing the opfs-sahpool VFS:

That's great! I'm happy to hear you got it working so quickly and that the performance looks good.

Can you confirm or deny whether that works on Safari 16.4+? It works here in Chrome-likes and Firefox v115.

I have Safari 16.5.1 here. I can get it to run successfully, but it is a little testy on startup. Sometimes when I load the page I will get this in the console:

[Warning] Ignoring inability to install opfs-sahpool sqlite3_vfs: – "The object is in an invalid state." (speedtest1.js, line 17568)
InvalidStateError: The object is in an invalid state.
(anonymous function)

And of course running the benchmarks then fails because there is no VFS. I see that the original VFS is failing to load (as expected I presume). Is it possible that both old and new OPFS VFS are trying to load concurrently and there's a race condition? I'll try to poke around a bit later.

This isn't yet in trunk but will likely be within the next day or two.

I would encourage you to try some of the filesystem transparency ideas before you release. If you can at least move the VFS metadata into a separate file and put all the current files into a "hidden" subdirectory, then you can add transparency as a new feature later to this VFS instead of creating a completely new VFS (because the file structure will be the same). That means that your early adopter web developers won't have to migrate database files, and you won't have yet another VFS to wrangle.

(22) By Stephan Beal (stephan) on 2023-07-15 14:25:11 in reply to 21 [link] [source]

I see that the original VFS is failing to load (as expected I presume). Is it possible that both old and new OPFS VFS are trying to load concurrently and there's a race condition?

They do load semi-concurrently but that "shouldn't" pose any problems because they don't access the same resources except the C-level VFS registration, but they won't ever run that at exactly the same time because JS doesn't preemptively multitask.

So far i have not yet seen that precise error in FF/Chrome but a similar one: when reloading the page, it's possible that the just-disposed tab has not yet given up its hold on the SAHs, so the VFS can fail to load because at least one of the SAHs is already in use. The workaround is to close the tab altogether, wait a few seconds, and try again :/.

Similarly, having 2+ tabs open to any of the pages in the test site will lead to the VFS not loading on any but the first tab. That's "as expected," however unfortunate.

Perhaps this VFS should be made explicitly opt-in, via its own init function, to avoid a situation where a page which isn't actually using the VFS (but has it loaded) locks out another page which would like to use the VFS but can't because the first page has locked the SAHs? That wouldn't be particularly difficult to do.

I would encourage you to try some of the filesystem transparency ideas before you release.

Definitely. This isn't a done deal but it seems to be at least at the "minimal viable product" stage.

(24) By Roy Hashimoto (rhashimoto) on 2023-07-15 16:06:36 in reply to 21 [link] [source]

I have Safari 16.5.1 here. I can get it to run successfully, but it is a little testy on startup. Sometimes when I load the page I will get this in the console:

InvalidStateError: The object is in an invalid state. (anonymous function)

I can get this to happen most often when I have the benchmark already up in a Safari tab, and then change the size in the URL bar and press return. It throws InvalidStateError on this line:

const dh = await navigator.storage.getDirectory();

I don't think this should ever reject, so this would be a Safari bug.

Suspiciously, the known bug where OPFS fails in a nested Worker throws the same exception on the same method. I wonder if when the original OPFS VFS fails on Safari, it somehow breaks OPFS for the whole tab. I don't suppose there is any magic URL setting to suppress attempted loading for the original OPFS VFS to see if that fixes it?

It would be interesting to see whether this goes away with the nested Worker bug fix with the Safari 17 preview, but my Mac isn't supported past 16.x.

(25) By Stephan Beal (stephan) on 2023-07-15 16:20:06 in reply to 24 [link] [source]

I don't suppose there is any magic URL setting to suppress attempted loading for the original OPFS VFS to see if that fixes it?

There is not but we can certainly add one. We already have other URL flags which influence that one, so the infrastructure is there. i'll get to that as soon as i get the breakage mentioned in my last post squared away.

(23) By Stephan Beal (stephan) on 2023-07-15 16:03:41 in reply to 20 [link] [source]

https://wasm-testing.sqlite.org/speedtest1-worker.html?vfs=opfs-sahpool&size=15

FACEPALM: that invocation is enabling the --memdb flag, and that flag trumps any --vfs flag, so of course it's fast. Removing that flag breaks it, so i obviously need to go do some reworking.

My sincere apologies for the confusion!

Sigh.

(26) By Stephan Beal (stephan) on 2023-07-15 19:22:58 in reply to 23 [link] [source]

FACEPALM: that invocation is enabling the --memdb flag, and that flag trumps any --vfs flag, so of course it's fast. Removing that flag breaks it, so i obviously need to go do some reworking.

The problem has been fixed. It boiled down to my not having set up the corresponding sqlite3_file object via xOpen(). It's roughly 3-4x as fast as the "opfs" VFS via speedtest1.

Note that opening both of those at once will lead to locking errors in the latter because speedtest1 automatically enables that VFS (in both instances) if it's found. That's another change in the last checkin: that VFS now requires explicit opt-in in order to avoid some unfortunate locking cases. In the simplest cases it's just:

await sqlite3.installOpfsSAHPoolVfs();

with the caveat that "await" cannot be used just anywhere. What exactly that Promise resolves to is likely to change: it currently resolves to the sqlite3 namespace object but it would be more useful to resolve to some object which contains functions for working with the SAH pool. That's TODO. Also, maybe the ability to pass on some config options to the installation is in order.
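
Put together, a usage sketch (the filename is a placeholder, and it assumes the OO1 DB constructor's (filename, flags, vfsName) form):

// Opt in to the SAH-pool VFS from an async context, then open a db on it by
// VFS name. Filename is a placeholder.
(async function () {
  await sqlite3.installOpfsSAHPoolVfs();
  const db = new sqlite3.oo1.DB("/mydb.sqlite3", "c", "opfs-sahpool");
  try {
    db.exec("CREATE TABLE IF NOT EXISTS t(a,b)");
  } finally {
    db.close();
  }
})();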

(27) By Roy Hashimoto (rhashimoto) on 2023-07-15 21:01:31 in reply to 26 [link] [source]

It's roughly 3-4x as fast as the "opfs" VFS via speedtest1

If it wasn't faster then something would be wrong, but that's even better than I hoped.

(28) By Roy Hashimoto (rhashimoto) on 2023-07-16 16:20:00 in reply to 26 [link] [source]

Because the sahpool VFS doesn't support multiple connections, there's no reason not to use exclusive locking. That should improve performance some because SQLite wouldn't have to read the first page from storage on each new transaction to check if the cache is still valid.

Since you apply other VFS-specific PRAGMAs on opening a database to optimize performance, setting exclusive locking might be something else to consider. However, for benchmarking, remember that exclusive locking should be set for each other VFS for a fair comparison. This will have the biggest effect on small read transactions (write transactions have high overhead so the percentage improvement there will be lower).
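
e.g. something along these lines (illustrative only; the VFS name and filename are placeholders):

// Set exclusive locking right after opening, so SQLite can keep its lock
// across transactions instead of re-checking page 1 to validate its cache.
const db = new sqlite3.oo1.DB("/app.sqlite3", "c", "opfs-sahpool");
db.exec("PRAGMA locking_mode=EXCLUSIVE;");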

(29) By Stephan Beal (stephan) on 2023-07-16 17:04:29 in reply to 28 [link] [source]

Since you apply other VFS-specific PRAGMAs on opening a database to optimize performance, setting exclusive locking might be something else to consider.

That is a genuinely interesting idea. After looking into it, though, it might have undesired consequences...

Per the pragma docs:

When the locking_mode pragma specifies a particular database, for example PRAGMA main.locking_mode=EXCLUSIVE; then the locking mode applies only to the named database.

Which would suffice until someone ATTACHes a db in the same VFS, in which case the newly-attached db wouldn't get that locking mode (which isn't an error, it would just lose the performance benefit). i.e. it wouldn't be consistently applied to newly-attached databases. The automatic execution of SQL (the pragmas you mention above) is only applied to dbs opened via the DB class constructor, so ATTACHed dbs are unaffected, as are any opened via the C-style APIs.

The docs go on to say:

If no database name qualifier precedes the "locking_mode" keyword then the locking mode is applied to all databases, including any new databases added by subsequent ATTACH commands.

That would actually be great so long as all ATTACHes are on this VFS, but it's absolutely possible to connect to, say, a db via the other OPFS VFS using ATTACH. Though exclusively locking those isn't fundamentally a problem, implicitly doing it at the library level via a potentially non-intuitive "side door" feels iffy. It sounds like that decision needs to be made at the app level.

(17) By Roy Hashimoto (rhashimoto) on 2023-07-09 22:33:07 in reply to 12 [link] [source]

With the caveat that it's possible that i am sorely misinterpreting something, the part which i characterize as "unsightly" is that the files which live in OPFS do not have the names the client code gives them. If a user creates foo/bar.db1 using the equivalent of sqlite3_open() then it's important to me that a db file with that name be found in OPFS. Similarly, if a user uploads a valid db as abc.db to OPFS using either the OPFS API or a browser extension which lets them manipulate it, it's important to me that the VFS be able to read that file even though the sqlite API didn't put it there.

I believe filesystem transparency is achievable when the VFS is not in use. Proof left as an exercise for... well, you.