Writer Concurrency - server-process-edition

(1) By Wolfgang Oertl (w.oertl) on 2020-09-06 19:39:54 [source]

Hi,

if I'm not mistaken, SQLite allows only one writer at a time, regardless of the journal mode. I'd like to use SQLite for an application server, so concurrent writing would be a significant performance factor. At least I think so, and this might be my fallacy.

There is/was a branch called server-process-edition, which tackled this issue by doing page-level locking (instead of file-level) and using multiple journal files, one per concurrent write. While this looks promising, not much has happened since then.

I've tried to compile this server-process-edition branch, but it doesn't out-of-the-box. After some tinkering it did compile, but I failed to run the benchmark so far.

Is there any chance for this to be revived? Has development taken a different route, so that concurrency would better be increased by different methods?

Thanks, W. Oertl

(2) By Dan Kennedy (dan) on 2020-09-07 10:42:18 in reply to 1 [link] [source]

Has development taken a different route, so that concurrency would better be increased by different methods?

I think that's fair to say. That branch increases write concurrency compared to the begin-concurrent branch (which is embedded in server database Bedrockdb), but not by as much as we had hoped.

https://sqlite.org/src/doc/begin-concurrent/doc/begin_concurrent.md

(3) By phpstatic on 2020-11-19 15:11:03 in reply to 2 [link] [source]

I think that's fair to say. That branch increases write concurrency compared to the begin-concurrent branch (which is embedded in server database Bedrockdb), but not by as much as we had hoped.

On this test: https://sqlite.org/src/artifact/0c6bc6f55191b690

With 2rw 1ro, server-mode write TPS increase 140% compare to begin-concurrent. 160% read/write TPS. I think this is huge improvement.

Is the recent begin-concurrent branch get extra 40% TPS compare to 2017-07-31 version ?

(4.1) By Dan Kennedy (dan) on 2020-11-19 21:26:53 edited from 4.0 in reply to 3 [link] [source]

Is the recent begin-concurrent branch get extra 40% TPS compare to 2017-07-31 version ?

I doubt it.

The server-process-edition branch uses latches to manage readers and writers, whereas begin-concurrent is based on snapshot isolation. This means that if you tweak the locking primitives in the VFS and combine it with something like the wal2 branch, begin-concurrent scales to hundreds of concurrent readers, whereas the locking required of readers by server-process-edition means that performance of both readers and writers starts to drop off after a few dozen reader threads are added.

I suppose most users wouldn't really care - hardware that can run that many threads is quite uncommon. It is the reason begin-concurrent is preferred though.

(5) By phpstatic on 2020-11-21 20:35:44 in reply to 4.1 [link] [source]

The snapshot isolation based based begin-concurrent solution sound good.

Is there an optimal or recommend total number of connections limit for writer ? (incase add more connection will decrease the throughput, assuming that there are no page conflicts).

Will multi writer get more throughput compare to all transaction commit by one memory database in single thread mode with same schema and data (assuming no page conflicts)?

(6) By Dan Kennedy (dan) on 2020-11-23 10:50:45 in reply to 5 [link] [source]

Is there an optimal or recommend total number of connections limit for writer ? (incase add more connection will decrease the throughput, assuming that there are no page conflicts).

begin-concurrent basically serializes the "COMMIT" commands. So the answer depends on the ratio of time spent in COMMIT to the time spent in all the other statements of the transaction. That ratio could be quite low if transactions contain a lot of SELECT statements, or the thread does some other processing, or if there are read-cache misses that require access to the persistent storage. In those cases the answer might be fairly high - perhaps 16 or more. But if none of those things are true and the threads are just writing as fast as they can with no read-cache misses, I think it will be closer to 4 or thereabouts. But these are really just guesses.

Will multi writer get more throughput compare to all transaction commit by one memory database in single thread mode with same schema and data (assuming no page conflicts)?

Don't really know, but probably.

(7) By anonymous on 2021-10-12 12:34:03 in reply to 6 [link] [source]

Hi Dan Kennedy,

Thanks for the great work.

I has few question for the begin-concurrent action:

If I use singe-process mode, will the sqlite3_commit_hook/sqlite3_rollback_hook execute inside a lock for all threads (for multi connection to same database)?

I want get a 64bit unique transaction id for a database(after app reboot it still not get duplicate), is there a sqlite buildin value can be use for this purpose? (I like to read this value from sqlite3_commit_hook)