books / papers on "one sqlite per user" design ?
In traditional webapp design, we:
- have one giant database
- with lots of tables
- stuff all user data into the same database, in respective tables
I am curious about designs where we do one sqlite db per user.
We create a new sqlite database for each user.
Transactions involving multiple users will be expensive but hopefully rare.
Transactions involving only a single user should not block anyone else.
I am having difficulty finding books / articles / correct-search-term for this type of design.
Can anyone point me at books / articles / correct-search-term ?
The main focuses are (1) one sqlite db per user and (2) transactions involving multiple users are rare.
You might want to take a look at ActorDB for a working example. It builds on SQLite and splits data to many small database files called actors.
Well, actually it is much more complicated than that, see how it works. You can have one database (actor) per user, for example. It is mostly implemented in Erlang and hasn't seen much activity lately, and the latest binary release doesn't run on my system. YMMV.
One thing to be aware of, is that many small DBs waste more space than larger DBs
because the space overhead of non-full pages is better amortized;
especially if you have a large number of tables with fewer and small rows inside.
For example, I have a schema where most of the bulk ends up in only 3 tables,
and many other small lookup tables use just a single page each, with those
single-page B-Tree wasting quite of bit of space.
If I have hundreds of multi-MBs of those DBs, then I can live with that overhead.
But if I had millions of only a few KB DBs, then the overhead would become unacceptable.
For some tables, you want large pages, for performance reasons.
For others, you want small pages. And SQLite does not support that, unfortunately.
You'd be forced to use two different DBs, but then you loose referential integrity which does not cross DB boundaries.
Note also that one of the reason SQLite can be faster than the filesystem is because there are fewer
If you have millions of DBs, you are back into many fopen calls territory.
So some sharding is good (if your scale truly requires it, which is rarely the case...). But excessive sharding is not.
So do yourself a favor, and start with a single DB. And when (if really...) you actually outgrow that, look into alternative to a single DB.
Research for example how Expensify scales SQLite. There are other SQLite-derived projects for scaling too.
This discussion makes me want to spend a weekend writing a VFS that creates an SQLite database in the BLOB of another SQLite database. :-)
I saw somewhere a SQLite database which emulated a folder hierarchy and file storage. There was one table which held the folder structure and another table which held files, with their contents as BLOBs. It was used as part of a backup system and was faster than reproducing the folder structure and files using the file system.
I was toying with the idea of suggesting the author add a custom function to compress the BLOBs but never did it.