SQLite User Forum

"Avoid SQLite In Your Next Firefox Feature" ten years later

(1) By anonymous on 2024-10-30 20:59:17 [link] [source]

In 2014, a Mozilla developer wrote an article called "Avoid SQLite In Your Next Firefox Feature" on the internal Mozilla Wiki. (I found it via a 2015 Jim Nelson article on VACUUM policies for a GNOME app.)

The Mozilla Wiki article has some interesting and forceful criticism of using "SQLite as a default choice for storing any non-trivial amount of data" based on experiences with SQLite in Firefox. I found the advice interesting for how it contrasted with advice from Dr. Hipp over the years:


  • Mozilla Wiki, 2014: "[If you need to store a small amount of data (less than 1MB), you should use JSON files, and do all your I/O off the main thread. Simple solutions with linear worst-case performance are ideal…If your workload involves a lot of blobs, don't use SQLite. Store your data in external files."
  • Dr. Hipp, 2006: "SQLite is very good for…configuration files…You still have a lot of little configuration files in the Unix world and Sqlite is a good place to store that sort of information.."
  • Dr. Hipp, 2014: "I advocate using SQLite for configuration files...Where you'd want to consider using SQLite for an application for a configuration file is where you're mixing binary data with the configuration file…"

  • Mozilla Wiki, 2014: "SQLite uses fsync's to guarantee transaction durability and enable recovery from crashes. fsyncs can be very expensive, and happen relatively frequently in the default rollback-journal mode. This performance/reliability trade-off might not be necessary for simple storage needs, especially if the important data can be rebuilt at any time. JSON files or log files will show better I/O patterns almost every time, especially if they're compressed and read/written in entirety each time."
  • Dr. Hipp, 2017: "So let your take-away be this: read/write latency for SQLite is competitive with read/write latency of individual files on disk. Often SQLite is faster. Sometimes SQLite is almost as fast."

  • Mozilla Wiki, 2014: "[Main-thread SQL is a known evil, and luckily there are only a few major sources of it left in the codebase, but the Slow SQL dashboard shows that developers are still getting bitten by main thread I/O from unexpected sources".
  • Mozilla Wiki, 2014: "It should go without saying that you should never execute SQL on the main thread".
  • SQLite FAQ, 2007: "Threads are evil. Avoid them."

Now, one decade later, Firefox (and maybe millions of other client-side codebases) still uses SQLite heavily where it could have used JSON or plain files to store user configuration, text, and blobs.

General advice like implementing an expiration policy for all data will stay true forever (though I don't see how ballooning file sizes without data expiration aren't a problem for JSON files either…). But I know that many of the statements in the Mozilla Wiki article are out of date, e.g., "Factory-default SQLite page size is [no longer] 1024 bytes."

I'm wondering how much of this article was good general advice in 2014, and how much of it no longer applies in 2024. Had this article always been, as Jim Nelson said in 2015, "overkill", or did it have good points back then about avoiding SQLite when possible (and never using SQLite on the main application thread)?

Most importantly, which parts of its advice might be still valid or useful today?

(2.1) By Anton Dyachenko (ahtoh_) on 2024-10-31 22:02:39 edited from 2.0 in reply to 1 [link] [source]

My current task is to move tracks' album art out of the db and onto disk as ordinary files; the art has lived in the db for the last ~10 years. There are many reasons for this:

  • keeping data of different importance/purpose in the same db file is a mistake
    • we don't care if art gets corrupted or goes missing; we can just restore it from the tracks
    • art is just a cache of thumbnails that saves some I/O and CPU for resizing to a specific dimension in the UI
    • art is not searchable; you don't need SQL for binary blobs. In other words, the only operations you can do on blobs from SQL are read and write
    • however, the other track metadata is searchable and consumes only 10% of the size of the art (actually 1%; the other 9% is another blob, waveforms)
  • art lives in a separate table (shared among several tracks and referenced by foreign key), so it does not directly affect the performance of other tables, but
    • vacuum/backup of the database takes a lot of time, even though art is only ever added to the db, never changed, and only sometimes removed
    • when we need to change the track table schema significantly, the FK to the arts table means we have to copy its content to temp tables, which makes such migrations very slow
    • not related to SQLite, but the last straw that got this epic task implemented was a bug in the macOS FAT driver in Sonoma, which caused our db to be corrupted when it was partially copied to an external drive with that filesystem. The real cause is unknown to me, but we figured out that there seems to be a more or less safe amount of data we can write to disk without file corruption. So removing 90% of the db size seems to avoid the corruption even for the most demanding users, with db sizes (originally, with art) in the gigabytes

there are of course advantages to having art in the db, like:

  • it is simpler to fetch art from SQL than through separate file I/O code, so I've added a user-defined function that does exactly that (see the sketch after this list). You can have an SQL interface to data on disk; the ideal solution is to make a vtab, but that is too much for us. Just a function works fine
  • we need to calculate the size of the data we are going to copy to an external drive, so having art in the db helps even here, rather than doing separate file I/O to get the sizes; but as before, I just added another function for this. It is not as fast as just reading the size of a blob, but it works well enough
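
Roughly, the user-defined-function approach looks like this. This is a minimal Python sketch with hypothetical names (read_art, art_size, an art_path column), not our actual code:

    import os
    import sqlite3

    def read_file_blob(path):
        """Return a file's bytes, or None if it is missing or unreadable."""
        try:
            with open(path, "rb") as f:
                return f.read()
        except OSError:
            return None

    def file_size(path):
        """Return a file's size in bytes, or None if it cannot be stat'ed."""
        try:
            return os.path.getsize(path)
        except OSError:
            return None

    con = sqlite3.connect("library.db")
    # Register scalar UDFs so queries can treat on-disk art as if it were a column.
    con.create_function("read_art", 1, read_file_blob)
    con.create_function("art_size", 1, file_size)

    # Hypothetical schema: tracks(title TEXT, art_path TEXT pointing at a file on disk).
    for title, art in con.execute("SELECT title, read_art(art_path) FROM tracks"):
        pass  # hand the blob to the UI layer

    total_bytes = con.execute("SELECT sum(art_size(art_path)) FROM tracks").fetchone()[0]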

So my personal advice would be: do not mix blobs with real data in the same db file unless you have no choice. It is fine to keep them in a separate db file attached to the main db, if that is more reasonable than ordinary files, but first consider user functions and/or a vtab. So basically, IMHO, keeping blobs in the SQLite db is the last choice on the list of alternatives.

PS: Put blobs into the db only when they are important enough to justify having a single file and trading off the performance of general db maintenance, migrations, and backups. If you do put blobs into your db, keep them in a separate table; otherwise you ruin the performance of queries that need a full scan of the table but do not use those blobs.
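
To illustrate the attached-db / separate-table option, here is a minimal sketch (file, table, and column names are made up):

    import sqlite3

    con = sqlite3.connect("library.db")

    # Keep bulky blobs in their own database file, attached to the main one, so
    # vacuum/backup of the main db no longer has to rewrite gigabytes of art.
    con.execute("ATTACH DATABASE 'artwork.db' AS art")

    con.executescript("""
        -- Searchable metadata stays in the main db ...
        CREATE TABLE IF NOT EXISTS tracks (
            id INTEGER PRIMARY KEY,
            title TEXT,
            art_id INTEGER   -- plain reference; SQLite does not enforce FKs across databases
        );
        -- ... while the blobs live in their own table in the attached file.
        CREATE TABLE IF NOT EXISTS art.covers (
            id INTEGER PRIMARY KEY,
            image BLOB
        );
    """)

    # A join pulls a blob only when a query actually needs it; full scans of
    # tracks never touch the blob pages.
    row = con.execute("""
        SELECT t.title, c.image
        FROM tracks AS t JOIN art.covers AS c ON c.id = t.art_id
        WHERE t.id = ?
    """, (42,)).fetchone()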

(3) By Rowan Worth (sqweek) on 2024-11-01 05:10:14 in reply to 1 [source]

For me the comment that jumps out as the most relevant today is around fsync. This remains a relatively expensive operation and depending on the OS/filesystem/background IO can have unpredictable delays. If applications don't understand this and manage transaction lifetimes/frequency accordingly, it is likely to lead to problems.
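
As a minimal illustration of managing transaction frequency (a Python sketch with a made-up table, not from any application mentioned here): wrapping a batch of writes in one explicit transaction pays the COMMIT/fsync cost once instead of once per statement.

    import sqlite3

    con = sqlite3.connect("app.db", isolation_level=None)  # autocommit; we issue BEGIN/COMMIT ourselves
    con.execute("CREATE TABLE IF NOT EXISTS events (ts REAL, payload TEXT)")

    rows = [(float(i), f"event-{i}") for i in range(10_000)]

    # Costly pattern: every INSERT is its own transaction, so each one pays
    # the COMMIT/fsync price.
    #   for row in rows:
    #       con.execute("INSERT INTO events VALUES (?, ?)", row)

    # Cheaper pattern: one explicit transaction around the batch means a
    # single COMMIT and therefore far fewer fsync calls.
    con.execute("BEGIN")
    con.executemany("INSERT INTO events VALUES (?, ?)", rows)
    con.execute("COMMIT")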

eg. I've seen scenarios where a GUI application making heavy use of SQLite became completely unusable when the user was transferring a large amount of data to a USB disk, despite the fact that the application wasn't reading/writing to the USB itself. The impact on the application was tracked down to long fsync delays -- not because it was doing sqlite operations on the main thread, but because the GUI at some point inevitably relies on something from the database which isn't already cached in memory and this read ends up also having to wait for the fsync, because the database cannot safely be read during this stage of transaction COMMIT.

We never pinned down exactly why fsync got so slow; the leading theory was that the bulk transfer produced a large amount of dirty buffers in the linux write-back cache¹, and the kernel's I/O system had to clear these before it could service the fsync request.

¹ at the time the default write-back pool size was a percentage of memory, and this machine had a lot of RAM for its time

To be clear, it's hard to blame this on sqlite -- there's no sensible reason that I/O on one device should affect fsync on a completely separate device. But the fact remains: I/O patterns which don't rely on fsync have fewer moving parts and more predictable performance. "Write new version to a tempfile -> atomically rename" remains a nice simple approach, with the caveat that it of course does not provide durability in the event of power loss.
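
A sketch of that pattern (Python, with a made-up settings file; the commented-out fsync is exactly the durability trade-off mentioned above):

    import json
    import os
    import tempfile

    def save_state_atomically(path, state):
        """Write the whole new version to a temp file, then atomically rename it
        over the old one, so readers only ever see a complete file."""
        directory = os.path.dirname(path) or "."
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        try:
            with os.fdopen(fd, "w") as f:
                json.dump(state, f)
                f.flush()
                # os.fsync(f.fileno())  # buys durability, at the cost this thread is about
            os.replace(tmp_path, path)  # atomic rename on POSIX within one filesystem
        except BaseException:
            os.unlink(tmp_path)
            raise

    save_state_atomically("settings.json", {"theme": "dark"})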

(4) By Mike Castle (nexushoratio) on 2024-11-02 15:52:10 in reply to 3 [link] [source]

Because this sounds like an interesting problem to me, I'm wondering:

  • Were you able to reproduce this fsync interaction in a standalone toy app?
  • If so, were you able to test it on other operating systems to see if it was something Linux specific?

Just curious. These kinds of things fascinate me.

(5) By Anton Dyachenko (ahtoh_) on 2024-11-03 21:12:27 in reply to 4 [link] [source]

I can confirm similar, though not identical, behavior on macOS. I spent a couple of days investigating why db migrations in our auto tests sometimes took 100x longer than in most runs, causing AT failures by timeout. I don't know the root cause, but I found a good enough fix for the issue: now every AT step calls an explicit global fsync on macOS. The preparation and cleanup steps do lots of I/O, copying or removing many files. One possible explanation is that these prep steps fill the shared FS/disk cache; since fsync is also an I/O barrier, mutating operations on the db must then wait until the whole queue of previous operations on the FS/disk has completed. The fix basically moves those fsync operations out of the time-measured blocks, so they still consume the same amount of time as before, but the AT won't fail because of a timeout. And yes, the problem occurred only with external USB disks.
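
The workaround looks roughly like this (a simplified Python sketch of the harness idea, not our actual AT code):

    import os
    import sys
    import time

    def run_timed_step(step):
        """Flush pending writes *before* timing a step, so earlier bulk file
        I/O from preparation/cleanup is not billed to the step's db migration."""
        if sys.platform == "darwin":
            os.sync()  # ask the OS to flush dirty buffers before we start the clock
        start = time.monotonic()
        step()
        return time.monotonic() - start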

(6) By Rowan Worth (sqweek) on 2024-11-04 03:53:07 in reply to 4 [link] [source]

Unfortunately not -- this behaviour was reported by a client so we didn't have direct access to the machine. A work-around was put forward of "plug the USB into a different machine that isn't used interactively and export the data there" (aka "don't do that") and this was sufficient to get the client moving. For pragmatic reasons (ie. a million other issues to focus on) we never got back to this to pin down the details or isolate the behaviour.

(7) By ralf (ralfbertling) on 2024-11-04 08:13:03 in reply to 1 [link] [source]

Hi, my 2 cents:

  • SQLite has big theoretical advantages because it can read and write incrementally. JSON/XML/text configs ALWAYS have to be processed as a whole when reading and are usually also completely overwritten.
  • If the "whole file" behaviour is acceptable, you can always load the SQLite db into memory and "backup" to disk (see the sketch after this list). The manipulation logic will still be faster with SQLite.
  • There are advantages and disadvantages to human-readable formats: humans tend to break more things when guessing at the logical structure, and structuring data is easier in an extensible format like SQLite, where an additional table is very unlikely to break existing workflows. JSON has (for reasons beyond me) no concept of comments.
  • If SQLite is "overkill", one should note that many standard XML parsers are bigger than SQLite.
  • It's possible to design databases badly, and people do. use-the-index-luke
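
For the second point, a minimal sketch of the load-into-memory / backup-to-disk pattern using the backup API (the settings table is just an example):

    import sqlite3

    # Load the on-disk database fully into memory, work on it there, and
    # periodically "save" the whole thing back: roughly the read-whole /
    # write-whole behaviour of a JSON config, but with SQL available.
    disk = sqlite3.connect("config.db")
    mem = sqlite3.connect(":memory:")
    disk.backup(mem)   # disk -> memory
    disk.close()

    mem.execute("UPDATE settings SET value = ? WHERE key = ?", ("dark", "theme"))
    mem.commit()

    disk = sqlite3.connect("config.db")
    mem.backup(disk)   # memory -> disk, one whole-database write
    disk.close()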

I guess these were 5 cents. Sorry ;-)

ralf

(8) By ralf (ralfbertling) on 2024-11-04 08:27:34 in reply to 7 [link] [source]

Hi,

to balance my own point: the BLOB storage format doesn't allow efficient random access, as it's basically a linked list of pages. An (upwards-compatible) augmentation to the file format that allowed row-overflow values to be trees would increase I/O performance for using SQLite as a file system. Frequent incremental write access is a different kind of task, depending on the kind of changes one wants to perform. The suggested tree format would still be helpful for linking to unchanged pages from multiple versions of a row.

Regards, ralf

(12) By Donal Fellows (dkfellows) on 2024-11-07 15:07:25 in reply to 8 [link] [source]

The one case where I've found SQLite to be very poor with BLOBs is doing frequent appends, which just had terrible performance. On the other hand, representing the "file" as a sequence of BLOBs that could be concatenated outside of SQLite worked very well. In my case, that replaced a complex mess of stuff to handle writing several 100k files in parallel (which had bad performance on all operating systems because it had to cope with hitting OS global limits on the number of open FDs) and the result was faster[*], more robust and more stable.


[*] Faster because that particular app could support disabling fsync; partial databases weren't interesting to the particular application code analysing the data. YMMV.
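
A minimal sketch of that chunked layout (Python; the table name and the synchronous=OFF setting from the footnote are illustrative, not the actual code):

    import sqlite3

    con = sqlite3.connect("chunks.db")
    # The footnote's trade-off: this particular app can tolerate losing a
    # partial database, so it turns fsync off entirely for speed.
    con.execute("PRAGMA synchronous = OFF")

    con.execute("""
        CREATE TABLE IF NOT EXISTS chunks (
            file_id INTEGER,
            seq     INTEGER,   -- position of this chunk within the logical "file"
            data    BLOB,
            PRIMARY KEY (file_id, seq)
        )
    """)

    def append_chunk(file_id, data):
        """Appending is a cheap INSERT of a new row, not a rewrite of one huge blob."""
        con.execute(
            "INSERT INTO chunks VALUES (?, "
            "(SELECT IFNULL(MAX(seq), -1) + 1 FROM chunks WHERE file_id = ?), ?)",
            (file_id, file_id, data),
        )

    def read_file(file_id):
        """Reassemble the logical file by concatenating its chunks outside of SQLite."""
        rows = con.execute(
            "SELECT data FROM chunks WHERE file_id = ? ORDER BY seq", (file_id,)
        )
        return b"".join(chunk for (chunk,) in rows)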

(9) By anonymous on 2024-11-04 19:23:42 in reply to 7 [link] [source]

Note that the "overkill" comment in the original post refers to the anti-SQLite wiki article, not to SQLite itself. The word came from this blog post by Jim Nelson:

One page on the Firefox wiki (“Avoid SQLite In Your Next Firefox Feature”—overkill, but it makes some good points)...

I agree about SQLite's incremental reading and writing being a huge advantage. It's still puzzling to me that these Mozilla developers had so much trouble with SQLite that they recommended abandoning its incremental reading and writing in favor of processing whole JSON files:

JSON files or log files will show better I/O patterns almost every time, especially if they're compressed and read/written in entirety each time

This just seems weird to me.

(10) By ralf (ralfbertling) on 2024-11-05 13:47:15 in reply to 9 [link] [source]

tl;dr: the wiki page should have been named

"if you don't know what you are doing, things may go wrong."

Many things in it are irrelevant (like fragmentation) or just wrong (like the claim that a periodic rebuild is necessary).

The metrics of the wiki authors are way off. Some points might be interesting, but they are buried among a lot of "information" that is misleading (probably by accident).

For many practical applications it's probably helpful to have a mixture of

  • a backup (which may be a compressed SQLite dump file),
  • some kind of journal containing incremental changes,
  • quite relaxed handling of fsyncs, and
  • aggressive local caching.
  • Don't be shy about using a reasonable amount of memory storage/temp DBs to speed things up as needed (a rough sketch of these settings follows this list).
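
As a rough sketch of what the relaxed-fsync/caching/temp-storage part might look like in practice (illustrative pragma values, not a recommendation for every workload):

    import sqlite3

    con = sqlite3.connect("app.db")

    # WAL keeps a journal of incremental changes and needs fewer fsyncs per commit.
    con.execute("PRAGMA journal_mode = WAL")
    # Relaxed fsync handling: safe with WAL, much cheaper than synchronous=FULL.
    con.execute("PRAGMA synchronous = NORMAL")
    # Aggressive local caching: a negative value is a size in KiB (here about 64 MiB).
    con.execute("PRAGMA cache_size = -65536")
    # Keep temporary tables and indices in memory rather than on disk.
    con.execute("PRAGMA temp_store = MEMORY")

    # A periodic snapshot via the backup API can serve as the backup; compress
    # the resulting file with whatever tool you like.
    snapshot = sqlite3.connect("backup-snapshot.db")
    con.backup(snapshot)
    snapshot.close()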

ralf

(11) By Rowan Worth (sqweek) on 2024-11-06 04:53:04 in reply to 9 [link] [source]

JSON files or log files will show better I/O patterns almost every time, especially if they're compressed and read/written in entirety each time

This just seems weird to me.

I interpreted this as "if your serialisation pattern involves reading/writing all relevant state every time (as opposed to mutating individual fields which have changed), you are unlikely to see significant I/O benefits by using sqlite over regular file I/O"

Overall I agree with ralf's perspective -- if you don't understand and account for the nuances of the API you are using, things may go wrong. Sqlite does introduce additional nuances over regular file I/O, and if you don't account for them you are going to have a bad time (just look at the countless examples of people new to sqlite who struggle with SQLITE_BUSY scenarios).

Similarly, if you are using regular file I/O and you don't account for the possibility of process crashes/power loss while saving your state, at some point you are going to have a bad time. Everything is a tradeoff :)
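
To make the SQLITE_BUSY nuance concrete, here is a small sketch (with a hypothetical counters table): without a busy timeout a contended write fails immediately, and even with one the application still has to decide what to do when the timeout expires.

    import sqlite3

    # timeout=0 disables the Python driver's own retry, so a lock held by
    # another connection surfaces immediately as "database is locked".
    con = sqlite3.connect("shared.db", timeout=0)

    # Accounting for the nuance: ask SQLite to retry for up to 5 seconds
    # before raising SQLITE_BUSY to the application.
    con.execute("PRAGMA busy_timeout = 5000")
    try:
        with con:
            con.execute("UPDATE counters SET n = n + 1 WHERE id = 1")
    except sqlite3.OperationalError as exc:
        # Still possible under sustained contention; the application must
        # decide whether to retry, back off, or report the failure.
        print("write failed:", exc)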