Can SQLite be used with AWS Elastic File System (EFS)?
(1) By example-user on 2020-06-17 10:28:53 [link] [source]
The SQLite docs mention not to use NFS because some implementations have broken locking implementations.
But what if the locking implementation works? Can SQLite be used then?
In particular AWS have just released Elastic File System (EFS) which is a type of NFS:
In their examples they use Pythons
fcntl.flock to demonstrate getting and releasing a lock.
Fast answer: YES
Long answer: SQLite is advising against the use of Network file systems for multiple access to a single DB file, mostly because Network file systems typically do not handle file-locking correctly - that is to say, they lie about the file being locked when it isn't. (Ok, it's more involved than that, but not in the scope of this discussion).
This means that as soon as any of those givens change, it's very safe to use. So as soon as:
- 1. You do not have a Networking file system (but a local one), or
- 2. You do not have multiple processes accessing the file, or
- 3. Your Networked file system does not lie about locking.
Then it's perfectly safe to use SQLite on. But...
The other important consideration is that the bulk of the communication happens between the DB engine and DB files. A small percentage of typical data transfer happens to the end-client who asked for the data via a Query. This means that you typically would want the Engine and file to be on ONE device. For many use cases this is not a real factor, but for any DB with significantly big tables, this will matter.
So yes, it can and will work if your file system guarantees locking semantics, but might still not be the optimal solution.
Network file systems typically do not handle file-locking correctly
...and when they do handle it correctly, it's a milliseconds-scale round-trip to the file server on each lock and unlock, rather than the few nanoseconds it takes to update a table in RAM for local I/O. Since SQLite does a lock for each transaction, this can balloon your I/O times by the same 6 orders of magnitude.
...Thus why client/server DBMSes exist: because it puts all of the locking on the host that does the I/O.
There are client/server extensions to SQLite (e.g. Bedrock), which are far better choices than putting SQLite onto a network filesystem. In this particular case, it's also well tuned for cloud operations, taking key design restrictions like the CAP theorem explicitly into account.
That's nice, but even within the same AWS zone the round-trip time to do the lock will be orders of magnitude more expensive than purely on-host locking. Then if you go and use the cloud the way it was meant to be used, distributing your operations across data centers, expect to add another zero or two to the ping time.
I kinda doubt anyone here works for Amazon, and even if they did work for Amazon, it is pretty unlikely that Amazon has tested their new filesystem with SQLite3.
Why do not you test it and see if it works? If it does not work, you can then provide feedback to Amazon so they can fix their deficiencies in implementation.