SQLite Forum

SQLite with NFS + cachefilesd
Login
> Does chachefilesd improve the odds that all readers and writers of the database "file" see/create a consistent view of it? Or is it a speed-up fixer-upper?

I'm no expert with cachefilesd but my understanding is that it's a speed-up fixer, as well as alleviating network loads. Also, I wasn't careful in my sentence, I meant SQLite read queries performance is not improved, which is what surprises me. It makes sense that write queries are not improved (cachefilesd invalidates the cached file for writes, apparently).

> Perhaps cache-improver is taking care to be sure that the underlying actual disk file on a remote machine has not changed underneath the cached copies. There is already a lot of caching and syncing effort associated with network file-emulation layers.

I'm not sure I got this correctly, but I would still expect at least some improvement on read queries. When reading or copying files from the NFS cachefilesd does improve performance (a lot). I thought perhaps the difference with SQLite is the way the file is read/fetched/searched.

> I suppose that means you have read and understood SQLite Over a Network, Caveats and Considerations and See How To Corrupt Your Database Files. Do you have any suggestions as to how such dissuasion might be made more effective?

I have read those, and they're very informative. I also read other posts in multiple places discouraging this approach. However NFSv4 and v4.1 now apparently support file locking and that makes it very tempting to go for SQLite over NFS with these versions, at least for me. The only suggestion I'd have would be to add some information about SQLite over NFSv4+ in case you've got some feedback/testing results with that.

I'm trying to have a dead simple highly available system where multiple hosts use the shared SQLite DB over NFS. I realize SQLite wasn't designed for that usage, and this may never be feasible in any safe or performant way. But still I want to try it and see for myself. I also realize that using NFS caching on each host independently would likely pose some coherence issues. I was still curious about it.

The right way would be to use a client/server SQL like MySQL or pgSQL I guess, but that would add a lot of overhead to keep the system highly available, because then I need to have multiple SQL servers in sync. I want no single point failure. My NFS is actually AWS EFS, which has three redundant endpoints. Funnily enough, AWS mentions the ability to use SQLite over EFS since EFS is compatible with NFSv4+: https://aws.amazon.com/about-aws/whats-new/2017/03/amazon-elastic-file-system-amazon-efs-now-supports-nfsv4-lock-upgrading-and-downgrading/.

In any case, I appreciate the help here. I'll have a look at a share block storage as an alternative, see if that performs better.