Most efficient way to tell if any database content has changed
(1.2) By Simon Willison (simonw) on 2020-10-12 18:55:36 edited from 1.1 [source]
I'm interested in answering the question "has the contents of this database changed in any way" as efficiently as possible.
I want to do this for caching purposes - I'd like to be able to cache various things relating to an infrequently changed database but invalidate that cache if even a single row has been inserted or updated.
I've successfully used
PRAGMA schema_version to invalidate a cache when the database schema changes, but now I'm looking to do it based on the file contents instead.
Options I'm considering:
- Calculating the MD5 hash of the file contents on disk, which runs pretty fast (but not fast enough to do it often)
- Adding triggers to every single database table which increment a counter in a dedicated
content_versiontable any time those tables have content inserted/updated/deleted
I'm pretty sure that second option would work correctly (I'm not at all worried about the additional overhead on writes), but I'm wondering if there's a solution I haven't considered that would be even better.
(2.1) By Keith Medcalf (kmedcalf) on 2020-10-12 19:13:00 edited from 2.0 in reply to 1.2 [link] [source]
The simplest way is probably to hold-open a second connection to the database and periodically execute
PRAGMA data_version. If the value changes then a different connection has updated the data in the database. Since this connection is used for nothing other than polling the data_version, you only need a very small cache_size for it.
Note that this is basically how a connection determines internally that the database file has been modified by another connection and must flush its page cache.
(3) By Simon Willison (simonw) on 2020-10-12 22:17:38 in reply to 2.1 [link] [source]
Not sure how I missed that, that looks perfect! Thanks very much.
(4) By oneeyeman on 2020-10-12 22:21:14 in reply to 3 [link] [source]
And you can just run it in a secondary thread which shouldn't give any overhead on the main application...
(5) By Kevin Youren (KevinYouren) on 2020-10-12 22:28:14 in reply to 1.2 [link] [source]
Would a simple directory listing suffice?
It is independent of file size.
I mainly use Unix terminals, so:
ls -lai cfd_20201009_0146.sqlite
gives: 6553669 -rw-r--r-- 1 root root 200912896 Oct 9 12:53 cfd_20201009_0146.sqlite
Windows has similar and Android too. (with a bit of effort)
(6) By little-brother on 2020-10-12 22:47:08 in reply to 1.2 [link] [source]