Most efficient way to tell if any database content has changed
(1.2) By Simon Willison (simonw) on 2020-10-12 18:55:36 edited from 1.1 [link]
I'm interested in answering the question "has the contents of this database changed in any way" as efficiently as possible. I want to do this for caching purposes - I'd like to be able to cache various things relating to an infrequently changed database but invalidate that cache if even a single row has been inserted or updated. I've successfully used `PRAGMA schema_version` to invalidate a cache when the database schema changes, but now I'm looking to do it based on the file contents instead. Options I'm considering: - Calculating the MD5 hash of the file contents on disk, which runs pretty fast (but not fast enough to do it often) - Adding triggers to every single database table which increment a counter in a dedicated `content_version` table any time those tables have content inserted/updated/deleted I'm pretty sure that second option would work correctly (I'm not at all worried about the additional overhead on writes), but I'm wondering if there's a solution I haven't considered that would be even better.
(2.1) By Keith Medcalf (kmedcalf) on 2020-10-12 19:13:00 edited from 2.0 in reply to 1.2 [link]
The simplest way is probably to hold-open a second connection to the database and periodically execute `PRAGMA data_version`. If the value changes then *a different connection* has updated the data in the database. Since this connection is used for nothing other than polling the data_version, you only need a very small cache_size for it. <https://sqlite.org/pragma.html#pragma_data_version> Note that this is basically how a connection determines internally that the database file has been modified by another connection and must flush its page cache.
(3) By Simon Willison (simonw) on 2020-10-12 22:17:38 in reply to 2.1 [link]
Not sure how I missed that, that looks perfect! Thanks very much.
(4) By oneeyeman on 2020-10-12 22:21:14 in reply to 3 [link]
Hi, And you can just run it in a secondary thread which shouldn't give any overhead on the main application... Thank you.
(5) By Kevin Youren (KevinYouren) on 2020-10-12 22:28:14 in reply to 1.2
Would a simple directory listing suffice? It is independent of file size. I mainly use Unix terminals, so: ls -lai cfd_20201009_0146.sqlite gives: 6553669 -rw-r--r-- 1 root root 200912896 Oct 9 12:53 cfd_20201009_0146.sqlite Windows has similar and Android too. (with a bit of effort)
(6) By little-brother on 2020-10-12 22:47:08 in reply to 1.2 [link]