PRAGMA wal_checkpoint(RESTART) unexpectedly returning SQLITE_BUSY immediately
I have a busy, very concurrent SQLite database: 36 connections open from the same process, about 1000 writes per second. As is apparently expected, the WAL file is growing without bound. To counter this, I am attempting to deliberately insert a periodic "stop-the-world" style event by calling:
As I understand it, this will grab an exclusive lock, invoking the busy handler.
All of these connections are configured with a busy_timeout of 10 seconds. But the
PRAGMA wal_checkpoint(RESTART) returns busy within a microsecond. Similarly,
SQLITE_BUSY within a microsecond.
Am I misunderstanding the documentation of sqlite3_wal_checkpoint?
Are you setting
PRAGMA journal_mode=WAL; on the connection you're using to run your application-initiated checkpoints on? When I last tried a similar setup to yours, I had to set it explicitly again on the checkpoint connection for those calls to work, even though WAL mode is meant to be persistent.
If there is another checkpointer running, sqlite3_wal_checkpoint_v2() will return SQLITE_BUSY immediately. It only uses the busy handler for database readers and writers. The PRAGMA is similar.
Any connections configured to do auto-checkpoints that might be running a checkpoint?
That's it. Setting
PRAGMA wal_autocheckpoint=0; on every connection resolves the issue and lets
PRAGMA wal_checkpoint(RESTART); succeed reliably. (I did see a single SQLITE_BUSY in my load test, but there is a lot of code involved and I am investigating if I accidentally created a connection inappropriately.)
Turns out this is well-documented, I somehow missed it.
All calls obtain an exclusive "checkpoint" lock on the database file. If any other process is running a checkpoint operation at the same time, the lock cannot be obtained and SQLITE_BUSY is returned. Even if there is a busy-handler configured, it will not be invoked in this case.
Thank you Dan!