PRAGMA wal_checkpoint(RESTART) unexpectedly returning SQLITE_BUSY immediately

(1) By David Crawshaw (crawshaw) on 2021-09-26 19:33:00 [link] [source]

Hello,

I have a busy, very concurrent SQLite database: 36 connections open from the same process, about 1000 writes per second. As is apparently expected, the WAL file is growing without bound. To counter this, I am attempting to deliberately insert a periodic "stop-the-world" style event by calling:

PRAGMA wal_checkpoint(RESTART);

As I understand it, this will grab an exclusive lock, invoking the busy handler.

All of these connections are configured with a busy_timeout of 10 seconds. But the PRAGMA wal_checkpoint(RESTART) returns busy within a microsecond. Similarly, sqlite3_wal_checkpoint_v2 returns SQLITE_BUSY within a microsecond.

Am I misunderstanding the documentation of sqlite3_wal_checkpoint?

Thank you,

David

(2) By anonymous on 2021-09-27 07:49:29 in reply to 1 [link] [source]

Are you setting PRAGMA journal_mode=WAL; on the connection you're using to run your application-initiated checkpoints on? When I last tried a similar setup to yours, I had to set it explicitly again on the checkpoint connection for those calls to work, even though WAL mode is meant to be persistent.

(3) By Dan Kennedy (dan) on 2021-09-27 15:25:43 in reply to 1 [source]

If there is another checkpointer running, sqlite3_wal_checkpoint_v2() will return SQLITE_BUSY immediately. It only uses the busy handler for database readers and writers. The PRAGMA is similar.

Any connections configured to do auto-checkpoints that might be running a checkpoint?

(4) By David Crawshaw (crawshaw) on 2021-09-27 16:02:40 in reply to 3 [link] [source]

That's it. Setting PRAGMA wal_autocheckpoint=0; on every connection resolves the issue and lets PRAGMA wal_checkpoint(RESTART); succeed reliably. (I did see a single SQLITE_BUSY in my load test, but there is a lot of code involved and I am investigating if I accidentally created a connection inappropriately.)

Turns out this is well-documented, I somehow missed it.

All calls obtain an exclusive "checkpoint" lock on the database file. If any other process is running a checkpoint operation at the same time, the lock cannot be obtained and SQLITE_BUSY is returned. Even if there is a busy-handler configured, it will not be invoked in this case.

Thank you Dan!