bug report: invalid WAL frames written after rollback
(1) By Joe Lee (jleexx) on 2025-06-17 03:02:42 [source]
While testing our application, I found a case where SQLite appears to write invalid WAL frames after rolling back a savepoint. I managed to reduce it to this test case, based on walcksum-2.1
in test/walcksum.test
(originally reproduced against version 3.44.0
, but seems to also reproduce against 3.50.1
, and presumably others):
do_test walcksum-2.2 {
forcedelete test.db test.db-wal test.db-journal
sqlite3 db test.db
execsql {
PRAGMA synchronous = NORMAL;
PRAGMA page_size = 4096;
PRAGMA auto_vacuum = FULL;
PRAGMA journal_mode = WAL;
PRAGMA cache_size = 1;
CREATE TABLE t1 (i INTEGER PRIMARY KEY, s TEXT);
PRAGMA wal_checkpoint;
/* commits pages 1,2,3,4 to frames 1-4: */
INSERT INTO t1 (i, s) VALUES (0, randomblob(4096));
/* commits pages 1,2,3,5 to frames 5-8: */
INSERT INTO t1 (i, s) VALUES (1, randomblob(4096));
SAVEPOINT one;
/* spills pages 6,7,8,9,10 to frames 9-13: */
INSERT INTO t1 (i, s) VALUES (2, randomblob(4096));
INSERT INTO t1 (i, s) VALUES (3, randomblob(4096));
INSERT INTO t1 (i, s) VALUES (4, randomblob(4096));
INSERT INTO t1 (i, s) VALUES (5, randomblob(4096));
INSERT INTO t1 (i, s) VALUES (6, randomblob(4096));
INSERT INTO t1 (i, s) VALUES (7, randomblob(4096));
/* spills pages 11,2,12,2 to frames 14,15,16, rewriting frame 15: */
INSERT INTO t1 (i, s) VALUES (8, randomblob(4096));
/* rolls back to frame 8, writes commit frame 9 with zero checksum.
* restores current checksum to frame 14's, skips frame 9 rewrite: */
ROLLBACK TO one;
RELEASE one;
/* commits pages 1,2,3,6 to frames 10-13: */
INSERT INTO t1 (i, s) VALUES (9, randomblob(4096));
}
forcecopy test.db test2.db
forcecopy test.db-wal test2.db-wal
sqlite3 db2 test2.db
execsql {
PRAGMA integrity_check;
SELECT count(*) FROM t1;
} db2
} {ok 3}
catch { db close }
catch { db2 close }
The test fails because the restored database has 2 records instead of 3, because it is missing the record inserted after the rollback.
(2) By Richard Hipp (drh) on 2025-06-17 19:37:37 in reply to 1 [link] [source]
Thanks for the bug report. Dan has a fix now on trunk and on branch-3.50. It looks like this problem goes back to the wal-overwrite-frames optimization in 2016, first released in version 3.11.0.
(3) By Joe Lee (jleexx) on 2025-06-17 20:53:32 in reply to 2 [link] [source]
Thank you!
(4) By jose isaias cabrera (jicman) on 2025-06-19 11:56:54 in reply to 2 [link] [source]
...this problem goes back to the wal-overwrite-frames optimization in 2016, first released in version 3.11.0.
This is incredible. I would expect to find bugs in later versions, because of new enhancements, add-ins, etc., but to find bugs as old as 2016, it's very interesting. I would have thought all those bugs would have been fixed/addressed because of the strenuous and complete testing suite you run on each release, and pre-release. Thanks for letting us know the details, Dr. Hipp. I appreciate history.
(5) By Joe Lee (jleexx) on 2025-06-19 16:23:59 in reply to 4 [link] [source]
For what it's worth, I think part of the reason that I stumbled across the bug is because I had been explicitly testing with PRAGMA cache_size = 1
to reproduce other conditions in our code related to spilled or rewritten WAL frames.
So far, I'm only familiar enough with the sqlite codebase to know that it does significant fuzz testing, but not explicitly how it goes about it. But if there's some way to provide hints to the fuzzer, it might be useful to tell it that a small cache size is a useful direction to explore, to make it more likely to catch bugs like this one?
(6) By Richard Hipp (drh) on 2025-06-19 16:39:40 in reply to 5 [link] [source]
The dbsqlfuzz fuzzer uses both SQL and database inputs. The input database is typically already corrupt when fuzzing starts, so it would be inappropriate for the fuzzer to report a fault if the database ended up corrupt at the end. Even if it is known that the input database was not corrupt, dbsqlfuzz enables SQL language features that can lead to database corruption (such as the ability to write to the sqlite_schema table) so ending up with a corrupt database at the end of a fuzzer invocation is not an error.
The TH3 test harness implements a testing technique wherein a transaction is started and then a snapshot is taken of the filesystem mid-way through the transaction. The snapshot is designed to mimic the kinds of filesystem damage that might be seen during a sudden power loss. Then the filesystem is restored to that snapshot and TH3 verifies that the transaction either committed correctly or rolled back completely. The tests are randomized and repeated many times in the sense that the order of pages written to disk and hence surviving the power loss is random, and some pages might contain random errors. But the SQL that was running at the time of the snapshot is not randomized. There are just a collection of SQL scripts that get run. Unfortunately, none of those scripts where of the nature that would replicate the problem found by mister Lee.