Guarantees around detection of file deletion

(1) By officialnsa on 2024-05-08 23:07:18 [link] [source]

Hi all,

I've recently been working on a corruption detection methodology for our application that uses sqlite3. Here's our use case:

Our application is an audio plugin, meaning it is instantiated (possibly multiple times) by a host and we have no way to predict whether these instantiations will be in the same process or separate processes
Every instance that is instantiated will possess a single sqlite3 connection, as we use an ORM that mandates a single, eternal connection (otherwise, thread safety is not guaranteed).

Upon detecting corruption in the database, our proposed solution is as follows:

Instance experiences any failed DB operation immediately runs a PRAGMA quick_check
If this fails, the instances database connection is closed and reopened in case another instance has already detected the corruption and remedied it.
A second quick_check is run— if this also fails, no other instance has fixed the corruption, and this instance is now responsible for performing the deletion
The "responsible" instance closes its connection, deletes the database file, then re-opens its connection (thereby recreating the file) and performs the necessary CREATE TABLE | INDEX operations to match the original schema.
Any other instances other than the "responsible" instance will detect corruption on their next read/write attempt, then reopen their database connection per step 2., which will then pass the second quick_check and proceed with operation per usual.

However, I am very uneasy with this solution. It assumes that either an open connection will always fail its next read/write if the database has been deleted/re-instantiated, or an open connection will not make any corrupted writes to a recreated database file if the database file matches the schema used by the connection.

It would be great if one of these were true, because we'd otherwise have to implement a complicated inter-process synchronization system for ensuring all connections are closed before deleting the database file. However, I've not read anything that suggests they are. Furthermore, I'm suspicious of performing any operation on a database file directly while it has sqlite3 connection open on it, because that feels like it's a quick way to cause corruption.

To put my question concisely: what happens if you delete a database file and re-create an empty database with the same schema in its location while there's open database connections pointing at that file? Is any part of that behavior deterministic, or is it fully undefined?

(2) By Richard Hipp (drh) on 2024-05-08 23:45:13 in reply to 1 [link] [source]

The SQLITE_DBCONFIG_RESET_DATABASE opcode for sqlite3_db_config() was created for purpose of resetting a corrupt database to be empty even if there are multiple database connections. I think that is what you want here, isn't it?

(3) By Anton Dyachenko (ahtoh_) on 2024-05-09 02:25:13 in reply to 2 [link] [source]

The use case is very similar to my application with the difference that I have many connections in one process instead of many process. So I would like to add that ideally it should not just reset database to an empty state but also recreate expected schema in the same transaction that other connections can't see a database in an intermediate state (empty or half recreated schema). Pseudo code of ideal solution

BEGIN EXCLUSIVE;
if (ok == PRAGMA quick_check) {
    ROLLBACK;
    return;
}

sqlite3_db_config(db, SQLITE_DBCONFIG_RESET_DATABASE, 1, 0);
VACUUM db;
sqlite3_db_config(db, SQLITE_DBCONFIG_RESET_DATABASE, 0, 0);
CREATE TABLE t1;
...
COMMIT;

But I am not sure if a transaction can contain sqlite3_db_config calls.

(4) By anonymous on 2024-05-09 04:29:13 in reply to 3 [source]

But I am not sure if a transaction can contain sqlite3_db_config calls.

A transaction definitely can't contain vacuum.

(5) By officialnsa on 2024-05-09 17:16:08 in reply to 2 [link] [source]

Thank you! I've been scouring the docs for some time now but never came across this. Our ORM sadly does not support sqlite3_db_config() but this is such a vital feature that we'll likely request it as a feature or open a PR.

(6) By officialnsa on 2024-05-09 17:17:51 in reply to 4 [link] [source]

Are you certain about this? I've found a thread where most posters seem to think it's a perfectly valid use of vacuum. However, if that's not the case (or if I'm misunderstanding the thread I linked) I'd very much appreciate if you could point me towards relevant documentation!

(7) By Adrian Ho (lexfiend) on 2024-05-09 17:38:49 in reply to 6 [link] [source]

As Simon Slavin explains in the thread you linked:

It is normal for SQL engines to prevent VACUUM inside a transaction, and to lock the database from other threads/processes/users while it operates. For instance, Postgres will issue a consistent error message if you don't close your transaction before you do a VACUUM. Since it is not a 'real' SQL command (in that it operates on file structure, not tables, rows or columns) it shouldn't be a part of a transaction.

Don't forget what VACUUM does: it doesn't operated just on data in the database, it reorganises the entire database file, and can move every thing into a different place and delete entire pages. It can even change _rowid_s. no way it can operate while allowing other SQL commands to work, so it may as well issue an explicit lock.

And the VACUUM docs are quite explicit about this:

A VACUUM will fail if there is an open transaction on the database connection that is attempting to run the VACUUM.

(8) By officialnsa on 2024-05-09 18:32:10 in reply to 7 [link] [source]

I understand—that's my bad for not reading the thread more carefully. Given that vacuum cannot be run in a transaction, is there a best practice for using the SQLITE_DBCONFIG_RESET_DATABASE flag in a way that the database can be vacuumed and then have its schema re-inserted in an atomic manner? I.e., is there some way to gain an exclusive lock on the database outside of a transaction context, such that the vacuum can be performed and immediately followed by a set of CREATE TABLE | INDEX functions?

(9.1) By Anton Dyachenko (ahtoh_) on 2024-05-09 20:55:44 edited from 9.0 in reply to 7 [link] [source]

This

Since it is not a 'real' SQL command (in that it operates on file structure, not tables, rows or columns) it shouldn't be a part of a transaction.

contradicts to this

It can even change _rowid_s.

Even though ordinary vacuum should not change a schema (set of tables, triggers, indexes) but this special db reset operation actually changes schema. It is just a coincidence that it is done via vacuum command (not exactly coincidence but definitely beyond normal vacuum) and any other normal way to reset schema will likely fail due to db corruption.

So from sql point of view this should be a transaction/normal reset database operation because it changes schema and it should also be possible to join it with other normal operations in the same transaction. I wish we had sql extension command like

BEGIN EXCLUSIVE;
DROP DATABASE schema_name;
CREATE TABLE t1;
...
COMMIT;

So this DROP DATABASE command would work under the hood as SQLITE_DBCONFIG_RESET_DATABASE + VACUUM but inside a transaction. Its fine if this DROP command would require setting some security flags similar to allowing loading extensions and doesn't work in defensive mode.