Bug report: PRAGMA query_only has side effect when compiled, not when executed

(1) By Gwendal Roué (groue) on 2025-02-05 15:10:14 [source]

Hello,

This is a report for a bug that I could witness in SQLite 3.43.2. If it has been fixed since, please accept my apologies.

PRAGMA query_only=0 puts the database in read/write mode before it is executed with sqlite3_step. Compiling it with sqlite3_prepare_v3 is sufficient (to be very precise, maybe sqlite3_reset is necessary as well).

I could witness this unexpected behavior while writing a regression test for GRDB.swift. In the tested scenario, sqlite3_step is not executed, but the invocation of sqlite3_prepare_v3 and sqlite3_reset is enough to allow writes again.

Thanks for reading!

(2.2) By Gwendal Roué (groue) on 2025-02-05 15:24:50 edited from 2.1 in reply to 1 [link] [source]

Here is a way to reproduce the problem. The program below fails on the last assert:

#include <assert.h>
#include <sqlite3.h>
#include <stdio.h>

int main() {
  sqlite3 *conn;
  sqlite3_stmt *stmt;
  int res;
  
  // Open connection
  sqlite3_open("", &conn);
  
  // Enter read-only
  sqlite3_exec(conn, "PRAGMA query_only=ON", NULL, NULL, NULL);
  // Assert that we can't create a table.
  res = sqlite3_exec(conn, "CREATE TABLE test(a)", NULL, NULL, NULL);
  assert(res == SQLITE_READONLY);
  
  // Compile and reset, but don't execute, the statement that restores write access.
  sqlite3_prepare_v3(conn, "PRAGMA query_only=OFF", -1, 0, &stmt, NULL);
  sqlite3_reset(stmt);
  // Assert that we still can't create a table.
  res = sqlite3_exec(conn, "CREATE TABLE test(a)", NULL, NULL, NULL);
  assert(res == SQLITE_READONLY); // FAIL
}

Execution (on my Mac):

$ cc test.c -lsqlite3 && ./a.out
Assertion failed: (res == SQLITE_READONLY), function main, file test.c, line 18.
zsh: abort      ./a.out

(3) By Richard Hipp (drh) on 2025-02-05 15:29:39 in reply to 1 [link] [source]

Not a bug. This is documented behavior. It has been that way for over twenty years. If I change it now, no telling how many thousands of applications would break.

See https://sqlite.org/pragma.html, the fourth and fifth bullets at the top of the page.

Some pragmas take effect during the SQL compilation stage, not the execution stage.

(4) By Gwendal Roué (groue) on 2025-02-05 15:36:32 in reply to 3 [link] [source]

Oh, thank you, Richard. Indeed I missed those paragraphs. OK. I may have to review my use of certain PRAGMAs :-) Have a nice day!

(5) By Roger Binns (rogerbinns) on 2025-02-05 15:58:29 in reply to 3 [link] [source]

Some pragmas take effect during the SQL compilation stage, not the execution stage.

This is yet another reason why SQLite implementing a statement cache would be a good thing. Pretty much every wrapper out there implements their own, even the TCL wrapper from the SQLite team. It is easy make mistakes around this.

I'm very strongly in favour of a flags to sqlite3_prepare_v3 like SQLITE_CAN_CACHE whose side effect is that behind the scenes a cache managed by the connection could be used for that statement. The SQLite code will know what to do about pragmas.

(6.2) By Gwendal Roué (groue) on 2025-02-05 16:18:00 edited from 6.1 in reply to 5 [link] [source]

In GRDB.swift, there are two (custom) statements caches: one exposed to the library user, and one for private library use.

If there were a single cache, we could have nasty scenarios where two "users" of the same database performs mutations on the same statement, in unexpected ways.

For example:

User 1 loads a statement from the cache, and sets some arguments.
User 1 does something that triggers User 2 to run some code:
User 2 loads the same statement from the cache, and sets some arguments.
User 2 yields control back to user 1
User 1 executes the statement (unaware of the actions performed by user 2)
The statement was executed with unexpected arguments, and User 1 is unhappy.

This can't happen with the two caches of GRDB.swift:

Library user loads a statement from the user cache, and sets some arguments.
Library user does something that triggers the library to run some code:
Library loads a distinct statement from the library cache, and sets some arguments.
Library yields control back to the library user
Library user executes his statement.
The statement was executed with the expected arguments, and the library user is happy.

If SQLite would one day support statement caching, then I think it should be possible to create multiple independent caches, instead of one single cache per connection. With a single cache, we'd have to keep on implementing custom caches on the side, in order to avoid the scenario described just above.

(7.2) By Gwendal Roué (groue) on 2025-02-05 16:24:40 edited from 7.1 in reply to 6.2 [link] [source]

I've skimmed through the thread about statement caching, but I'm not sure the problem of the various "users" of the database was discovered.

Each library is a "user":

SQLite itself never executes statements on its own.
An SQLite wrapper that executes statements on its own is a first "user"
A wrapper of the wrapper that executes statements on its own is a second "user"
A user of the wrapper of the wrapper is a third "user".

It's important that all those "users" have different caches so that they can set arguments and reset statements without messing with the other users.

(8) By Roger Binns (rogerbinns) on 2025-02-05 16:23:39 in reply to 6.0 [link] [source]

You are assuming that calling sqlite3_prepare_v3 SQLITE_CAN_CACHE with identical text twice in a row would return the same statement.

That would not be the case and would be an implementation error.

The simplest algorithm is:

When SQLITE_CAN_CACHE is present, check the cache for a corresponding entry,
If not found, prepare as usual.
If one is found, remove it from the cache, and return it.
When sqlite3_finalize is called, and the statement had been prepared using SQLITE_CAN_CACHE and there is space in the cache, then put the statement back in the cache (after reseting and clearing bindings).

That algorithm would not have the problems you describe because identical text would result in distinct statements being returned. It would be indistinguishable from current behaviour without a cache, and would always be correct.

And it works - it is precisely what I have implemented in my Python wrapper.

(9.1) By Gwendal Roué (groue) on 2025-02-05 16:28:01 edited from 9.0 in reply to 8 [link] [source]

Hm, that's interesting. Thanks for the insight. Yes, that might work.

(EDIT: Sorry I could have been more enthusiastic: thank you very much for sharing your great idea!)

(10) By Gwendal Roué (groue) on 2025-02-05 16:29:17 in reply to 9.1 [link] [source]

Is it https://github.com/rogerbinns/apsw?

(11) By Roger Binns (rogerbinns) on 2025-02-05 17:02:25 in reply to 10 [link] [source]

Yes. You can find the statement cache code starting with a commentary about a prior dictionary based implementation.

The actual cache is a trivial data structure - just a list with realloc used to grow it. Looking for an entry is a linear scan, first looking for a hash match then length then finally a query text compare.

I also don't cache when the query returns a NULL statement, which happens for comments, but also some pragmas like what started this thread.

(12) By Gwendal Roué (groue) on 2025-02-05 17:18:21 in reply to 11 [link] [source]

Thank you very much! I'll mention you when I improve my own cache eventually :-)