SQLite Forum

Timeline
Login

4 forum posts by user james.oldfield

2021-01-05
14:19 Edit reply: Statements outside of explicit transactions in SQLite (artifact: 1695c80438 user: james.oldfield)

Thanks for another great reply.

Yes, the Python sqlite3 wrapper sometimes fiddles with transactions automatically. I actually started this whole process by innocently wondering how to start and stop transactions and ended up much further down the rabbit hole than I intended! At some point I hope to document my findings in a StackOverflow answer which, no doubt, no one will ever read :-) I'll include a note about closing cursors for incompletely exhausted SELECT statements.

As a result, I've spent a lot of time reading the source code to sqlite3 and pysqlite recently. Sadly, I can tell you that the logic for when to automatically start a transaction in sqlite3 does still depend on textual matching: if the statement starts with INSERT, UPDATE, DELETE or REPLACE (intended to catch DML statements) then a transaction is started just before executing the statement itself. This comparison skips over whitespace but not comments (e.g. "/*foo*/INSERT ..." is not considered to start with INSERT). This only happens if isolation_mode is not None, as you said, and only if a transaction is not already started (as determined with sqlite3_get_autocommit(), so it will notice transactions started by the developer manually executing BEGIN).

It's true that there was a move in the past to away from statement parsing. Here's how the situation got (back) to this point:

In the past (before Python 3.6 and before pysqlite 2.8.0), some statements would cause an implicit COMMIT beforehand (if a transaction was open, as determined with !sqlite3_get_autocommit()). Again, this only happened if isolation_mode is not None. The rule was: if a statement started with INSERT, UPDATE, DELETE or REPLACE (intended to catch DML), BEGIN a transaction; if it started with SELECT (i.e. DQL), do nothing; otherwise (this case is meant to catch DDL like CREATE TABLE), COMMIT a transaction.

Then pysqlite (commit 94eae50) got rid of all the above logic, and replaced it with simply: if !sqlite3_stmt_readonly() then start a transaction. That got rid of all the statement parsing, and the removal of the implicit transaction COMMIT meant you could include DDL like CREATE TABLE in a transaction implicitly started by pysqlite. But this exact version of the code never made it into a release.

The next day, the logic was changed (commit 796b3af) to if !sqlite3_stmt_readonly() && !is_ddl then start a transaction, where is_ddl was defined as a statement starting with CREATE, DROP or INDEX, so there was still text matching in there. This was done to improve backwards compatibility with existing code. This version still never implicitly committed transactions. This was released as pysqlite 2.8.0. That is still how it stands to this day, and is unlikely to change given that its readme says "This project is not actively maintained any more: You are better off using the sqlite3 module in the Python standard library".

This was initially merged into Python 3.6 beta, but issue 28518 was raised about backwards compatibility: conn.execute("BEGIN IMMEDIATE"), which had previously worked, would raise an exception "cannot start a transaction within a transaction". That's because this is not a read only statement (as explained in a comment by D. Richard Hipp), so a transaction was now being started automatically just before it was executed. Such statements are better suited to isolation_mode = None, but the issued showed the change would've broken existing code, so before Python 3.6 was released the code was changed to the current logic. Compared to Python 3.5, that meant that only the "otherwise" branch had really changed, and the release notes simply said "sqlite3 no longer implicitly commits an open transaction before DDL statements".

Funnily enough, the DB API spec - which as you said is the motivation for all this funkiness - says that a transaction should be opened before any statement at all, even a SELECT. This isn't clear from the document itself but I came across an archived email from the PEP author that says this is what they intended (but I can't find the link at the moment). Given that sqlite3 doesn't satisfy the spec, all this automagic behaviour seems like wasted effort!

12:35 Reply: Statements outside of explicit transactions in SQLite (artifact: 8c2f6aebde user: james.oldfield)

Thanks for another great reply.

Yes, the Python sqlite3 wrapper sometimes fiddles with transactions automatically. I actually started this whole process by innocently wondering how to start and stop transactions and ended up much further down the rabbit hole than I intended! At some point I hope to document my findings in a StackOverflow answer which, no doubt, no one will ever read :-) I'll include a note about closing cursors for incompletely exhausted SELECT statements.

As a result, I've spent a lot of time reading the source code to sqlite3 and pysqlite recently. Sadly, I can tell you that the logic for when to automatically start a transaction in sqlite3 does still depend on textual matching: if the statement starts with INSERT, UPDATE, DELETE or REPLACE (intended to catch DML statements) then a transaction is started just before executing the statement itself. This comparison skips over whitespace but not comments (e.g. "/*foo*/INSERT ..." is not considered to start with INSERT). This only happens if isolation_mode is not None, as you said, and only if a transaction is not already started (as determined with sqlite3_get_autocommit(), so it will notice transactions started by the developer manually executing BEGIN).

It's true that there was a move in the past to away from statement parsing. Here's how the situation got (back) to this point:

Side note: At one point in the past, some statements would cause an implicit COMMIT beforehand (if a transaction was open, as determined with !sqlite3_get_autocommit()). Again, this only happened if isolation_mode is not None. The rule was: if a statement started with INSERT, UPDATE, DELETE or REPLACE (intended to catch DML), BEGIN a transaction; if it started with SELECT (i.e. DQL), do nothing; otherwise (this case is meant to catch DDL like CREATE TABLE), COMMIT a transaction. In Python's sqlite3 module, this was true before Python 3.6; in pysqlite, it was true before 2.8.0.

At one point pysqlite (commit 94eae50) got rid of all the above logic, and replaced it with simply: if !sqlite3_stmt_readonly() then start a transaction. That got rid of all the statement parsing, and the removal of the implicit transaction COMMIT meant you could include DDL like CREATE TABLE in a transaction implicitly started by pysqlite. But this exact version of the code never made it into a release.

The next day, the logic was changed (commit 796b3af) to if !sqlite3_stmt_readonly() && !is_ddl then start a transaction, where is_ddl was defined as a statement starting with CREATE, DROP or INDEX, so there was still text matching in there. This was done to improve backwards compatibility with existing code. This version still never implicitly committed transactions. This was released as pysqlite 2.8.0. That is still how it stands to this day, and is unlikely to change given that its readme says "This project is not actively maintained any more: You are better off using the sqlite3 module in the Python standard library".

This was initially merged into Python 3.6 beta, but issue 28518 was raised about backwards compatibility: conn.execute("BEGIN IMMEDIATE"), which had previously worked, would raise an exception "cannot start a transaction within a transaction". That's because this is not a read only statement (as explained in a comment by D. Richard Hipp), so a transaction was now being started automatically just before it was executed. Such statements are better suited to isolation_mode = None, but the issued showed the change would've broken existing code, so before Python 3.6 was released the code was changed to the current logic. Compared to Python 3.5, that meant that only the "otherwise" branch had really changed, and the release notes simply said "sqlite3 no longer implicitly commits an open transaction before DDL statements".

Funnily enough, the DB API spec - which as you said is the motivation for all this funkiness - says that a transaction should be opened before any statement at all, even a SELECT. This isn't clear from teh document itself but I came across an archived email from the PEP author that says this is what they intend it (but I can't find the link at the moment). Given that sqlite3 doesn't satisfy the spec, all this automagic behaviour seems like wasted effort!

2021-01-04
22:21 Reply: Statements outside of explicit transactions in SQLite (artifact: 93f71d274f user: james.oldfield)

Many thanks for this very comprehensive reply. It's much appreciated, especially since I can see there's a continuous deluge of questions here.

Out of curiosity, I made a little of test of what you said, and verified it's correct (although I never doubted it): On one connection I executed BEGIN, then SELECT and started iterating over the results, then COMMIT and continued iterating over the earlier cursor. On another connection I repeatedly tried executing INSERT statements. This was in journaling mode so the SELECT held up the INSERTs as expected. Sure enough, as you said, this did not release on the COMMIT, and instead the INSERT only progressed once the SELECT was exhausted.

I have to admit, I find this very surprising. I'd expect COMMIT (or ROLLBACK) to release all locks, and an attempt to continue using an open statement from before to result in an error, something like "transaction for that statement has now been closed". Now I know, I'll be more careful about resetting incompletely iterated statements! (I'm actually using the Python sqlite3 wrapper module, so this translates to calling .close() on the Cursor object in that API.)

I have a follow up question: Are there are any statements that write to the database but return multiple rows? I just ask because, if so, that would mean that an open cursor to such a statement would cause the write lock to be retained even after a COMMIT. Assuming that's not the case, we can always be sure a successful COMMIT will at least store all data in the database, even if it potentially doesn't release a read lock.

2021-01-03
18:00 Post: Statements outside of explicit transactions in SQLite (artifact: b054807bf1 user: james.oldfield)

There's a lot of waffle below, but I suppose you could summarise it all as: What happens to statements outside of explicit transactions? The documentation seems a bit unclear on this (although it's usually excellent by the way, many thanks for that).

Autocommit mode vs implicit transactions: Oddly, the section on implicit transactions doesn't mention autocommit mode. Autocommit mode only seems to be mentioned in passing on the sqlite3_get_autocommit(), which conversely doesn't mention implicit transactions! But both talk about statements not executed within an explicit transaction (BEGIN and COMMIT/ROLLBACK), so those are related... right? Am I right in thinking that an implicit transactions are only ever started when in autocommit mode, and conversely any statement executed in autocommit mode (except BEGIN ...) will start an implicit transaction?

Implicit transactions with multiple statements: This is my key confusion, where the documentation doesn't seem to match reality. The section on implicit transactions, linked above, says:

An implicit transaction ... is committed automatically when the last active statement finishes. A statement finishes when its last cursor closes ...

So I tried a test: In one thread, I ran a SELECT statement, but in between retrieving rows (2 at a time) I INSERTed some more rows (1 at a time) and did deliberate pauses. In another thread, using a different connection, I repeatedly did a SELECT COUNT(*). From what I understand of that quote, the SELECT and INSERTs in the first thread should have consisted of a single implicit transaction, because the "last active statement" on that connection had not yet finished (I hadn't reset the SELECT, and indeed continued to retrieve more rows through it afterwards). But, instead, the COUNT(*) kept increasing on the other thread, showing that the INSERT had committed immediately.

Is that quoted text correct? Does it mean something different than I interpreted it?

(Just to be clear: I'm actually very happy that statements not in a transaction are committed immediately, even when a cursor for a SELECT statement is also open. It's just that I'm confused about what the documentation is referring to.)

Can an implicit transaction hold up an explicit transaction being committed: The document File Locking And Concurrency In SQLite Version 3 mentions:

The SQL command "COMMIT" does not actually commit the changes to disk. It just turns autocommit back on. Then, at the conclusion of the command, the regular autocommit logic takes over and causes the actual commit to disk to occur.

But according to that earlier quotation, an implicit transaction won't be committed if a cursor is still open. Does this mean that even an explicit COMMIT statement for an explicit transaction might not actually commit data right away, if a cursor is still open from a statement? That would be very surprising! It seems not, given my previous experiment, but I just wanted to confirm this.