Draft Doc on SQLite SAVEPOINT, and its internal implementation

(1) By Dan Shearer (danshearer) on 2020-09-29 14:27:00 [link] [source]

Savepoints in SQLite

This contains preliminary analysis and some surprising (but consistent, useful and mostly-documented) behaviour of SQLite relating to savepoints. When this document is more comprehensive and accurate it can be added to the tree somewhere. I have been working through the transaction code in SQLite with Uilbeheist, and all contributions welcomed.

SAVEPOINT statement in SQLite vs the standard

The SQL standard defines savepoints as named sub-transactions. sqlite3 extends both SQL standard transactions and savepoints to be a superset of both. An SQLite BEGIN/COMMIT transaction is a special un-named case of a savepoint, and a named saveppoint outside a BEGIN/COMMIT has an implied BEGIN. A savepoint cannot ever be followed by a BEGIN because there can only be one open main transaction at once, and a BEGIN always marks the start of a main transaction.

A savepoint is most often used as a subtransaction.

This above context helps understand the SQLite SAVEPOINT documentation which says:

SAVEPOINTs are a method of creating transactions, similar to BEGIN and COMMIT, except that the SAVEPOINT and RELEASE commands are named and may be nested.

Other implementations of SQL stick to the standard definition, with MariaDB, Postgresql, Microsoft SQL Server and Oracle Server seeming to be more or less identical.

MariaDB can seem as if it behaves like SQLite, but that is only due to it being silent rather than throwing an error when a savepoint is used outside BEGIN/COMMIT. From the MariaDB documentation: "if SAVEPOINT is issued and no transaction was started, no error is reported but no savepoint is created". In fact MariaDB behaves like the strict standard and the other implementations.

Savepoints in SQLite Code

Internal terminology: Where savepoints are not used within a standard transaction, source code comments call it a "transaction savepoint". Similarly an internal name for a standard BEGIN/COMMIT transaction is "anonymous savepoint" while a "non-transaction savepoint" is the usual kind that follows a BEGIN.

vdbe.c maintains the struct Savepoint declared in sqliteInt.h, while pager.c maintains an array of struct PagerSavepoint. These parallel structures all come down to the same objects on disk.

vdbe.c

The opcode OP_Savepoint is the only relevant code in vdbe, which has some savepoint logic and calls btree.c/sqlite3BtreeSavepoint(). vdbe deals with the savepoints names and assigns each a sequence number.

btree.c

btree.c implments sqlite3BtreeSavepoint() which uses sqlite3PagerOpenSavepoint() to do the work. There is not much savepoint logic in btree.c however it is btree.c that implements transactions and subtransactions. (Subtransactions map onto subjournals but btree.c doesn't know anything about them.)

pager.c

Savepoint logic is mostly implemented in pager.c, by manipulating the objects in the Pager.aSavepoint[] array . pager.c has the complete implementation of sub-journals, which are maintained to match savepoint nesting. pager.c does not know about savepoint names, only the sequence numbers vdbe.c assigned. It is pager code that does the actual rollback to the correct savepoint, no other code is involved in this.

Note: Savepoint code in pager.c seems to be quite intertwined with journal states, but very little difference between using WALs or not. pagerUseWal() and the aWalData[] array seem hardly used suggesting that savepoint implications for WAL mode are little different from the others, which seems a little surprising.

(2) By Wout Mertens (wmertens) on 2020-09-30 10:14:39 in reply to 1 [link] [source]

Upon reading this, I'm missing an explicit pointing out of the surprising difference that is promised in the beginning, and the standard definition that other database engines are implementing instead.

For the rest, looks good to me!

(3) By Dan Shearer (danshearer) on 2020-10-01 06:48:58 in reply to 2 [source]

Wout Mertens (wmertens) on 2020-09-30 10:14:39:

Upon reading this, I'm missing an explicit pointing out of the surprising difference that is promised in the beginning, and the standard definition that other database engines are implementing instead.

I found it cool that SQLite users can decide never to use BEGIN/COMMIT|END again, unless they want IMMEDIATE (==EXCLUSIVE in WAL mode). I suppose if I had been a SAVEPOINT user for years I would have been less surprised, but my introduction to SAVEPOINT was reading the code in src/pager.c .

By allowing all transactions to be (a) named and (b) nestable without changing syntax depending on context, then SQLite is enabling an improvement in SQL code that does not exist elsewhere and I think that is good.

MS SQL Server does allow BEGIN/END to be nested (as SQLite easily could, but chooses not to). All other databases I tried or whose documentation I read implement nested transactions according to the SQL standard, which means only by means of SAVEPOINT within a BEGIN block. That means the SQL developer has a context-dependent change in syntax, and cannot name the outer transaction.

Dan Shearer

(4) By Wout Mertens (wmertens) on 2020-10-01 15:48:49 in reply to 3 [link] [source]

Ah yes, that is nice! I wasn't even aware of it, but I also use BEGIN IMMEDIATE mostly.