SQLite Forum

cannot start a transaction within a transaction using sqlite shell
Login

cannot start a transaction within a transaction using sqlite shell

(1.1) By Brian Minton (bjmgeek) on 2021-09-22 19:34:50 edited from 1.0 [link] [source]

I wanted to test transaction concurrency using the sqlite shell. So, I created a simple database. Note that this is on debian so the sqlite shell is sqlite3.

sqlite3 test.db 'create table foo(bar);'

Then, I created a writer loop to make a bunch of tiny transactions:

for n in `seq 1000000`; do echo "begin transaction; insert into foo values($n);end;" ;done | sqlite3 test.db

Note that this is a bunch of transactions in a single session, using a single sqlite shell instance which presumably keeps the database attached.

Then, in another terminal, I tried to truncate the table. I expected that this should either fail, if the database is in a transaction, or succeed if the database is not in a transaction.

while ! sqlite3 test.db 'delete from foo;'; do true ; done

As expected, this spawns a bunch of sqlite3 processes sequentially, each one attempting to start a write transaction. Eventually one delete transaction succeeds, and the table is truncated; However, then in the other window, the rest of the statements fail with: Error: near line 999998: cannot start a transaction within a transaction

So, my question is this: How is there an error message about starting a transaction within a transaction? There's an explicit begin and end for each transaction, and on the other side, the delete operation with truncate optimization should be an implicit transaction. I also notice that if I do each insert in a separate sqlite3 process invocation like so:

for n in `seq 1000000`; do echo "begin transaction; insert into foo values($n);end;" | sqlite3 test.db ;done

Then I don't receive the error about transactions within transactions.

thanks, Brian Minton

(2) By Keith Medcalf (kmedcalf) on 2021-09-22 18:34:42 in reply to 1.0 [link] [source]

Have you checked the return code from each command?

(3) By Scott Robison (casaderobison) on 2021-09-22 18:46:08 in reply to 2 [link] [source]

Probably not since the queries are just being fed into the SQLite shell.

(4.1) By Brian Minton (bjmgeek) on 2021-09-22 19:36:17 edited from 4.0 in reply to 1.0 [link] [source]

By the way, this is for version 3.34.1 2021-01-20 14:10:07 10e20c0b43500cfb9bbc0eaa061c57514f715d87238f4d835880cd846b9ealt1.

(5) By Brian Minton (bjmgeek) on 2021-09-22 19:31:10 in reply to 1.0 [link] [source]

I also noticed that if I leave the database in the default journal mode of delete, I can do a read-only transaction and cause the same error:

while ! sqlite3 test.db 'begin;select count(*) from foo;end;'; do true ; done

which produces the following output: Error: database is locked Error: database is locked Error: database is locked Error: database is locked Error: database is locked Error: database is locked Error: database is locked Error: database is locked Error: database is locked Error: database is locked Error: database is locked Error: database is locked Error: database is locked Error: database is locked Error: database is locked Error: database is locked 36943

and the other window starts showing: Error: near line 871305: cannot start a transaction within a transaction Error: near line 871306: cannot start a transaction within a transaction Error: near line 871307: cannot start a transaction within a transaction Error: near line 871308: cannot start a transaction within a transaction Error: near line 871309: cannot start a transaction within a transaction Error: near line 871310: cannot start a transaction within a transaction Error: near line 871311: cannot start a transaction within a transaction Error: near line 871312: cannot start a transaction within a transaction Error: near line 871313: cannot start a transaction within a transaction Error: near line 871314: cannot start a transaction within a transaction Error: near line 871315: cannot start a transaction within a transaction Error: near line 871316: cannot start a transaction within a transaction Error: near line 871317: cannot start a transaction within a transaction

In WAL mode, read-only transactions work just fine, but read-write transactions still cause the error.

(6) By Richard Hipp (drh) on 2021-09-22 19:35:59 in reply to 1.0 [link] [source]

I'm guessing:

  1. First window does: BEGIN; INSERT INTO ...;

  2. Second window does DELETE FROM ...; with an implicit COMMIT.

  3. First window tries to do: END;. But because END is just an alternative spelling for COMMIT, it cannot, because the second window jumped in line ahead of it. So the COMMIT fails, leaving the transaction open.

  4. All subsequent BEGIN operations fail because there is already a transaction open.

Possible solutions:

  • Use "BEGIN IMMEDIATE" instead of "BEGIN TRANSACTION".

  • Put the BEGIN and the END on separate input lines. I think (but I'm not sure - I'll need to check the code) that the CLI works a line at a time. So if the BEGIN fails, the END never has a chance to run. Larry has been in that code more recently than me → perhaps he can answer off the top of his head.

(7) By Brian Minton (bjmgeek) on 2021-09-22 19:36:38 in reply to 2 [link] [source]

I have not checked the exit status, but it's all one sqlite process.

(8.2) By Larry Brasfield (larrybr) on 2021-09-23 11:17:04 edited from 8.1 in reply to 1.0 [link] [source]

(Edited to retract earlier agreement and amend earlier experiment report)

After looking into this, I no longer agree that your test should have either succeeded, without the complaint you posted, or should have done something else. (I would think it would have "failed" often with a flurry of "busy" messages. This I expect because, by default, there is no busy handler set by the shell. The shell indicates this with a message, "Error: near line 18575: database is locked". )

With sqlite3 v3.36.0 on Ubuntu, instead of the error you report I get Error: near line ###: disk I/O error , repeatedly, where the line number progresses.

I think this also should not happen. This will be investigated.

This does not happen on Ubuntu 20.04 running on a system by itself. It appears to be an artifact of running on Ubuntu WSL.

(Amendment via edit in light of Richard's post:)

With this in the DB stuffing process: for n in $(seq 1000000); do echo "begin immediate transaction; insert into foo values($n); commit;"; done | sqlite3 test.db , I continue to get the "disk I/O error" on WSL only.

I agree that letting SQLite defer actual transaction until the next, does-something statement, is problematic. I am investigating why, without such deferment (using "immediate") there is still an error other than SQLITE_BUSY.

(9) By Brian Minton (bjmgeek) on 2021-09-22 19:46:20 in reply to 6 [link] [source]

I agree that that is what is happening, but I don't think that's what's supposed to happen. My understanding was that transactions are distinct to an individual process. Also, note that even with the DELETE running in an explicit transaction, the same thing happens.

Looking at the documentation https://www.sqlite.org/lang_transaction.html I see the following:

SQLite supports multiple simultaneous read transactions coming from separate database connections, possibly in separate threads or processes, but only one simultaneous write transaction.

(10) By Keith Medcalf (kmedcalf) on 2021-09-22 20:46:38 in reply to 5 [link] [source]

This is because of a failure to comprehend what is happening, and a failure to check the return code of each statement within a batch.

You are sending "commands" of the form: BEGIN; <do something>; COMMIT;

This ENTIRE bufferload consisting of THREE STATEMENTS is sent to the CLI for execution. However, the execution had an ABORT at the second statement in the batch; thus, aborting the batch. The COMMIT; statement was never executed.

Thereafter, for each subsequent duplicate bufferload of three statements batched together, the first statement in the batch, BEGIN;, is executed when a transaction is in progress. An error message that you cannot start a transaction within a transaction is issued and the entire batch is ABORTED -- the remaining three statements in the batch are not executed.

Lather, Rinse, Repeat.

As for the difference with DELETE journal vs WAL probably is the result of an attempt to upgrade a stale snapshot.

(11) By Larry Brasfield (larrybr) on 2021-09-23 01:37:01 in reply to 10 [link] [source]

This ENTIRE bufferload consisting of THREE STATEMENTS is sent to the CLI for execution. However, the execution had an ABORT at the second statement in the batch; thus, aborting the batch. The COMMIT; statement was never executed.

I confirm this is how the shell handles "grouped" SQL statements.a By "grouped", I mean when each SQL input line either contains one or more semicolon-separated statements terminated by a semicolon, or ends with an open construct, such as a quoted string or identifier or a block comment which does not have its terminating token such that one or more additional lines are collected into the "group" before it is submitted to the library for execution (prepare(), step(), etc.) For example, this would be a group: INSERT INTO Fiddle VALUES (1) /* and more! */ ,(2) ('and more'); SELECT * FROM Fiddle;

The bottom line is just as Keith says: Failure on one statement within the group precludes execution of latter statements within the group.


a. I confirm that general sense of the processing. But I would not say anything "is sent to the CLI for execution" except in the sense that the CLI accepts whole lines at a time. Whether or not lines are given over to the SQLite library for execution is determined by the CLI's own logic.

(12) By Larry Brasfield (larrybr) on 2021-09-23 11:00:12 in reply to 9 [source]

I agree that that is what is happening, but I don't think that's what's supposed to happen.

Supposing is a funny thing, kind of like beauty being in the eye of the beholder. I suppose, (and I'm pretty sure Richard supposes), that what you are seeing is exactly what has been designed to happen. I believe your supposition needs to adapt to what Richard has said is likely happening, which is occurring in detail for the reason Keith says (and I confirm.)

My understanding was that transactions are distinct to an individual process.

Processes have nothing to do with it, except as something which affects implementation of the documented behavior (which is what you are seeing.) SQLite "connections" (aka pointers to opaque sqlite3 objects) are what map one-to-(zero or one) to transactions.

Also, note that even with the DELETE running in an explicit transaction, the same thing happens.

Whether the other connection interfering with your insert sequence is working under explicit or automatic implicit transaction has no bearing on the problem. The same thing is happening with that connection's access to the DB either way.

Looking at the documentation [about transaction concurrency]

This has no bearing on the problem you see. The problem here is relatively simple, with the main complication arising from how the shell processes statement groups.

(13) By Keith Medcalf (kmedcalf) on 2021-09-23 21:59:43 in reply to 8.2 [link] [source]

The root cause is likely that WSL does not work correctly and is insufficient for reliable use and should only be used for "entertainment purposes".

The Great Unwashed seem to commonly believe that these "entertainment quality only" systems are designed for "general purpose production use" when this has been shown, repeatedly, not to be the case.

I do not think there is anything that can be done other than to advise patients "Do not do that then".

(14) By Larry Brasfield (larrybr) on 2021-09-23 23:05:38 in reply to 13 [link] [source]

The root cause is likely that WSL does not work correctly and is insufficient for reliable use and should only be used for "entertainment purposes".

I have found it to be convenient and reliable for most of my day-to-day tasks that a pure Linux system would support. It can build SQLite and its docs, and run most of SQLite's TCL test suite with results identical to those on other platforms. (Tests doing full precision comparison of floating point values reliably fail, however.)

The Great Unwashed seem to commonly believe that these "entertainment quality only" systems are designed for ...

As a member of the occasionally washed, I have come to believe that WSL is good for more than entertainment but that there are limitations which make it worthwhile to keep my Linux and other Unix-like boxes ready for use.

I do not think there is anything that can be done other than to advise patients "Do not do that then".

Hmmmm. Applying some imagination, I can come up alternatives -- most relating to appropriately wary expectation. I admit to slight surprise when, after I encountered the apparent WSL file locking flaw whose effects are described above, my suspicion of it was so soon confirmed on a genuine Linux system. I had been led to believe that WSL Linux-like file system semantics were implemented more completely than they really are. In that respect, the fact that the Linux system calls which are functional on Linux just pretend to work on WSL puts it into the "toy" category where multi-process file access is concerned.

Henceforth, I will be limiting my use of WSL to situations where processes interact only through pipes, starting each other, and non-concurrent file access. It's still a useful tool, but one to be used with care by those not fond of blaming their tools for poor results.

(15) By Warren Young (wyoung) on 2021-09-24 03:05:37 in reply to 14 [link] [source]

WSL2 works hugely differently from WSL1, correcting a large number of inherent flaws in WSL1. WSL2 still isn’t perfect, but being a lightweight Linux VM rather than an NT “personality,” I would class WSL2 as useful for some production tasks.

Can we please be clear about which version we’re talking about?

(16) By Larry Brasfield (larrybr) on 2021-09-24 12:06:30 in reply to 15 [link] [source]

Sorry, I've been using WSL2 since it was released. I'll have to check to see which version's behavior is set to be used. (I may have messed with it as I have a very non-default setup.)

As I said, it (whatever it is) is useful provided it limitations are acceptable.

(17) By Warren Young (wyoung) on 2021-09-24 12:35:58 in reply to 16 [link] [source]

You can ask wsl --status which version is the default on your system.

WSL1 is still in use on many systems from legacy installations, but also because it works inside VM systems that don't support double-virtualization, as with ARM Windows on Apple M1.

One of many differences between the two is that WSL1 uses a POSIX gloss on NTFS as the filesystem, whereas WSL2 uses an actual Linux kernel with regular Linux filesystems. This naturally has a whole laundry list of implications for SQLite's locking and file I/O semantics.

(18) By Larry Brasfield (larrybr) on 2021-09-25 15:09:17 in reply to 17 [link] [source]

Thanks for the tips, Warren. I will follow up on this in about 4 weeks when I get back to that workstation to do the implied experiments.

WSL2 uses an actual Linux kernel with regular Linux filesystems.

That's why I was surprised at the failure, in part. The other part is that, at times, it has passed the whole suite of SQLite's TCL tests except for ones doing full precision FP value comparisons. This makes me think I must have switched that system back to WSL1 for some reason and forgotten to switch it back.

(19) By Brian Minton (bjmgeek) on 2021-09-29 19:10:16 in reply to 11 [link] [source]

I've tried this with multi-line statements as well:

window 1:
for n in `seq 1000000`; do echo -e "begin transaction;\ninsert into foo values($n);\nend;" ;done|sqlite3 test.db 

Window 2:
while ! sqlite3 test.db 'begin;select min(bar) from foo;end;'; do true ; done

The result is 

Error: database is locked
Error: database is locked
Error: database is locked
Error: database is locked
Error: database is locked
Error: database is locked
Error: database is locked
Error: database is locked
[...]
1

on the select side,
and on the insert side:
or n in `seq 1000000`; do echo -e "begin transaction;\ninsert into foo values($n);\nend;" ;done|sqlite3 test.db 

Error: near line 2640: database is locked
Error: near line 2641: cannot start a transaction within a transaction
Error: near line 2871: database is locked
Error: near line 2872: cannot start a transaction within a transaction
Error: near line 2874: database is locked
Error: near line 2875: cannot start a transaction within a transaction
Error: near line 2937: database is locked
Error: near line 2938: cannot start a transaction within a transaction
Error: near line 2940: database is locked
Error: near line 2941: cannot start a transaction within a transaction
Error: near line 2964: database is locked
Error: near line 2965: cannot start a transaction within a transaction
Error: near line 3114: database is locked
[...]

Even after the select transaction finishes, all further insert transactions fail.

(20) By Brian Minton (bjmgeek) on 2021-09-29 19:22:34 in reply to 12 [link] [source]

You are correct that I should have been talking about sqlite connection objects, not processes. However, in this specific case, I think there is a one-to-one mapping between the two, since each invocation of the sqlite shell is associated with a single connection object.