facing issue from sqlite3_step()

(1) By Prajeesh (prajeeshprakashp) on 2020-07-16 02:40:58 [link] [source]

Hi Team, I am facing an issue from sqlite3_step(). Issue is that there are 3 records are present in my database table 'A'. On thread is trying to read all the record from the table 'A' (I am using sqlite3_prepare_v2 and sqlite3_step function for this) and from second thread I am updating the fetched information by thread 1 (ie, One thread is reading the data another thread is writing the data which is read by thread 1. For writing I am using INSERT OR REPLACE INTO query). I am using sqlite3_step function inside a loop to read all the three records from table A. eg. while(SQLITE_ROW == sqlite3_step(*statement) { // reading the information using sqlite3_column_xx }
If I repeatably do this read and update, For the first few loop everything is working fine I received all the records correctly. But at some point sqlite3_step is going for extra iteration. That means since there are 3 records present in DB sqlite3_step loop(above) should iterate only 3 times. But here it is going for 4th iteration and giving the same record which is already read. So I am expecting SQLITE_DONE but I am getting SQLITE_ROW after 3rd Iteration. My sqlite3 is in thread safe mode.

Thank you Prajeesh

(2) By Igor Tandetnik (itandetnik) on 2020-07-16 03:46:13 in reply to 1 [source]

From https://sqlite.org/isolation.html , "No Isolation Between Operations On The Same Database Connection" section:

Within a single database connection X, a SELECT statement always sees all changes to the database that are completed prior to the start of the SELECT statement, whether committed or uncommitted. And the SELECT statement obviously does not see any changes that occur after the SELECT statement completes. But what about changes that occur while the SELECT statement is running? What if a SELECT statement is started and the sqlite3_step() interface steps through roughly half of its output, then some UPDATE statements are run by the application that modify the table that the SELECT statement is reading, then more calls to sqlite3_step() are made to finish out the SELECT statement? Will the later steps of the SELECT statement see the changes made by the UPDATE or not? The answer is that this behavior is undefined. In particular, whether or not the SELECT statement sees the concurrent changes depends on which release of SQLite is running, the schema of the database file, whether or not ANALYZE has been run, and the details of the query. In some cases, it might depend on the content of the database file, too. There is no good way to know whether or not a SELECT statement will see changes that were made to the database by the same database connection after the SELECT statement was started. And hence, developers should diligently avoid writing applications that make assumptions about what will occur in that circumstance.

If an application issues a SELECT statement on a single table like "SELECT rowid, * FROM table WHERE ..." and starts stepping through the output of that statement using sqlite3_step() and examining each row, then it is safe for the application to delete the current row or any prior row using "DELETE FROM table WHERE rowid=?". It is also safe (in the sense that it will not harm the database) for the application to delete a row that expected to appear later in the query but has not appeared yet. If a future row is deleted, however, it might happen that the row turns up after a subsequent sqlite3_step(), even after it has allegedly been deleted. Or it might not. That behavior is undefined. The application can also INSERT new rows into the table while the SELECT statement is running, but whether or not the new rows appear in subsequent sqlite3_step()s of the query is undefined. And the application can UPDATE the current row or any prior row, though doing so might cause that row to reappear in a subsequent sqlite3_step(). As long as the application is prepared to deal with these ambiguities, the operations themselves are safe and will not harm the database file.

Emphasis mine. Note that INSERT OR REPLACE is equivalent to deleting the rows that violate the constraint, followed by inserting the new row. The SELECT statement might have already observed the row that is being deleted, and later will observe the row that is being inserted; this way it may appear that the table has extra rows.

(3) By Keith Medcalf (kmedcalf) on 2020-07-16 04:29:36 in reply to 1 [link] [source]

There is no isolation between threads, only between connections. So if you transmute the database then that transmutation is visible to other "statements" executing on the same connection.

The results of transforming a table on the same connection as the one you are using to read that very same table is undefined, so whatever happens is the expected and intended result.

If you do not want the update(s) to affect the select(s), you must do those operations on separate connections.

If you are using non-WAL journalling you will also have to contend with ensuring that the updates will (a) fit in the cache and (b) not committed until the select is finished, otherwise you will get an "error" that the database is busy when attempting to commit the changes.

(4) By Prajeesh (prajeeshprakashp) on 2020-07-20 07:27:14 in reply to 3 [link] [source]

Thank you Keith Medcalf. In my current design, I do read from in-memory DB and write to in-memory and disk (Disk write is happening sequentially also I am not performing any Disk read operation. On system startup, I am restoring the data from Disk to in-memory).

I tried using shared cache mode for in-memory db and creating two connections to it. But the issue is if I try to read from a table from one thread, while I am writing into it from another thread, I get a sqlite busy error.

So, is there any way to handle this scenario with multiple connections to in-memory db?

Thank you

(5) By ddevienne on 2020-07-20 07:53:09 in reply to 4 [link] [source]

is there any way to handle this scenario with multiple connections to in-memory db?

I'm afraid not. Multiple connections to in-memory DB requires shared-cache.
Which works well for parallel read-only scenarios only, no writes.

WAL is not supported for in-memory DB, and likely won't be.
So cannot mix reads and writes concurrent access to for in-memory DBs...

If you write a memory-backed VFL, I guess one could achieve WAL
and thus MVCC for an in-memory DB, except SQLite wouldn't know it's
in-memory, it would appear as an in-disk DB to it.

Of use some kind of memory-backed FS, like a RAM disk or something.
That way no need for a complex VFS at all. --DD

(6) By Keith Medcalf (kmedcalf) on 2020-07-20 23:47:49 in reply to 4 [link] [source]

Not really. About the best you can do is to put the db in a RAMFS. For example, on Linux put the db in /dev/shm

(7) By Gunter Hick (gunter_hick) on 2020-07-22 06:09:18 in reply to 4 [link] [source]

The problem of modifying the data source as you read it is quite common. There are two solutions:

a) deal with the side effect that altering a record my change its position in the visitation sequence, so that any given record may be retrieved 0 to infinity times

b) separate the read and update operations.

For option b) you can

1) formulate what you are attempting to do as an UPDATE query. This will visit each record exactly once and perform the specified changes, while observing any constraints you have defined

or

2) within a single transaction on a single connection, read all of the rows, taking note of any changes required, and then, after the last row was read, write back any necessary changes, handling any constraint errors as they occur

(8) By J-L Hainaut (JLHainaut) on 2020-07-23 13:57:04 in reply to 7 [link] [source]

It is so common that it was given the name "Halloween problem" in 1974 (https://en.wikipedia.org/wiki/Halloween_Problem). Not because it is particularly scary (though it may be so) but because it was identified on Halloween day!