SQLite Forum

set/get last_insert_rowid - To mutex or not mutex?
Login

set/get last_insert_rowid - To mutex or not mutex?

(1) By anonymous on 2020-10-20 14:40:10 [source]

Hi. I'm doing an UPSERT, and to figure out (after the fact) whether it did an INSERT or an UPDATE
I decided to force a sqlite3_set_last_insert_rowid(db, -1) so I can be sure (?)
sqlite3_last_insert_rowid returns a positive ROWID on INSERTs only.

So first, any holes in the above scheme?

But what prompted this new thread was the fact the set is mutex-protected, while the get is not.
I'm no thread-safety expert, but both reads and writes must be mutex-protected to avoid data-races, no?

So am I missing something, or there's a weird mismatch here?
Either neither is mutexed; or both are mutexed. But a mix like this feels wrong.

Thoughts?

*
** Return the ROWID of the most recent insert
*/
SQLITE_API sqlite_int64 sqlite3_last_insert_rowid(sqlite3 *db){
#ifdef SQLITE_ENABLE_API_ARMOR
  if( !sqlite3SafetyCheckOk(db) ){
    (void)SQLITE_MISUSE_BKPT;
    return 0;
  }
#endif
  return db->lastRowid;
}

/*
** Set the value returned by the sqlite3_last_insert_rowid() API function.
*/
SQLITE_API void sqlite3_set_last_insert_rowid(sqlite3 *db, sqlite3_int64 iRowid){
#ifdef SQLITE_ENABLE_API_ARMOR
  if( !sqlite3SafetyCheckOk(db) ){
    (void)SQLITE_MISUSE_BKPT;
    return;
  }
#endif
  sqlite3_mutex_enter(db->mutex);
  db->lastRowid = iRowid;
  sqlite3_mutex_leave(db->mutex);
}

(2) By Gunter Hick (gunter_hick) on 2020-10-20 14:57:00 in reply to 1 [link] [source]

The "last inserted rowid" is only valid in "single thread per connection" environments. No mutex is required to read memory written by the same thread.

The function documentation specifically states that multithread interference may cause sqlite3_last_insert_rowid() to return garbage if another thread is performing an INSERT on the same connection while the function is running.

(3) By anonymous on 2020-10-20 15:10:26 in reply to 2 [link] [source]

OK, thanks Gunter.

So I guess it is a case of a documented data-race then.

Still, it doesn't make much sense to me. If you are not multi-threaded,
then taking the mutex on writes is not necessary.

And if you are multi-threaded, not taking the mutex on read guarantees
garbage in db->lastRowid in general. So might as well not take it on
write either. lastRowid would need to be an atomic for that to be correct.

PS: I'm in a multi-threaded environment, but do enforce (as far as I know!)
the single-thread-per-connection myself.

(4) By Gunter Hick (gunter_hick) on 2020-10-21 06:24:37 in reply to 3 [link] [source]

The memory barrier issued when releasing the mutex will make sure that the
 (component) writes already executed by (and visible to) the CPU running the thread will be visible to any and all other CPUs. This will cause the subsequent read to be consistent, even if the thread is switched over to a different CPU before reading back the value.

If a different thread uses the connection, it may clobber the lastRowid variable anyway when it issues an INSERT (either directly or as the result of a virtual table implementation or a trigger program), so this situation is already a programming error (failing to retain control of the connection between setting and getting the lastRowid variable).

It just so happens that IF the underlying memory architecture supports less than 64bit atomic writes AND the (illegal) second thread is changing the lastRowid EXACTLY WHEN the first thread is retrieving its value, it is not predictable if the value retrieveid will be the old (first thread) value, the new (second thread) value or any mix of components because the threads are concurrently executing on two different CPUs.

The value returned to the first thread is already unreliable by virtue of having lost control of the connection, and may refer to an arbitrary table, so not returning a valid rowid sometimes is not a big deal IMHO

(5) By anonymous on 2020-10-21 07:07:45 in reply to 4 [link] [source]

That's getting into memory-model guarantees, and platform-specific issues.
C++ didn't have an official memory-model until C++11, and I'm not even sure
when one was standardized for C, when SQLite requires only C89.

But I hear what you are saying. On modern HW with 64-bit atomic writes,
the un-mutexed read should be visible thanks to the memory barrier, and
give you a complete lastRowid. Maybe not the one you want, i.e. not the
one from the table you expect (or the insert you expect), in case of MT
access that violates the SQLite documented usage constraints.

Quite subtle overall, I'd say.