SQLite Forum

WAL journal and threading mode
Login

WAL journal and threading mode

(1) By PazO (JohnSmith) on 2020-11-09 07:10:40 [link] [source]

In my application there is an intensive usage of the database from a multi-threading environment.
Until now my sqlite configuration was: journal-mode=DELETE (DEFAULT), synchronous=FULL (DEFAULT), and threading-mode MULTITHREAD(2), where each thread is using its own connection.

I created a stress unit-test application that WRITEs from many threads, each thread with its own connection. While testing performance I found that when journal-mode=WAL and synchronous=NORMAL the performance goes much higher but also the CPU goes high (up to 90%). I will emphasize here that I implemented sqlite3_busy_handler that always returns true, but every 100 retries sleeps for 1 ms.

My question is about the combination of WAL journal and threading mode:
Assuming that WAL journal takes care of locks over the database (?) do I need also to manage synchronization using threading mode?
What will happen if I use WAL|NORMAL with zero multi-threading protection: SQLITE_CONFIG_SINGLETHREAD (SQLITE_THREADSAFE=0)?

Many thanks for this clarification.

(2) By Gunter Hick (gunter_hick) on 2020-11-09 10:11:22 in reply to 1 [link] [source]

Journal mode and threading mode are orthogonal concepts. You still need exactly the same thread protection as before.

WAL journal mode is about multiple READERS and a SINGLE writer; multiple writers still have to take turns.

The performance gain is mostly from changing the synch mode to normal, which is not recommended for journal mode DELETE/TRUNCATE/PERSIST, but consistent with changing to WAL mode.

Did you profile your application to see where the CPU is spending so much time? My guess is that most of your threads are busy-wating in a tight loop for the write lock on the WAL journal most of the time. Maybe you would like to consider synchronizing your writer threads with a semaphore that blocks threads that want to write before harassing SQLite.

(3) By PazO (JohnSmith) on 2020-11-09 12:21:04 in reply to 2 [link] [source]

Yes, your guess is write: my unit-test busy CPU-time is while my sqlite3_busy_handler callback is being called million of times due to the server busy condition.
This is when sqlite is being attacked by other worker-threads of my unit-test.

The interesting thing that I did find is that when I configure sqlite to TRUNCATE-NORMAL the CPU usage of my app is around 30% while WAL-NORMAL brings it up to 99%.
I should mention that measuring performance the WAL performance is at least twice as fast as TRUNCATE.

(4) By Gunter Hick (gunter_hick) on 2020-11-09 13:16:00 in reply to 3 [link] [source]

You should at least sleep a little bit in the callback to give the writer thread a chance at actually doing something other than fighting for control of a cpu.

The SQLite standard busy handler sleeps progressively longer the more often it gets called for a specific operation until the maximum retry timeout is reached.

(5) By PazO (JohnSmith) on 2020-11-10 04:17:21 in reply to 4 [link] [source]

How do I setup sqlite to call the SQLite standard busy handler in these conditions?

I found that if I do not register my own sqlite_busy_handler callback then all WRITE operations will fail immediately with the server busy error...

(this is in my unit-test application that works in MULTITHREADING(2) mode, with one connection per-thread)

(6) By Keith Medcalf (kmedcalf) on 2020-11-10 06:47:50 in reply to 1 [source]

My question is about the combination of WAL journal and threading mode: Assuming that WAL journal takes care of locks over the database (?) do I need also to manage synchronization using threading mode? What will happen if I use WAL|NORMAL with zero multi-threading protection: SQLITE_CONFIG_SINGLETHREAD (SQLITE_THREADSAFE=0)?

Threading mode has nothing whatsoever to do with journal mode. Threading mode is about protecting SHARED DATA from simultaneous update by multiple threads. If you understand multiprogramming at all, then you should already understand this.

There are TWO types of data associated with the SQLite3 library. There is data associated with a "Connection" and there is data associated with an instance of the "Library" itself.

The THREADING MODE controls which semaphores are used to protect this data from concurrent modification.

The default mode SERIALIZED, means that there are active semaphores which will PREVENT you from multiple concurrent updates of data associated with the "Connection".

Setting MULTITHREADED turns this protection off. This does not mean that you can now have multiple concurrent updating of the "Connection" associated data, it means that if you attempt to do so, the SQLite3 library will not protect you and instead all hell will break loose.

However, for both SERIALIZED and MULTITHREAD threading modes, the semaphores are still in effect which serialize access to global data structures.

Setting SINGLETHREAD threading mode, IN ADDITION to disabling the access semaphores around connection data,also disables the semaphores protecting the global data structures from concurrent change.

The so-called threading mode has nothing whatsoever to do with threading -- it has to do with the level of protection (NOTHING, BELT, or BELT and SUSPENDERS) that the library provides to protect itself from improper use.

(7) By Keith Medcalf (kmedcalf) on 2020-11-10 06:53:23 in reply to 5 [link] [source]

How do I setup sqlite to call the SQLite standard busy handler in these conditions?

I found that if I do not register my own sqlite_busy_handler callback then all WRITE operations will fail immediately with the server busy error...

See PRAGMA busy_timeout https://sqlite.org/pragma.html#pragma_busy_timeout
or
sqlite3_busy_timeout() https://sqlite.org/c3ref/busy_timeout.html

(8) By PazO (JohnSmith) on 2020-11-10 08:24:11 in reply to 7 [link] [source]

Thank you for this info.
I did try to use it in the past but I misunderstand its usage: I thought it will sleep for ms milliseconds. Now I understand that it will actually do busy-wait and occasionally sleep for short periods until ms is reached.
I will re-try it now with ms set to large value (few minutes?) as I want it always to succeed, unless something really went wrong.

(9) By Keith Medcalf (kmedcalf) on 2020-11-10 10:09:19 in reply to 6 [link] [source]

For the greater certainty, "global data" and "data associated with a connection" DOES NOT INCLUDE data contained in any file on disk or stored "at the other end" of a connection.

It means "data stored in memory". An example of "global data" is such things as linked lists of pointers to all the "connections" open and myriads of other global housekeeping data stored in program memory. An example of "connection data" may include such things as the data maintained by the connection structure itself (where am I? who am I? what files am I connected to? what is the current locking state of those files?) as well as the lists of and subordinate objects to the connection itself such as the internal reprentation of the schema for the database files belonging to the connection, the statements belonging to the connection, and so on and so forth.

The threading mode is concerned with control of multiple access and modification of that data.

The journal mode / transaction mode / multiple access concurrency control / file locking is concerned with accessing data stored in files that are at the other end of the connection.

They are two completely separate and unrelated things.

(10.2) By Keith Medcalf (kmedcalf) on 2020-11-10 10:29:16 edited from 10.1 in reply to 6 [link] [source]

The default threading mode is BELT and SUSPENDERS (SERIALIZED threading mode). That means that the library will take necessary precautions to ensure that your pants do not fall down under any circumstance.

You may take off the SQLite3 provided suspenders (by setting MULTITHREADED threading mode) in which case the library will not prevent your pants from falling down if you concurrently modify connection related data, you must do that yourself with your own suspenders. One common example of how you might provide your own suspenders is by assigning one connection (and it subordinate objects) for use by only one thread.

Similarly you can take off the BELT as well (by setting SINGLETHREAD threading mode), in which case any type of concurrent access might cause your pants to fall down.

Just because SQLite3 is providing a BELT and SUSPENDERS does not in any way restrict you from wearing your own suspenders as well. You will just pay a very tiny price in CPU usage for the extra set of suspenders. However, in doing so, even if your suspenders get frayed and worn or are worn improperly, your pants will still be prevented from falling down by the builtin high quality and carefully fitted suspenders provided by the SQLite3 library.