SQLite User Forum

[WAL] Race condition in robust_open for multiple users sharing same database
Login

[WAL] Race condition in robust_open for multiple users sharing same database

(1) By pduchi on 2024-04-17 14:17:54 [link] [source]

Setup:

  1. Journal mode of sqlite db is configured in wal mode
  2. Processes accessing the database and running each under a different user
  3. They share the database via group read/write permissions

Problem:
If all those process write to the same database using exclusive access with BEGIN IMMEDIATE transaction, the following error is sometimes seen.
""Runtime error near line 1: attempt to write a readonly database (8)""

Analysis:
After investigation it was seen that our default umask was set to 022.
-rw-rw---- 1 root db-group 360448 Apr 17 13:32 database.db
In case BEGIN IMMEDIATE transaction command is executed a -shm and -wal file is created with following permissions:
-rw-rw---- 1 user1 db-group 32768 Apr 17 13:48 database.db-shm
-rw-rw---- 1 user1 db-group 0 Apr 17 13:48 database.db-wal

However if we check the code which does this creation
static int robust_open(const char *z, int f, mode_t m)
It is seen that it opens the file with the mode of database.db (660) and later on if in the end the mode does not match, it is chmod again with 660.
In our case the shm/wal file had 640 after open as it still applies the umask (mode & ~umask).
Due to this there is a slight race condition between the file being created as 640 and final 660.
This resulted in the error as above when another user wants to start an exclusive transaction at the same time.
""Runtime error near line 1: attempt to write a readonly database (8)""

Fix proposal:
Instead of doing the chmod later it would be better if the umask would be overruled in the code as follows:
mode_t old_mask = umask(0);
#if defined(O_CLOEXEC)
fd = osOpen(z,f|O_CLOEXEC, m2);
#else
fd = osOpen(z,f,m2);
#endif
umask(old_mask);

This will avoid the race condition and the other chmod can be removed as this is no longer needed.

(2) By anonymous on 2024-04-17 16:48:00 in reply to 1 [link] [source]

After investigation it was seen that our default umask was set to 022.

There's your problem. A set of cooperating processes running with different user ids and the same group id should use umask 002.

  mode_t old_mask = umask(0);

... and now you have a race condition inside each process, unless you can guarantee that none of them ever use more than one thread.

(3.4) By pduchi on 2024-04-18 11:18:19 edited from 3.3 in reply to 2 [source]

Indeed you are right, the proposal could result in
1. accidental ending up with umask 0 for your process
2. Other threads creating files with full permissions in between umask(0) -> umask(oldmask)

It would still be a good idea if this initial race condition could be documented somewhere or fixed in sqlite.c itself.