CentOS per-thread DB connection using WAL journal_mode
I have a multi-threaded application running on CentOS. I am attempting to use a per-thread connection to a single DB file. According to the documentation, multiple connections to the same DB are supported. I am using the WAL journal_mode to allow atomic transactions and rollback support.
I have a test app that fails because journal_mode does not seem to be supported in the same way on the CentOS platform.
As part of initializing the journal_mode, the sqlite3 code for unixLockSharedMemory calls ftruncate as part of some validation checking (sqlite3.c, line 38018).
The first thread to arrive at this code succeeds and physical -shm and -wal files are created on disk for the db.
The next thread to arrive at this code fails with error code 22 (EINVAL): "The argument length is negative or larger than the maximum file size." The comment around the call to ftruncate seems to indicate that the connection is assumed to be the first connection, but this is actually the second.
This seems to indicate that the journal behavior does not support multiple connections to the same DB file on the CentOS platform.
Any suggestions for how to properly set this up? Or should I change to a single connection per process?
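For reference, the per-thread setup described in the question can be sketched as follows. This is a minimal sketch using Python's standard sqlite3 module rather than the C API from the original post; the table name `log`, the worker counts, and the 30-second busy timeout are all assumptions made for illustration. On a file system where WAL works, each thread opens its own connection and the row count comes out complete.

```python
import os
import sqlite3
import tempfile
import threading


def worker(db_path, worker_id, rows_per_worker):
    # Each thread opens and closes its own connection; nothing is shared.
    conn = sqlite3.connect(db_path, timeout=30.0)  # wait up to 30 s if the db is locked
    try:
        for seq in range(rows_per_worker):
            with conn:  # commits on success, rolls back on an exception
                conn.execute(
                    "INSERT INTO log(worker, seq) VALUES (?, ?)", (worker_id, seq)
                )
    finally:
        conn.close()


def run(db_path, n_workers=4, rows_per_worker=25):
    # Switch to WAL once, before the threads start. The mode is stored in
    # the database file, so later connections pick it up.
    setup = sqlite3.connect(db_path)
    mode = setup.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    setup.execute("CREATE TABLE IF NOT EXISTS log(worker INTEGER, seq INTEGER)")
    setup.commit()
    setup.close()

    threads = [
        threading.Thread(target=worker, args=(db_path, w, rows_per_worker))
        for w in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    check = sqlite3.connect(db_path)
    count = check.execute("SELECT COUNT(*) FROM log").fetchone()[0]
    check.close()
    return mode, count


def demo():
    # Hypothetical driver: runs the sketch against a throwaway directory.
    with tempfile.TemporaryDirectory() as d:
        return run(os.path.join(d, "demo.db"))
```

Running `demo()` against a directory on the suspect file system and again on a native one is a quick way to see whether the failure follows the file system rather than the threading model.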
Is the file located on a local filesystem?
Yes, the -wal and -shm files are on the local file system.
I suspect you've got cross-thread data leakage. You cannot, for example, use one thread to query the DB and another to consume the results from that query without making a copy in RAM before passing the data off to the other thread.
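To illustrate the copy-before-handoff point, here is a minimal sketch in Python's standard sqlite3 module (the table `item` and the helper names are hypothetical). The querying thread materializes the rows with `fetchall()` and closes its connection before anything is passed on, so the consuming thread only ever touches plain in-memory data.

```python
import os
import queue
import sqlite3
import tempfile
import threading


def query_worker(db_path, out_q):
    # This thread owns the connection and the cursor for their whole lifetime.
    conn = sqlite3.connect(db_path)
    try:
        # fetchall() copies every row into ordinary Python tuples.
        rows = conn.execute("SELECT id, name FROM item ORDER BY id").fetchall()
    finally:
        conn.close()
    # Only the copied rows cross the thread boundary, never the cursor.
    out_q.put(rows)


def demo():
    # Hypothetical driver with a throwaway database.
    with tempfile.TemporaryDirectory() as d:
        db_path = os.path.join(d, "demo.db")
        conn = sqlite3.connect(db_path)
        conn.execute("CREATE TABLE item(id INTEGER PRIMARY KEY, name TEXT)")
        conn.executemany("INSERT INTO item(name) VALUES (?)", [("a",), ("b",), ("c",)])
        conn.commit()
        conn.close()

        q = queue.Queue()
        t = threading.Thread(target=query_worker, args=(db_path, q))
        t.start()
        rows = q.get(timeout=10)  # the consuming side sees plain tuples
        t.join()
        return rows
```

In the C API the same idea would mean copying column values into your own buffers before stepping or finalizing the statement, since the pointers it hands back do not stay valid afterwards.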
When I get the failure, it is at initialization time for the DB connection.
The pragma statement being executed is "PRAGMA journal_mode = WAL"; this results in the SQL error code SQLITE_IOERR_SHMOPEN being returned.
Digging into the sqlite3.c file is where I tracked the failing piece to be the ftruncate call.
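A small check like the one below can make this failure easier to report. It is a sketch in Python's standard sqlite3 module, and `enable_wal` is a hypothetical helper name. The pragma returns the journal mode that is actually in effect, so the caller can distinguish "WAL enabled", "silently stayed in another mode", and "raised an error". Python reports SQLite I/O errors as `sqlite3.OperationalError`, typically without the extended code name, so the message alone will not say SHMOPEN.

```python
import sqlite3


def enable_wal(db_path):
    """Try to switch db_path to WAL; return (mode_in_effect, error_message)."""
    conn = sqlite3.connect(db_path)
    try:
        # The pragma yields one row holding the mode that is now active.
        row = conn.execute("PRAGMA journal_mode=WAL").fetchone()
        return row[0].lower(), None
    except sqlite3.OperationalError as exc:
        # I/O problems while setting up WAL surface here.
        return None, str(exc)
    finally:
        conn.close()
```

For example, `enable_wal(":memory:")` returns `("memory", None)`, because an in-memory database cannot use WAL and the pragma simply reports the mode it kept.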
What file system are you using?
> digging into the sqlite3.c file
Can you replicate the error with a non-amalgamation build of SQLite? For example, do any of test/thread* fail on that same system?
I am using the WSL install of CentOS on my Windows machine so I can debug. This is the file system information for that setup.
NOTE: my code is located in the C: directory structure. The Windows format for that drive is NTFS.
Filesystem  Type   Size  Used  Avail  Use%  Mounted on
rootfs      wslfs  954G  400G  554G   42%   /
none        tmpfs  954G  400G  554G   42%   /dev
none        tmpfs  954G  400G  554G   42%   /run
none        tmpfs  954G  400G  554G   42%   /run/lock
none        tmpfs  954G  400G  554G   42%   /run/shm
none        tmpfs  954G  400G  554G   42%   /run/user
tmpfs       tmpfs  954G  400G  554G   42%   /sys/fs/cgroup
C:\         drvfs  954G  400G  554G   42%   /mnt/c
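To tie a database path to one of the rows above, a lookup along these lines may help. It is a sketch: `fs_type_for` is a hypothetical helper, and `SAMPLE_MOUNTS` is an illustrative mount table modeled on the listing above, not captured output (the mount options in it are invented). It picks the longest mount point that contains the path and reports that mount's file system type.

```python
import posixpath

# Illustrative only: shaped like /proc/mounts, modeled on the listing above.
SAMPLE_MOUNTS = """\
rootfs / wslfs rw,noatime 0 0
none /dev tmpfs rw,noatime 0 0
none /run tmpfs rw,noatime 0 0
C: /mnt/c drvfs rw,noatime 0 0
"""


def fs_type_for(path, mounts_text):
    """Return (mount_point, fs_type) for the mount that contains path."""
    path = posixpath.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest mount point; later entries win ties.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best
```

On a real system the second argument would be the contents of /proc/mounts. With the sample table, a database under /mnt/c resolves to drvfs, while one under /home resolves to wslfs.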
So this thing you are reporting is a problem with an emulator?
One that has never been encountered by anyone using "the real thing"?
Have you considered that the problem may have nothing whatsoever to do with SQLite3 but is rather a defect in the emulator you are using and you should be directing your problem there instead of here so that they can make their emulator work more better?
I am seeing it as well on our build machines, which are running CentOS, but I am less able to debug on that hardware directly.
I will take a different approach and use a single connection to a db as part of my process.
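The single-connection approach could look roughly like this. It is a sketch in Python's standard sqlite3 module, and the class name `SharedDb` and its methods are hypothetical. One connection is shared by all threads and a lock serializes every use of it, so the threads never touch the connection concurrently.

```python
import sqlite3
import threading


class SharedDb:
    """One connection for the whole process, guarded by a lock."""

    def __init__(self, db_path):
        # check_same_thread=False only disables Python's thread check;
        # the lock below is what actually serializes access.
        self._conn = sqlite3.connect(db_path, check_same_thread=False)
        self._lock = threading.Lock()

    def write(self, sql, params=()):
        with self._lock:
            with self._conn:  # commit on success, roll back on error
                self._conn.execute(sql, params)

    def read(self, sql, params=()):
        with self._lock:
            # fetchall() copies the rows before the lock is released.
            return self._conn.execute(sql, params).fetchall()

    def close(self):
        with self._lock:
            self._conn.close()
```

The trade-off is that all database work is serialized through one lock, which is simple and safe but gives up the concurrent readers that separate WAL connections would allow.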
What does "uname -a" have to say about the running CentOS, for example:
Linux xxx 5.8.3-x86_64-linode137 #1 SMP PREEMPT Mon Aug 24 14:50:33 EDT 2020 x86_64 x86_64 x86_64 GNU/Linux
and the release of CentOS:
# cat /etc/centos-release
CentOS release 6.10 (Final)
#
Are you actually running the code (in your production environment) on an actual Linux kernel or rather on some kiddie-tainer (there are lots of these that go by lots of amusing names) which claims (or often does not claim but many seem to assume) to support the needed capabilities?
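If it helps to answer this from inside the application, the kernel release string can be inspected at startup. The sketch below is only a heuristic and `looks_like_wsl` is a hypothetical helper: WSL kernels normally carry "Microsoft" in the release string that `uname -r` prints, while a stock Linux kernel does not. The WSL-style string in the comment is an illustrative example, not output from the poster's machine.

```python
import platform


def looks_like_wsl(release=None):
    """Heuristic: True if the kernel release string mentions Microsoft."""
    if release is None:
        release = platform.uname().release  # same value as `uname -r`
    return "microsoft" in release.lower()


# Illustrative inputs:
#   looks_like_wsl("4.4.0-19041-Microsoft")   -> True  (WSL-style release)
#   looks_like_wsl("5.8.3-x86_64-linode137")  -> False (release quoted above)
```

Logging this value next to the SQLite error makes it obvious in a bug report which kind of kernel the failure came from.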
If this is WSL 2, then try a VM instead, so that you'll at least be running on a supported filesystem.
> on my Windows machine so I can debug
There are debuggers on CentOS. I would not be surprised to find that they work poorly on WSL1, but I'd expect them to work well on WSL2, and even better on a Hyper-V VM.
So, not CentOS, then, which does not support NTFS.
This is a file system access error you're getting, so yes, using a nonstandard kernel atop a nonstandard file system for that kernel is relevant, so you should have reported it from the start.
Does the problem happen on the "real" CentOS machines? Can you replicate it on a VM?
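One way to compare the WSL setup with a real CentOS machine or a VM, independently of SQLite, is to exercise the file primitives directly in the directory that holds the database. The sketch below is an assumption-laden probe, not a reproduction of what sqlite3.c does: `probe_shared_file_primitives` is a hypothetical helper, the 32 KiB size is just a convenient test value, and the sequence (grow with ftruncate, take an advisory lock, map the file shared, shrink it through a second descriptor) only loosely mirrors a second connection arriving at an existing shared-memory file.

```python
import errno
import fcntl
import mmap
import os
import tempfile


def probe_shared_file_primitives(directory, size=32768):
    """Return None if every step works, else (step_name, errno_name)."""
    fd, path = tempfile.mkstemp(dir=directory, suffix="-probe")
    step = "ftruncate (grow)"
    try:
        os.ftruncate(fd, size)

        step = "fcntl lock"
        # Non-blocking exclusive advisory lock on the first byte.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, 0)

        step = "mmap"
        with mmap.mmap(fd, size, mmap.MAP_SHARED,
                       mmap.PROT_READ | mmap.PROT_WRITE) as view:
            view[0:4] = b"test"

        step = "ftruncate (second descriptor)"
        fd2 = os.open(path, os.O_RDWR)
        try:
            os.ftruncate(fd2, 0)
        finally:
            os.close(fd2)
        return None
    except OSError as exc:
        # Report which step failed and the symbolic errno, e.g. EINVAL.
        return step, errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        os.close(fd)
        os.unlink(path)
```

Calling it once on the directory under /mnt/c and once on a native path, then on the build machines, would show whether any of these steps behaves differently across the environments.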