random DB locks
(1) By roidal on 2021-08-17 10:57:58 [link] [source]
Iam using a python web-application which utilizes a sqlite-db for storing data. The Database itselfe is in WAL-mode, and the application always does a commit after a modify-statement keeping lock-times as short as possible, to allow multiple users accessing the web-application. This were working fine for about a year, but since a few weeks the application crashes randomly with "sqlite3.OperationalError: database is locked". I'am using the default-timeout from python (5 seconds). To become a better understanding of the problem i implemented a log-mechanism, writing out some timings/debug-data IF the application require more then 3 seconds to finish successfully. Though that never happens. Looks like in some rare cases the DB does *not* become unlocked correctly after the application ends successfully, and next time it crashes while waiting for the lock to be released. I'am wondering if someone else experince that? All that is running on a Server using RHEL 8.3 and a XFS as FS.
(2) By Simon Slavin (slavin) on 2021-08-17 23:37:58 in reply to 1 [link] [source]
Please run an integrity check on the database.
How are you doing the COMMIT ?
Is the database stored on the same computer that is doing the processing ? If not, how is the storage accessed ?
(3) By roidal on 2021-08-18 09:47:33 in reply to 2 [link] [source]
Integrity check is ok.
Iam calling the commit-method of the connection-object.
Right now i do not have the exact information about storage since the application runs on a virtual server. I will try to figure it out.
(4) By Simon Slavin (slavin) on 2021-08-18 14:50:41 in reply to 3 [source]
Virtual servers have a poor reputation for supporting file locking. Yours might be simulating full XFS support, but the XFS disk is not a real disk, it's part of the simulated hardware. And the communication between the virtual computer and the virtual storage may not provide full support for locking. Consequently you may be running into
In your case, you don't have two competing processes causing corruption. But you're getting lock failure anyway.
I am guessing that identical software to yours won't have this problem if it is run on a real computer with a real XFS drive inside it. But I have no way to prove it, and it's possible you have a different problem none of us have thought of. Do you have any way to test this ?
(5) By roidal on 2021-08-18 18:18:53 in reply to 4 [link] [source]
Thanks for feedback.
I could find a problem with my logging which prevented long running sessions to be monitored. This is fixed now and i can see that in some cases commits take long time (>3 seconds instead of milliseconds).
So yes, i believe there is some problem with IO.