What does, if many threads and/or processes need to write the database at the same instant, mean?
(1) By iio7 (iio7iio7) on 2021-04-10 21:25:17 [link] [source]
Can someone explain in detail what exactly this means?
"If many threads and/or processes need to write the database at the same instant (and they cannot queue up and take turns"
I am making a web application where I know that I will not need a client/server architecture, however, the site will have about 500K visits a day where users will both read and write.
The writes are small writes consisting of ratings, i.e. storing a single integer in a table. Other writes are of similar type, single integers for statistics usage (did the user click on your shop?), etc.
I am considering using SQLite, but I don't quite understand if the above will cause a problem in my case.
Any advice and explanation will be greatly appreciated.
Kind regards
(2) By ThanksRyan on 2021-04-10 23:13:09 in reply to 1 [link] [source]
Di you read the rest of the statement?
SQLite only supports one writer at a time per database file. But in most cases, a write transaction only takes milliseconds and so multiple writers can simply take turns. SQLite will handle more write concurrency that many people suspect.
And it continues on.
but I don't quite understand if the above will cause a problem in my case.
Have you considered simulating what your application would do with 500k visits with reads/writes? This will give you an idea if you can use SQLite.
(3) By Simon Slavin (slavin) on 2021-04-11 10:25:21 in reply to 1 [link] [source]
500K visits a day. To allow for a working day and mixed timezones, assume a 12 hour day with equal distribution. That's 42,000 visits an hour, 700 visits a minute, 12 visits a second, perhaps peak at 18 visits in a second.
You describe each visit being one or two very simple SQL commands. There's nothing in SQLite which would fail to deal with that, assuming solid state storage. You're more likely to encounter problems with your web backend (LAMP ?) or, if your web pages are huge, bandwidth.
Assuming your data is valuable, I remind you to figure out your backup strategy procedure right at the start.
(4) By iio7 (iio7iio7) on 2021-04-11 11:50:53 in reply to 1 [link] [source]
So I decided to do some testing and it went really bad, with locking all over the place.
I have the following options set:
PRAGMA journal_mode = WAL;
PRAGMA temp_store = MEMORY;
PRAGMA synchronous = NORMAL;
PRAGMA cache_size = -64000;
I have a small PHP script that inserts a single row with some random data into a table.
The script gets called in a loop from just two different machines on the network.
SQLite3::prepare(): Unable to prepare statement: 5, database is locked in ...
I have tried various changes to the settings, but I get the same result. About half of the queries fails with the locking warning.
I had ecspected the queries to build up in a queue, seeing a decrease in write speed, but they don't queue, they just block.
I am thinking it is because of the multi-threaded nature of a PHP web application?
Am I missing something?
(5) By iio7 (iio7iio7) on 2021-04-11 11:58:45 in reply to 1 [link] [source]
Hmm. Changing the busy timeout value solved the problem. Now I am simulating thousands of writes from the two machines and all runs fine.
$sqlite_db->busyTimeout(500);
However, I don't know what a reasonable timeout is.
(6) By Tim Streater (Clothears) on 2021-04-11 12:22:52 in reply to 5 [link] [source]
That's only 0.5 sec. Next try with 20 secs.
(7) By iio7 (iio7iio7) on 2021-04-11 14:37:11 in reply to 1 [link] [source]
What I don't understand is that if I use PRAGMA to set the busy_timeout
and I verify it is set, I still get the locking error. But if I set it during code execution, then I don't get the error:
$sqlite_db->busyTimeout(20000);
(8) By Simon Slavin (slavin) on 2021-04-11 17:13:55 in reply to 7 [link] [source]
Timeout setting is not a setting of a database file, it's associated with each connection to the database. If each web-page access makes its own connection, then you. need to set timeout for each one. Which is what you're doing.
The number in the brackets is milliseconds. 20 seconds is a fine and reasonable timeout setting and you should have a usable setup with it.
(9) By ThanksRyan on 2021-04-11 20:12:34 in reply to 1 [link] [source]
Looks like people on your reddit post also give you as much confidence as well:
https://www.reddit.com/r/sqlite/comments/modtns/deploying_sqlite_on_a_production_website/
(10) By anonymous on 2021-04-11 20:46:00 in reply to 8 [link] [source]
How can one change the built-in default connection busy_timeout to something greater than zero (which seems impractical for default)?
I'm having similar issues; too many 'locked' messages from two processes that access the same database every few seconds.
(11) By Larry Brasfield (larrybr) on 2021-04-11 21:03:29 in reply to 10 [link] [source]
Go to this page, https://sqlite.org/keyword_index.html , and search for the word "timeout". The first hit leads to the busy_timeout pragma.
(12) By Richard Damon (RichardDamon) on 2021-04-11 21:48:57 in reply to 11 [link] [source]
But I think this pragma is still just for the current connection. The question is how to set it so that connections default to a longer timeout than 0 by default without each connection needing to set its own timeout.
(13) By Tim Streater (Clothears) on 2021-04-11 21:57:53 in reply to 12 [link] [source]
Include it in a wrapper function that also takes care of error handling and reporting when setting up the connection. That way you don't have to worry about it. AFAIK you always have to set the timeout each time you connect, or perhaps supply your own timeout function (not sure about that one).
(14) By Larry Brasfield (larrybr) on 2021-04-11 22:02:51 in reply to 12 [link] [source]
There is no compile-time option for that default. The datum which holds this timeout is one of many that are initialized to 0 in a memset(...) call. To provide customized "defaults", a wrapper around one of the sqlite3_open*() calls would be easiest, short of modifying one of those calls directly.
(15) By Keith Medcalf (kmedcalf) on 2021-04-11 22:16:21 in reply to 12 [link] [source]
Create en extension that sets the timeout then register it in the autoexec list.
#include "sqlite3ext.h"
SQLITE_EXTENSION_INIT1
static int sqlite_set_autobusy(sqlite3 *db, char** pzErrMsg, void* pApi)
{
return sqlite3_busy_timeout(db, 60000);
}
#ifdef _WIN32
__declspec(dllexport)
#endif
int sqlite3_autobusy_init(sqlite3 *db, char **pzErrMsg, const sqlite3_api_routines *pApi)
{
SQLITE_EXTENSION_INIT2(pApi);
sqlite3_auto_extension((void*)sqlite_set_autobusy);
sqlite3_busy_timeout(db, 60000);
return SQLITE_LOADED_PERMANENTLY;
}
(16) By Keith Medcalf (kmedcalf) on 2021-04-11 22:37:37 in reply to 15 [link] [source]
That return code should be SQLITE_OK_LOAD_PERMANENTLY
(17) By anonymous on 2021-04-11 22:51:01 in reply to 10 [source]
I meant to change the built-in default so that I can simply replace the sqlite3.dll without access to the applications that use it (but which I know do not change that pragma)?
(18) By Larry Brasfield (larrybr) on 2021-04-12 01:04:27 in reply to 17 [link] [source]
You will not be able to do that without altering the SQLite3 code. However, if you can settle for building something which behaves as if you had done that, you can create a shim DLL which mostly just passes some/most calls through to a "real" SQLite3 DLL but does something different some select calls, such as the wrapping suggested elsewhere in this thread.
(19) By David Raymond (dvdraymond) on 2021-04-13 13:01:22 in reply to 4 [link] [source]
PRAGMA journal_mode = WAL;
The script gets called in a loop from just two different machines on the network.
Just to make sure I'm understanding this right, each of those two machines is accessing its own separate database. Right? Because WAL mode doesn't work across different machines.
Or is that perhaps part of your problem :)