SQLite User Forum

How many read connections when using concurrency?

(1) By grodes on 2024-07-10 08:55:53

I have a multi-threaded HTTP server (Golang) that uses goroutines to handle multiple requests concurrently while individual requests are blocked on I/O.

I am using SQLite3 as the database. How many read connections should I keep open in my pool of database connections?

I use the following config for SQLite3:

PRAGMA journal_mode=WAL;           -- readers don't block the single writer
PRAGMA synchronous=NORMAL;         -- fsync only at checkpoints (safe with WAL)
PRAGMA journal_size_limit=6144000; -- truncate the WAL to ~6 MB after checkpoints
PRAGMA busy_timeout=5000;          -- wait up to 5 s on a locked database
PRAGMA cache_size=-32000;          -- ~32 MB page cache (negative means KiB)
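
For context, one way to apply these in Go is through the driver's connect hook, so the per-connection settings (busy_timeout, synchronous, cache_size) reach every connection the database/sql pool opens. A minimal sketch, assuming the mattn/go-sqlite3 driver ("sqlite3_wal" and "app.db" are placeholder names):

package main

import (
	"database/sql"
	"log"

	"github.com/mattn/go-sqlite3"
)

func init() {
	// Register a driver variant whose ConnectHook runs once for each
	// new connection that the database/sql pool opens.
	sql.Register("sqlite3_wal", &sqlite3.SQLiteDriver{
		ConnectHook: func(conn *sqlite3.SQLiteConn) error {
			// journal_mode and journal_size_limit persist in the
			// database file; the other three are per-connection.
			_, err := conn.Exec(`
				PRAGMA journal_mode=WAL;
				PRAGMA synchronous=NORMAL;
				PRAGMA journal_size_limit=6144000;
				PRAGMA busy_timeout=5000;
				PRAGMA cache_size=-32000;`, nil)
			return err
		},
	})
}

func main() {
	db, err := sql.Open("sqlite3_wal", "file:app.db")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	// ... hand db to the HTTP handlers ...
}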

My understanding is that it is hard to specify a number, as it depends on how much time each request spends blocked on I/O.

Ideally, each new request that is accepted should be able to get a read connection (get it from the pool or add a new connection to the pool if it is empty). However, I can see how this would expose me to an attacker filling up my database pool by sending a burst of HTTP requests to my server.

If I am going to define a hard max limit for my database pool, maybe it makes sense to just populate the database pool directly to this max limit on server startup.

What would be a good way to determine the maximum number of read connections per available CPU thread? One per thread? Two? Three?
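
For illustration, a hard cap plus startup pre-population is only a few lines on top of Go's standard database/sql pool. A sketch, where the 2-per-CPU figure is just one of the guesses above and configurePool is a hypothetical helper:

import (
	"context"
	"database/sql"
	"runtime"
)

// configurePool caps the pool and fills it to the cap at startup, so a
// burst of requests queues for a connection instead of growing the pool.
func configurePool(db *sql.DB) error {
	limit := 2 * runtime.NumCPU() // assumption: 2 readers per CPU thread
	db.SetMaxOpenConns(limit)
	db.SetMaxIdleConns(limit) // keep warmed connections instead of closing them

	// Check out `limit` connections, then return them all to the idle pool.
	conns := make([]*sql.Conn, 0, limit)
	for i := 0; i < limit; i++ {
		c, err := db.Conn(context.Background())
		if err != nil {
			return err
		}
		conns = append(conns, c)
	}
	for _, c := range conns {
		c.Close() // Close returns the open connection to the pool
	}
	return nil
}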

(2) By Stephan Beal (stephan) on 2024-07-10 09:11:30 in reply to 1

What would be a good way to determine the maximum number of read connections per available CPU thread?

There is no specific limit, nor is there any specific need for a pool. This site, for instance, forks a new instance of the Fossil SCM [1] executable for every single page request, all of which use a single shared sqlite db.

That applies to many of the sites within this project's overall umbrella, but only once in a blue moon will a user see a "database is locked" error on any of them. (In 16+ years of using Fossil, I can count on one hand how many such errors I've seen on its various sites/repositories.)

If you believe you need a pool because "it must certainly be faster," my recommendation is to try using sqlite without one first, before committing to the additional complexity of having one.


  [1] SQLite's source control system (which also happens to have a forum feature)

(3) By grodes on 2024-07-10 09:58:05 in reply to 2

Which alternative would you propose instead of using a pool? I can think of:

  1. A single sqlite connection for the whole server behind a lock?
    • this would probably introduce a lot of lock contention
  2. A new read connection for each new request?
    • this would require each new request to open the sqlite db file, parse the schema, prepare the statements, etc.

A pool of read connections sized at cpu*2 seems closer to optimal, since you avoid re-parsing the schema and can reuse prepared statements.
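
(For what it's worth, Go's database/sql already covers part of this without a custom pool: a *sql.Stmt from db.Prepare is prepared lazily on each underlying connection and cached there for reuse. A sketch, with a hypothetical users table:)

// Sketch with a hypothetical query: database/sql prepares the statement
// at most once per underlying connection and caches it for later calls.
stmt, err := db.Prepare(`SELECT name FROM users WHERE id = ?`)
if err != nil {
	log.Fatal(err)
}
defer stmt.Close()

var name string
// QueryRow checks a connection out of the pool; if the statement is not
// yet prepared on that connection, it is prepared and cached now.
err = stmt.QueryRow(42).Scan(&name)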

(4.1) By Stephan Beal (stephan) on 2024-07-10 10:24:38 edited from 4.0 in reply to 3

this would require each new request to open the sqlite db file, parse the schema, prepare the statements, etc.

Yes, but have you proven that that's too slow for your case? It's fast enough for all of the sites operated under this project's umbrella.

Will your software run often enough that those few microseconds of overhead make up for the development time spent on working around them?
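
A quick, hypothetical way to check with a standard Go benchmark (the driver, "app.db", and the query are placeholders; substitute a real query from your workload):

import (
	"database/sql"
	"testing"

	_ "github.com/mattn/go-sqlite3"
)

// BenchmarkOpenPrepareClose measures the cost a per-request connection
// would pay: open the db file, prepare one statement, tear it all down.
func BenchmarkOpenPrepareClose(b *testing.B) {
	for i := 0; i < b.N; i++ {
		db, err := sql.Open("sqlite3", "file:app.db")
		if err != nil {
			b.Fatal(err)
		}
		stmt, err := db.Prepare(`SELECT 1`) // placeholder query
		if err != nil {
			b.Fatal(err)
		}
		stmt.Close()
		db.Close()
	}
}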

A pool of read connections sized at cpu*2 seems closer to optimal, since you avoid re-parsing the schema and can reuse prepared statements.

And you introduce complexity and maintenance burdens which you otherwise don't have. Plus you artificially introduce a resource bottleneck which otherwise does not exist. Instead of waiting a couple of microseconds for the schema to parse, your users wait much longer for a new pool entry to become available.

That's just my proverbial 0.02€. "If it ain't broke, don't fix it."