Possible transaction issues on handling multiple databases shared by multiple connections

(1) By Leandro Santiago (leandro) on 2020-04-12 11:28:38 [link] [source]

Hello all.

My application has several threads and I am intending to have multiple sqlite databases being shared among those threads, where a thread which 'owns' a database is the only one able to write on it (inserts and updates), whereas other threads can connect to it can only read (mode=ro, pragma query_only=1).

Therefore a thread might be able to read from multiple databases, but own (be the writer) only a single one.

My current issue is that other threads can own their own database (as main schema), but also attach for reading (as described above) from databases owned by other threads.

It looks like something like that, in pseudocode:

thread1:
  db := owns "db1.sqlite" as "main" schema
  loop { write to "main" }

thread2:
  db := owns "db2.sqlite" as "main" schema
  db1 := attach "db1.sqlite" as "db1" schema on db connection
  loop { write to "main", and reads from "db1", sometimes in the same transaction}

(other threads might read from db1.sqlite and db2.sqlite, but not own them...)

Will sqlite behave well in such scenario? The application is on early development stages, and I ask that because I noticed that a thread that reads from a database connected as not "main" schema often crashes when the database is being changed in a transaction performed by another thread is connected to it as "main" schema.

It makes sense for me, if the journal file (the test was before I started using WAL) is used as synchronization point, and I understood that the journal filename is defined by the connection "main", which is different between a writer which owns it and a reader which has it under a different name.

Would you folks be able to tell me if sqlite is able to work under the described scenario?

Could such scenario potentially cause deadlocks? I mean, will a database open on readonly mode be locked if it's part of a transaction executed on a different database on the same connection? I read that on WAL writers don't wait for readers, but I confess I don't understand well how sqlite manages that.

I am always on the latest sqlite version built from source code and statically linked on my application (using go-sqlite3). The platform is Linux with local filesystem (ext4, probably).

On my current scenario, I am using cache=shared, which might cause issues as well.

I'd be grateful for any thoughts on the issue.

(2) By Simon Slavin (slavin) on 2020-04-12 16:02:10 in reply to 1 [link] [source]

Your basic idea seems sound. Given this much multi-access do not forget to set a timeout of at least ten seconds on every connection. The timeout affects every database attached to the connection.

If a thread does ATTACH, make sure it does DETACH. Do not pass connections from thread to thread.

I would test your idea with shared cache and without, with WAL and without. See which one performs acceptably. Some of these might trigger deadlocks, some just delays. I might be able to figure out what combination of these will happen but it is going to be far more valuable to you to do the experiment yourself.

Might your App have a situation where a database is ATTACHED by one thread before it is opened by its 'owning' thread ? This shouldn't be a problem but you should figure this out to understand any locking or access problems.

(4) By Leandro Santiago (leandro) on 2020-04-14 11:04:51 in reply to 2 [link] [source]

Thank you for the advice.

The number of databases and connections is known by design and the setup process is quite deterministic, so I am ensuring that a database is not attached before its 'owner' opens it.

I've been sharing the same read-only connection across different threads (effectively go routines, triggered by a http server) and so far there has been no issues (on a single database connection). Am I relying on some kind of "undefined behaviour"? Otherwise I'd need to keep a connection pool, which is okay as well.

I'll be running more experiments, but it's good to know I'm going on the right direction :-)

(3) By Keith Medcalf (kmedcalf) on 2020-04-12 18:03:38 in reply to 1 [link] [source]

Why are you using shared cache?

Is there an actual reason or is it "just because it exists"?

(5) By Leandro Santiago (leandro) on 2020-04-14 11:06:50 in reply to 3 [link] [source]

Pretty much it. "caching is better than no caching", said my brain. But caching is not a requirement and I will run more experiments to figure out it affects the behaviour and performance of my application.

Do you think caching in such scenario could cause any issues?

(7) By Keith Medcalf (kmedcalf) on 2020-04-14 19:21:28 in reply to 5 [link] [source]

I am asking you you are using shared_cache as opposed to the default for everything to have its own cache.

Whether it causes any issues is like saying "Do you think tying my shoes will cause issues?" That depends on if the shoe needs to come off quickly, like when it gets run over by a train, and whether you need to tie your shoe or if you will trip over the laces if you don't.

In other words, engaging in premature optimization is almost always a bad idea.

(6) By Dan Kennedy (dan) on 2020-04-14 14:02:19 in reply to 1 [source]

It will work in WAL mode. You could deadlock in regular journal mode, but not in WAL mode, as in WAL mode readers cannot block writers. Writers block other writers, but since each db file has a designated writer, you won't run in to that.

Don't use shared-cache mode.