SQLite Forum

Suggestion Re Statement Cache

(1) By Larry Brasfield (larrybr) on 2023-09-05 22:21:16

Recently, I had occasion to think about scenarios where somebody wishes to execute the same SQL repeatedly, and wishes to avoid the resource usage associated with compiling the SQL into a VDBE instruction sequence and storing that redundantly. Or, more trickily, running the same DQL or DML from multiple threads.

I have also seen words on prepared statement caching, something that the Tcl interface does now at the library client level, and the possibility that the SQLite library might do the same some day.

These two (or three) concepts, after stewing in my skull for a while, led to this proposal:

Perhaps there should be no prepared statement caching. Instead, what is cached is compiled VDBE programs. Prepared statements, if created with SQL that is in the VDBE program cache (as a key), get something from the cache instead of the compiler being run. That something might be a copy, or it might even be a reference so that recompiles (due to schema changes) update all dependent prepared statements at the same time. Or, this could be done like a lazy write, updating VDBE sequences that are not in execution but leaving ones alone that are amidst execution.

This would, I submit, reduce motivation for applications to be keeping prepared statements lying around or to be trying to use a single prepared statement from multiple threads with the synchronization that entails.

The registers and other execution related data kept with prepared statements would be created at need, cheaply I expect. But recompilation of SQL could be avoided by use of a VDBE program cache. Its size could be tunable, or sqlite3_prepare_v3() could have a flag value indicating that the given SQL is to be cached.

(2) By Warren Young (wyoung) on 2023-09-06 06:35:23 in reply to 1

Pending benchmarking that proves my intuition wrong, that sounds like an overall pessimization. Looking up a prepared statement in a cache by a static text key is fast, to the point that it should be possible to do it at program compile time given a suitably clever data structure and a smart-enough language compiler.1

Your alternative has broader application in that it works from any language, over any API, but it adds the runtime cost of SQL compilation to the cache lookup time. I don’t see how any current language compiler could reasonably be expected to optimize that out at compilation time.

If you’re willing to do the work, and it’s optional, I have no objection. I do think that if you implement this, benchmarking will show it to be conditionally faster, aiding the likes of Tcl users while being valuable in highly optimized statically compiled languages2 only when they’re being used in a “scripting” style.3


  1. ^ Never tried it, never needed it, but I’ve seen modern C++ compilers pull off wonders…
  2. ^ C, C++, Rust, Swift…
  3. ^ As opposed to the cycle-bumming coding style used in high-volume applications, where developer time trades off favorably against large-N multipliers: big data centers, embedded, computational finance, games, AI, etc.

(3) By Rowan Worth (sqweek) on 2023-09-06 06:59:10 in reply to 2

I don't think Larry was suggesting changing the interface to the cache:

Prepared statements, if created with SQL that is in the VDBE program cache (as a key), get something from the cache instead of the compiler being run.

ie. the cache is still keyed based on SQL text, there's no compilation involved in a cache lookup.

(the nuance between caching a prepared statement vs a VDBE program is lost on me, which may also indicate that I have misunderstood the proposal)

(4) By Larry Brasfield (larrybr) on 2023-09-06 08:29:02 in reply to 3

nuance between caching a prepared statement vs a VDBE program

For purposes of this discussion, a sqlite3_stmt object contains several kinds of state: (1) A sequence of VDBE instructions which the SQL compiler produces and which may be interpreted by the VDBE (aka "virtual database engine")1; (2) A set of bound parameter values, possibly empty; (3) A set of registers which are used as intermediate or output data values during execution of the VDBE instructions; and (4) An instruction pointer.

The 1st kind of state is determined almost solely by the SQL presented2, during the sqlite3_prepare() call. This can be expensive in terms of execution time.

The 2nd kind of state (parameter binding) is logically part of client usage of the prepared statement, while the 3rd and 4th kinds are developed during sqlite3_step() calls.

The 1st kind of state (VDBE instructions) can be shared among threads with minimal precautions needed to avoid non-deterministic behavior. Such sharing would allow the expense of SQL compilation to be avoided where the same SQL is being used repeatedly.

The 2nd-4th kinds of state are necessarily associated with execution of the VDBE and generally should not be shared among threads. Threads need to keep their own VDBE execution state to avoid interfering with each other when they independently run prepared statements.

My suggestion (or proposal) is that, within the library implementation, without substantial exposure at the API, VDBE programs can be cached to serve two distinct purposes: (1) execution time improvement where the same SQL is being run a lot, as in the Tcl adapter; and (2) memory savings, where large VDBE programs can be reused rather than replicated.

I don't think Larry was suggesting changing the interface to the cache:

Right. It's a key/value store where the keys are whole SQL statements and the values are VDBE programs. It resembles a prepared statement cache, except only the 1st kind of prepared statement state is kept.


  1. ^ These are what is shown via "EXPLAIN ...".
  2. ^ The VDBE instructions can be affected by collected statistics during optimization, so "solely" is a slight stretch.

(5) By Donal Fellows (dkfellows) on 2023-09-06 10:23:13 in reply to 4

The main problem with having a cache (of anything) is working out when to invalidate the cache. That's often difficult to get right (speaking as someone who's written a number of the caching mechanisms inside Tcl).

Under what circumstances would a cached VDBE program cease to be valid?

(6) By Gunter Hick (gunter_hick) on 2023-09-06 11:05:49 in reply to 5

At the latest when it needs to be re-prepared because the schema has changed. Which makes it hard to share the VDBE code, because it is presumably quite unsafe to replace the VDBE code out from under a "running program". Which means you need a reference count so that the "old version" can be freed once every "running program" has been stopped.

(8) By Larry Brasfield (larrybr) on 2023-09-06 19:52:05 in reply to 5

Under what circumstances would a cached VDBE program cease to be valid?

I would share them using a reference-counting scheme, and those whose count reaches 0 would be subject to a least-recently-used flushing algorithm. The number of not-presently-used VDBE programs (in the cache only) would be tunable.

(13) By Roger Binns (rogerbinns) on 2023-09-08 15:24:52 in reply to 8

(Author of a Python SQLite wrapper that has its own statement cache.)

I would be delighted to delete my statement cache code, and instead rely on one built in to SQLite. Every other SQLite wrapper in every language seems to implement a statement cache so there is definitely a need there, and I have asked for one before.

I don't think you need to do anything as complicated as is being proposed. My approach is:

  • When sqlite3_prepare_v3 would be called, look in the cache first and return something from there if the SQL text matches. (It is taken out of the cache, leaving a hole.)
  • When sqlite3_finalize would be called, put the sqlite3_stmt back into the cache evicting a member if the cache is at capacity. The evicted member is actually finalized. Bindings are cleared for the statement put into the cache to free memory.

My original implementation used a dictionary/hash table for the cache. The drawback is that you can have only one cached statement per key, and it is common for the same SQL text to be active multiple times.

My second implementation is a simple array of ints holding the hash values of the SQL text. When looking for an entry, a linear scan finds a matching hash value, with a secondary array containing structures holding the SQL text, length, and sqlite3_stmt. Linear scans are fast on modern processors, especially compared to the random-access pointer chasing in various tree structures. There is a magic hash value meaning unoccupied, and an incrementing index tracking which entry is next to be evicted.

This approach doesn't need any reference counting or similar overhauls, and works well.

Some other minor details:

  • The entire text passed to prepare is used as the key, not just the first statement
  • When the text is larger than 16KB there is no caching. The number was determined empirically and keeps something like restoring a dump from doing lots of pointless cache management
  • The prepare flags also have to match, in addition to the SQL text
  • Explain mode is also treated like a prepare flag, and I only expose it as though it were one
  • There is a no-cache prepare option because some queries want that, such as a pragma setting a password. Also, authorizers only fire during prepare, so their use requires no cache.
  • I provide an API for getting cache statistics and entries, which helps validate that the cache is working

It is all these details that are tricky and why I'd prefer SQLite has this as optional functionality since it is one place to get all the nuances right, instead of each wrapper having their own quirks and bugs.

(14) By Larry Brasfield (larrybr) on 2023-09-08 17:42:06 in reply to 13

I don't think you need to do anything as complicated as is being proposed. ...

I do not see caching only the reusable and semi-constant part(s) of a prepared statement object as being significantly more complicated than caching whole prepared statements along with their execution support parts which have to be initialized anyway upon reuse.

Clearly, for users of the API as it exists today, there is little choice. But what I am suggesting is a way to get the benefit of whole-prepared-statement caching with an optimization that benefits multiple, simultaneous uses of the same SQL.

It is all these details that are tricky ...

No argument on that.

(15) By Roger Binns (rogerbinns) on 2023-09-08 18:01:49 in reply to 14

... as being significantly more complicated than caching whole prepared statements ...

What I meant is that doing even simplistic caching is sufficient, and what everyone has independently developed today. That alone would provide a lot of benefit, especially getting the tricky details right. Having a SQLITE_PREPARE_CAN_CACHE flag to sqlite3_prepare_v3 would be the only needed change to the SQLite API.

As you point out, behind the scenes SQLite can then use reference counts, break the statement data structure into parts etc, but none of that would need to be exposed, nor have public APIs. It is only if the SQLite team doesn't implement the cache (as has been the case for 20 years) that some machinery could be exposed. But it would expose ugly implementation details, and it would be strongly preferred the team do this, get it right, and let all the wrappers delete code that almost certainly has bugs.

(16) By Larry Brasfield (larrybr) on 2023-09-08 18:49:51 in reply to 15

I pretty much agree with this.

Much of what I am trying to bring out in this thread is that, by doing the caching within the library, it can be done in a way which yields more performance benefits than are available to clients doing whole-sqlite3_stmt caching.

(17) By Pepijn Van Eeckhoudt (pepijnve) on 2023-09-09 13:51:33 in reply to 16

by doing the caching within the library

I can't justify it properly, but I feel like I would prefer to control this type of caching from the application level. The library handling it automatically as a replacement for full statement caching, while convenient, seems to take away some control from the application developer and seems to make the performance behaviour of the library less predictable?

A couple of questions come to mind:

Would the cache be global state or tied to the database connection?

How is cache eviction controlled? How can I guarantee sharing for some performance critical statements for instance if it's a transparent global cache with a size upper bound? Or is it unbounded, but using reference counting which would mean I still need to hold on to at least one statement instance in my application where performance matters.

Related to this, are you planning on providing a kind of global opt-out? An OMIT... compile flag for instance?

it can be done in a way which yields more performance benefits than are available to clients doing whole-sqlite3_stmt caching

Could you elaborate on that a bit? I would like to understand what could be gained besides reduced memory usage.

(20) By David Jones (vman59) on 2023-09-10 20:31:28 in reply to 17

I agree with this. If you have a performance sensitive application and use the SQLite API, I'd expect you to code prudently when it comes to repeated queries using the same SQL. Wrapper libraries, OTOH, may get themselves into redundant prepares from trying to provide a simplified or encapsulated programming interface to the client. I'd rather see a cache object that can help wrappers with repeat queries than have caching be automatic.

For my usage patterns, the times SQLite is 'too slow' are from poorly optimized execution against large databases, not excessive compilation of the SQL.

(21) By ddevienne on 2023-09-11 07:49:38 in reply to 20

For my usage patterns, the times SQLite is 'too slow' are [...], not excessive compilation of the SQL.

I haven't followed that thread too closely, TBH, but I tend to fall in that camp.

SQLite's SQL parser and comparatively simple NGQP planner are fast enough.

So while a statement cache seems attractive from a theoretical standpoint, especially as Roger puts it (an opt-in flag on _prepare_v3 as the sole API), I'd rather invest Richard's time (if it was my call :)) on other topics (like better blobs).

So I'm +0 on Roger's approach to Statement caching, -1 on others. FWIW.

(23) By Pepijn Van Eeckhoudt (pepijnve) on 2023-09-11 10:21:26 in reply to 21

If there was a kind of caching available, what I would personally like to be able to use is something like this. (Disclaimer: I'm just throwing an idea out there. This isn't a fully baked design; there may be glaring flaws. It's a fleshed out version of the explicit mechanism I was suggesting earlier. Apologies up front if this is a complete brain fart)

(1) Some way to compile SQL to VDBE and a way to release that object.

typedef struct sqlite3_program sqlite3_program;

int sqlite3_compile(
  sqlite3 *db,            /* Database handle */
  const char *zSql,       /* SQL statement, UTF-8 encoded */
  int nByte,              /* Maximum length of zSql in bytes. */
  unsigned int compFlags, /* Zero or more SQLITE_COMPILE_ flags */
  sqlite3_program **ppProgram,  /* OUT: Program handle */
  const char **pzTail     /* OUT: Pointer to unused portion of zSql */
);

int sqlite3_free_program(
  sqlite3_program *pProgram
);

(2) Some way to create a new execution context using the compiled program.

int sqlite3_prepare_v4(
  sqlite3 *db,            /* Database handle */
  sqlite3_program *pProgram, 
  unsigned int prepFlags, /* Zero or more SQLITE_PREPARE_ flags */
  sqlite3_stmt **ppStmt  /* OUT: Statement handle */
);

In other words, make the two types of state (type (1) and types (2, 3, 4) as Larry described it) associated with a prepared statement today explicit. This is similar to the fork suggestion, but you 'fork' from a specific object instead of another statement.

For existing usages of prepare_v2 and v3 nothing would change. A sqlite3_program may be created internally, but if there is one it's hidden to the user and its lifetime is tied to that of the sqlite3_stmt. I.e. the sqlite3_stmt is the owner of the sqlite3_program instead of the user application.

To make automatic recompiling feasible you probably need the reference counting logic that has been hinted at in this thread in the implementation, so that you don't rug-pull in-flight queries. In other words, internally that sqlite3_program may have to be a bit more sophisticated than just a simple opaque pointer to a VDBE program to make automatic recompilation possible. It's more like a little mini-cache for a single SQL string that's user created/managed.

What I don't quite see yet is how you could handle bind parameter dependent VDBE program caching (assuming that's a thing; I might be misinterpreting the documentation). You would want to cache the minimal set of distinct query plans, but if I understand it correctly you would only know that you want to use a different query plan during the optimisation stage of the compiler. This creates a bit of a chicken or egg problem. Perhaps that's just something that can't be cached. I was wondering how this would be handled for the transparent cache as well.

There are a couple of reasons I would like a design along these lines

  • It's opt-in. As others have stated SQLite is plenty fast as is. For situations where performance is already fine I don't have to think about potential implications on the behaviour of a hidden cache. (Avoiding cache thrashing, cache sizing, guaranteeing caching where it's required, etc.) Where I do need every last drop of performance I can get it. For existing users, there's no risk of performance regressions since that all stays the same.
  • It's conceptually similar to, but much less error prone than, prepared statement pooling which many people seem to be doing today already. Pooling is tricky because you're trying to reuse an object that has complex mutable state. You need to make sure you don't use the same instance twice at the same time, you need to make sure the instance is returned to the pool instead of being freed, you need to think about how many instances you keep alive, you need to make sure bound parameters are cleared to not risk leaking state from one usage to the next, etc. With the proposed design sqlite3_stmt can be used in a simple RAII style.
  • It seems simpler for the SQLite developers to implement since they can punt on a bunch of tricky cache design decisions. Those are moved to the user side of the fence instead. I can then apply whatever caching approach makes sense for my usage of the library.

These reasons are probably also the downsides as well though.

  • It's not automatic, so you're not improving performance transparently for all users.
  • It's a new API concept that will have to be maintained in a backwards compatible fashion. I can't judge if this is something you would want to commit to or not. I do think some notion of a compiled statement is sufficiently intrinsic and abstract that that shouldn't be a problem, but that's hard to predict.

(24) By Roger Binns (rogerbinns) on 2023-09-11 14:44:50 in reply to 17

The target of a statement cache is NOT an app developer writing code in C directly using the SQLite API, carefully and tightly controlling everything their code does.

It is the various wrappers for SQLite that provide a higher level of abstraction for other languages like Python, Perl, TCL, Java, C# etc. Those wrappers usually do not expose the low level SQLite statement machinery because there is no benefit, and it is a lot of work. For example in my Python wrapper you simply write:

connection.execute("select ... where ...", (7, "123 Main St")).get

At the C level it has to call sqlite3_prepare, the binding functions, sqlite3_step, the sqlite3_column functions, and sqlite3_finalize. There is no value in higher level languages or libraries in exposing all these underlying details.

And developers are usually using yet another library/layer of abstraction on top of this. They work with ORMs and records and Employee and Invoice with changes and reads working through the layers. That is how you end up with the same SQL statements being repeatedly generated, and why a cache is beneficial.

Would the cache be global state or tied to the database connection?

Connection since that is the scope of a sqlite3_stmt.

How is cache eviction controlled?

LRU is the norm for caching.

How can I guarantee ...

By writing the exact code you want.

The most straight forward implementation inside SQLite would be:

  • SQLITE_CAN_CACHE flag to sqlite3_prepare_v3 which says the sqlite3_stmt could come from a cache, and could be placed back into the cache after use
  • sqlite3_finalize could add the sqlite3_stmt to the cache if SQLITE_CAN_CACHE had been used in the prepare
  • sqlite3_db_config to control cache maximum size
  • SQLITE_OMIT_STATEMENTCACHE to exclude the code at compile time

For reference the statement cache in my Python wrapper is implemented entirely in C and follows this pattern. It is 350 lines of code, of which 50 is a function to get the cache statistics and entries. The cache itself is a simple linear array indexed by hash value of the SQL text, and allows the same SQL text to be present multiple times because that does happen.

it can be done in a way which yields more performance benefits than are available to clients doing whole-sqlite3_stmt caching

Could you elaborate on that a bit?

Using the existing public SQLite API only lets you do caching on the opaque sqlite3_stmt pointer. This is what sqlite3_stmt really is, and a cache inside SQLite could use that for optimization, with reduced memory usage being the most obvious. Another example is if a statement has to be reprepared by SQLite because the schema changed then it only has to be done once, not for every sqlite3_stmt of the same SQL.

(25) By Pepijn Van Eeckhoudt (pepijnve) on 2023-09-12 06:34:09 in reply to 24

The target of a statement cache is NOT an app developer writing code in C directly using the SQLite API, carefully and tightly controlling everything their code does.

I think there's value in both. One does not preclude the other. But I understand your point. It would be nice that things just automatically work for all users.

FWIW, I'm looking at this as both an application and a wrapper developer. I write application code in Java that uses SQLite. The wrapper I use is an in-house developed one that binds the C API using libffi. The C functions are all exposed directly at the lowest level. There's a more idiomatic OO version of that built on top of it that's still the SQLite API in structure. And then on top of that there's the JDBC driver.

Having the direct SQLite API available is still useful. There's an impedance mismatch between the JDBC API on the one hand and the SQLite API on the other hand. Where I really need minimum overhead it's beneficial to program with the SQLite API directly. And there, caching would be useful.

If there were a cache-facilitating API available in SQLite (see other thread) I would be inclined to implement that on the Java side in the JDBC layer. An LRU cache is much simpler to implement at that level (since it's part of the standard library already) rather than in C.

(28) By David Jones (vman59) on 2023-09-14 09:52:57 in reply to 17

Here's a proposal for a cache object that lets library writers request caching with a modicum of control over the cache size. The central idea is a provisional sqlite3_stmt that can be the fifth argument to sqlite3_prepare_v3 when flagged as such by the fourth argument.

/*
 * Proposed API for caching assistance.
 *
 * New functions:
 *    int sqlite3_stmtcache_create()
 *    int sqlite3_stmtcache_search()
 *
 * Modified functions:
 *    int sqlite3_prepare_v3()
 *    int sqlite3_stmt_status()
 */
int sqlite3_stmtcache_create (
 	sqlite3 *cnx,
	int max_entries,
	int max_program_size,
	int max_storage,
	sqlite3_stmt_cache **cache );   /* OUT: returns cache object address */

int sqlite3_stmtcache_search (
	sqlite3_stmt_cache *cache,
	const char *key,
	sqlite3_stmt **prov_stmt,     /* OUT: new object */
	int flags );
/*
 * The return-list flag for search makes prov_stmt return a bound
 * statement object whose result set is a list of the cache entries.
 * Schema for the cache's virtual table:
 *    CREATE TABLE x(key, state, create_time, size, hits);
 */
#define SQLITE_CACHE_RETURN_LIST 0x01

/*
 * Extra codes used by sqlite3_prepare_v3() and sqlite3_stmt_status().
 */
#define SQLITE_PREPARE_PROVISIONAL 0x08
/*  If set in prepflags, indicates the ppstmt argument is a pointer to the
 *  provisional object returned by sqlite3_stmtcache_search().
 */

#define SQLITE_STMTSTATUS_CACHE_STATE 98
/* Additional op code for sqlite3_stmt_status, whose 'counter' value is
 * one of the following values:
 *    SQLITE_CACHE_STATE_IGNORED
 *    SQLITE_CACHE_STATE_NASCENT
 *    SQLITE_CACHE_STATE_LOADED
 *    SQLITE_CACHE_STATE_OVERSIZE
 *    SQLITE_CACHE_STATE_BOUND
 */
#define SQLITE_CACHE_STATE_IGNORED 0
/* Cache not used, due to one of:
 *   - Statement was prepared without SQLITE_PREPARE_PROVISIONAL flag.
 *   - Search found nascent entry, continue without caching to avoid race.
 *   - Cache limit exceeded and could not free existing entry.
 */

#define SQLITE_CACHE_STATE_NASCENT 1
/* Possible status for provisional statement between search and prepare calls,
 * indicating an incomplete entry was created. The prepare call will compile
 * the SQL and complete the entry.
 */

#define SQLITE_CACHE_STATE_LOADED 2
/* Possible status for provisional statement between search and prepare calls,
 * indicating an existing entry was found that prepare may use.
 */

#define SQLITE_CACHE_STATE_OVERSIZE 3
/* Possible status for provisional statement between search and prepare calls,
 * indicating an existing entry was found, but previous prepare step created
 * an oversize entry. The prepare call will update cache statistics, convert
 * state to ignored, and proceed with prepare.
 */

#define SQLITE_CACHE_STATE_BOUND 4
/* State after prepare call, VDBE program is saved in cache entry. Statement
 * finalize will decrement ref count.
 */

(7) By David Jones (vman59) on 2023-09-06 12:26:14 in reply to 4

What about having a fork statement call which lets a thread create a new sqlite3_stmt object where the VDBE program is a reference back to the parent but all other state is independent? The thread would treat it as any other statement object and finalize would handle updating reference counts.

(11) By Larry Brasfield (larrybr) on 2023-09-07 16:11:00 in reply to 7

I would be wary of having that "parent" be somehow more special than the "forked" sqlite3_stmt objects. Lifetime management issues arise from that.

But there is nothing to prevent sqlite3_stmt objects from sharing a VDBE program. It is already kept in memory separate from the struct known as sqlite3_stmt. Its ownership is a one-to-one relation now, but that could change in a well known way.

(18) By Pepijn Van Eeckhoudt (pepijnve) on 2023-09-09 15:18:40 in reply to 4

Right. It's a key/value store where the keys are whole SQL statements and the values are VDBE programs.

The VDBE instructions can be affected by collected statistics during optimization, so "solely" is a slight stretch.

How would the cache handle the optimisation case? Would this (logically) be done by using sqlite3_expanded_sql to obtain the cache key?

(19) By Larry Brasfield (larrybr) on 2023-09-09 15:55:11 in reply to 18

How would the cache handle the optimisation case?

I don't understand the question. Optimization is done as (or before, actually) the VDBE program is generated. It is just that program that goes into the cache.1

Would this (logically) be done by using sqlite3_expanded_sql to obtain the cache key?

No. That API returns SQL with bound parameters substituted. The cache key needs to be SQL matching what clients present to sqlite3_prepare(); to do otherwise would defeat avoidance of the compilation.


  1. ^ Some data associated with the VDBE program would stay with it, such as how many registers it needs, their initialization, etc.

(22) By Pepijn Van Eeckhoudt (pepijnve) on 2023-09-11 08:39:22 in reply to 19

I don't understand the question. Optimization is done as (or before, actually) the VDBE program is generated. It is just that program that goes into the cache.1

Perhaps the documentation is wrong or I'm misinterpreting it, but today sqlite3_prepare docs state:

If the specific value bound to a host parameter in the WHERE clause might influence the choice of query plan for a statement, then the statement will be automatically recompiled, as if there had been a schema change, on the first sqlite3_step() call following any change to the bindings of that parameter.

Intuitively that makes sense since depending on statistics the most optimal query plan (and as a corollary the VDBE program) may differ. That's why I was asking the expanded_sql-as-key question.

(9) By Pepijn Van Eeckhoudt (pepijnve) on 2023-09-07 13:09:42 in reply to 1

This would, I submit, reduce motivation for applications to be keeping prepared statements lying around or to be trying to use a single prepared statement from multiple threads with the synchronization that entails.

How would this operate in a multi-threaded context? Since SQLite does not support/allow use of database connections and derived objects from more than one thread concurrently, concurrent queries require multiple database connections. To get any benefit from a VDBE cache, the cache would have to cross connection boundaries. Or am I misinterpreting the multithreaded use restrictions?

Unrelated to that, it feels (purely on intuition) like the cache management might get tricky very quickly. Schema changes are one thing, but if I understand it correctly statistics also play a role in query planning. So running analyze might have to invalidate the cache as well. Or would you provide a manual cache invalidation mechanism to avoid getting stuck with a suboptimal cached query plan?

That being said, I have often thought while working on my application that recompiling the same SQL string to VDBE seems so wasteful as well. The API I was intuitively looking for to help with that was something that could give me an opaque pointer to the compiled version of the statement that I could pass in to sqlite3_prepare as something it might be able to reuse. Something explicit rather than transparent/automatic in other words.

(10) By Larry Brasfield (larrybr) on 2023-09-07 15:59:02 in reply to 9

How would this operate in a multi-threaded context?

Clearly, for library builds that support multithreading, the caching mechanism would have to be thread safe. There are many ways to do that, and the critical sections needed to do so would be short. Conceptually, it is the same problem that is presented by thread-safe containers and allocators.

Since SQLite does not support/allow use of database connections and derived objects from more than one thread concurrently, concurrent queries require multiple database connections. To get any benefit from a VDBE cache, the cache would have to cross connection boundaries.

That last assertion slightly overstates the case. Even a single-threaded (and single-connection) application could benefit from avoided SQL recompilation.

But your connection boundaries point is apt. To be most useful, the VDBE program cache would need to be available to multiple connections, just as allocated memory is. And it would need similar protection against concurrent access. But that protection need not extend to use of the cached (or shared out) objects; it only needs to cover the management of the cache or updates of the VDBE program objects when schema changes occur.

... So running analyze might have to invalidate the cache as well.

That's an interesting proposition. The effects of ANALYZE are optimizations, affecting performance rather than correctness. Having prepared statements lying around, even today, is bound to occasionally result in lost optimization opportunities. I would not view this as "cache invalidation", but the question of when ANALYZE takes effect, and whether any SQL compilation can be reused across execution of ANALYZE, certainly deserves careful thought.

Or would you provide a manual cache invalidation mechanism to avoid getting stuck with a suboptimal cached query plan?

Perhaps running "ANALYZE ..." implicitly has that effect. After all, it takes time (presumably), and dumping unused cache objects incurs just a time penalty. Even cache objects presently held by sqlite3_stmt objects could be marked for recompilation (or replacement with a recompiled VDBE program).

... recompiling the same SQL string to VDBE seems so wasteful as well.

Yes, that is why the Tcl adapter keeps some prepared statements lying around.

[on explicit management of compiled statements]

In my view, that's a much bigger step. I would defer that until the details of automatic management are worked out and some experience is gained with it. However, that sort of control is why I mentioned that sqlite3_prepare_v3() might gain a flag value to affect caching.

(12) By anonymous on 2023-09-07 20:43:55 in reply to 10 [link] [source]

sounds like https://www.sqlshack.com/understanding-sql-server-query-plan-cache/

(26) By Donal Fellows (dkfellows) on 2023-09-12 11:10:25 in reply to 10 [link] [source]

To be most useful, the VDBE program cache would need to be available to multiple connections

However, that requirement makes the cache massively more complex. By comparison, making the cache per-connection simplifies things a lot, since so much other state is also per-connection, and we already know that the cost of recompiling SQL to a VDBE program is usually not too great, as that is what happens right now. It won't support all scenarios equally well, but it will make the implementation simpler. (It also means that the implementation of ANALYZE doesn't need to be careful; it can just spoil the cache if it changes anything at all, an operation easily implemented with an in-memory-only epoch counter.)

The observation that it would be useful to have a flag to mark a statement as being cacheable/non-cacheable makes a lot of sense to me. It doesn't require the implementation to cache statements, but it lets client code indicate that some statements are definitely not a good idea to cache. It would be entirely a matter for the cache implementation whether it has size limits for SQL or implements a fixed size with an LRU eviction policy. (Tcl's regular expression engine does that sort of trick on a per-thread basis.)

(27) By Stephan (stephancb) on 2023-09-12 15:31:16 in reply to 26 [link] [source]

Could such a VDBE program cache live in a new system table (like `sqlite_schema`) which (optionally) goes together with the database file?

  • then multiple connections could use it?
  • as a bonus, such a table would provide some kind of "frequently used queries" (FUQs) record demonstrating how the database has been queried (the table would need both the original SQL string and the VDBE program, which the cache probably would hold anyway).

I realize some complications with this, like attached databases.

(29) By mlaw (tantaman) on 2023-12-11 16:44:46 in reply to 1 [link] [source]

Any updates on the statement cache? Is it likely to be implemented?