SQLite User Forum: Native support for pluggable backends in SQLite

Ben, Gunter, others,

Ben's proposal is the sort of philosophy we're implementing in LumoSQL,
although we are approaching it from the opposite direction. In LumoSQL we are
trying to identify what is missing in SQLite, and then implement
minimally-intrusive ways of addressing those problems. As of LumoSQL v0.3 we
have working, benchmarked prototypes and are starting to close in on what a
backend API might look like.

Gunter raised a point about the Virtual Table Interface which I respond to
near the bottom.

I will leave a direct answer to Ben about header files and API
details to another post. Because I feel we need to know why we are making
changes, and then how we're going to do it, and finally what should be done.
I have already discussed the general source architecture with Ben (and of
course he was working in this area years ago) so I'll be glad to move the
code architecture chat out in public.

We have done a great deal of work on the "what's missing?" problem, including
[extensively documenting existing knowledge](https://lumosql.org/src/lumodoc/doc/trunk/doc/lumo-relevant-knowledgebase.md).
I recommend anyone interested in storage APIs for SQLite to look there
for inspiration in code and papers related
to SQLite, K-V stores, VFS, benchmarking, transaction theory and more.
A lot of SQLite work has taken place in isolated hard forks,
and we are trying to make it easier for these forks to feel less isolated.

None of this overlooks the value of the existing binary on-disk format
of SQLite as relied on by billions of people daily. That's another
part of the discussion. This discussion is about what actions will
address the pressure for SQLite storage to do more and different things.

# What Is Missing in SQLite Storage?

### Nobody Really Knows :-)

In terms of performance, reliability and tradeoffs, nobody knows what is
missing. That is because there is no consistent way of comparing one version
of SQLite with a modified version, and certainly not for repeating these
comparisons across many different users, systems and build environments.
That's why we [built benchmarking](https//lumosql.org/benchmarking), which
also works for
[vanilla SQLite-on-SQLite](https://sqlite.org/forum/forumpost/e542545a47).

### Ability to Try New or Different K-V Options

There is a small number of K-V stores that provide the facilities SQLite
needs: MVCC, compatible transaction model, a Berkely Database-style API, and
implemented in C.
The main candidates are: LMDB, BDB (from when Oracle bought Sleepycat) and BDB (forked by Bloomberg when Oracle bought Sleepycat, and adding things like
page-level locking.)
[Karl Malbrain's database](https://github.com/malbrain/database/wiki) is a
candidate that implements the
[Serial Safety Net concurrency protocol](www.cs.cmu.edu/~./pavlo/papers/p781-wu.pdf), which rethinks the way concurrency is handled in a K-V store.
This is all about looking at alternative futures for SQLite.

### A Modern Alternative to 1990s-era WAL files

Every mainstream SQL database including SQLite uses 1990s-era batch design
to achieve reliabile persistent page storage. This works, but it also
ignores the last two decades of development in virtual memory management.
Modern operating systems know how to save and retrieve pages from disk with
very low corruption risk and little-to-no recovery requirements. SQLite has
an option to try this out, because LMDB is ubiquitous in open source
codebases and we know a lot about how it behaves. SQLite is the only
mainstream database that has any chance of exploring 21st century page
management. SQLite+LMDB has no WAL files, and greatly reduced pager code.

### Better Corruption Detection

SQLite's main techniques for avoiding corruption seems to be to take extreme
care in the WAL implementation, and to benefit from efficient code that
limits the time window for errors - plus not needing to worry about
consistency when doing network replication. The major databases (Pg, MS,
Oracle, MySQL, MariaDB) all take the approach of having in-and-out of band
corruption detection mechanisms. Being an embedded database, SQLite is often
used in quite unforgiving hard-crash scenarios, and the question "who has
regular difficulty with corrupted SQLite databases?" always results in a
forest of raised hands at talks and conferences.

What's missing therefore is:
* pre-emptive corruption detection (see next point)
* option to use the operating system for this (see previous point)

Unlike the other main databases, SQLite has no pre-emptive corruption
detection and only fairly basic on-demand detection.

### Checksummed Corruption Detection

For reasons that are unclear, not one mainstream database has the facility
to do fine-grained corruption detection that is directly visible to the
user. Richard Hipp recently wrote the
[SQLite Checksum VFS](https://sqlite.org/cksumvfs.html) which shows
some of the possibilities, with many limitations.

LumoSQL has investigated this extensively and designed a
[robust system for corruption detection and magic](https://lumosql.org/src/lumodoc/doc/trunk/doc/lumo-corruption-detection-and-magic.md)
. This needs to be part of the backend API abstraction so that it applies to
all K-V stores.

### Privacy Support is Missing

Encryption is not electronic privacy, but it is a pre-requistite for it.
Since assorted new privacy laws mandate reliable encryption, and since
non-crypto-capable SQLite is deployed pervasively, something needs to
change. Exactly what or how is unclear because there are many existing
options. LumoSQL has documented most of the crypto work that has been
done relevant to SQLite, and looked at the new requirements that either
exist or are coming soon, and is developing a design that works with
multiple backends.

### Networking

This isn't further discussed here, however several of the SQLite hard
forks have added networking backends. LumoSQL is trying to design the
backend storage API so that it is ***not incompatible with networking***,
which is not the same thing as supporting it.

# Why Not Use an Existing SQLite API?

An alternative K-V stores can be implemented as a Virtual Table.
That is what
[Gigimushroom's Database Backend Engine](https://github.com/gigimushroom/DatabaseBackendEngine)
did in 2019. But that won't give access to the internal features of
the VDBE-and-below layer that is the secret to performant storage in SQLite,
which in turn is why so many SQLite forks have choses to access these
internals directly.

It is interesting to read SQLCipher's comments on not using existing APIs
for their encryption product. A storage engine isn't the same as an
encryption layer, but there are design issues in common. SQLCipher says:

> SQLCipher is an extension to SQLite, but it does not function as a loadable plugin for many reasons. Instead, SQLCipher modifies SQLite itself, and is maintained as a separate version of the source tree. SQLCipher releases are baselined against a specific source version of SQLite. However, the project minimizes alterations to core SQLite code to reduce the risk of breaking changes during upstream SQLite merges.
>
> The reasons that SQLCipher is packaged this way, as opposed to a "plugin" or extension to the SQLite amalgamation, follow:
>
> * Enabling an SQLite codec requires the compile-time definition of SQLITE_HAS_CODEC, which is not present on standard, unmodified SQLite builds.
> * Even when enabled, SQLite isn't setup to load codecs as plugins. While SQLite does have a plugin function for loadable extensions, it does not extend access to any system internals (it mainly used to allow custom user functions).
> * SQLCipher makes calls to internal functions that are not part of the public SQLite API. Sometimes these APIs change, even in between minor SQLite versions. Thus, each update adn merge requires inspection, testing and verification. Making SQLCipher portable across multiple versions of SQLite would not be feasible, nor could it to use only the public API (for instance, even the first critical step of attaching the codec callback to the pager uses an internal API).
> * SQLCipher modifies supporting functions to introduce special pragmas, built in functions, etc (e.g. "PRAGMA cipher_*"). Injecting this functionality in a plugin architecture wouldn't be possible.
> * SQLCipher's test harness relies on support in testfixture to take advantage of the test API and various internal checks (memory reference counting, etc.)
> * Even if it were possible to use a loadable plugin, dynamic libraries aren't available on all supported platforms, for example iOS

For these reasons, and more, LumoSQL is implementing a backend storage API
abstraction.