SQLite User Forum: Native support for pluggable backends in SQLite

Thank you, anonymous, for your detailed analysis!

> - The comparison function will presumably need to be set for the cursor (when you are opening it), I would expect. (This is only applicable for cursors that do not use 64-bit integer keys, though.)
> - Each "KV table" has either 64-bit integer keys (which always uses numerical order, I think) and arbitrary sequences of bytes as data, or the arbitrary byte sequence used as data is also used as the key and no separate key is stored (in which case the comparison function is needed, I think). [...]

Key handling is one of the biggest pain points of this whole idea, because the way SQLite 3 handles btree keys internally is far from "clean". SQLite 4 was going to clean this up, but what can you do? :)

From your comment, I think you are saying that the details of key comparison vary at runtime based on the schema. The API I was imagining would hide this from the back-ends, but there are definitely pros and cons to each approach. Exposing the comparison details to each back-end means potentially more redundant code per back-end, but also potentially more control. I'm not opposed to either way.
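To make the "abstracted" option a bit more concrete, here is a rough sketch in which the core hands a comparison callback to the cursor when it is opened, and the back-end only ever calls that callback. Every name here (`kv_cursor`, `kv_compare_fn`, `kv_cursor_open`, and so on) is made up for illustration; none of it is existing SQLite API, and storing integer keys as a native `int64_t` is just an assumption to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback type: returns <0, 0, or >0, like memcmp. */
typedef int (*kv_compare_fn)(void *ctx,
                             const void *a, size_t na,
                             const void *b, size_t nb);

/* Hypothetical cursor: the back-end stores the callback and never
   interprets key bytes itself. */
typedef struct kv_cursor {
  kv_compare_fn cmp;
  void *cmp_ctx;
} kv_cursor;

/* Default order for opaque byte keys: bytewise, with the shorter key
   sorting first when one key is a prefix of the other. */
int kv_bytes_compare(void *ctx, const void *a, size_t na,
                     const void *b, size_t nb) {
  size_t n = na < nb ? na : nb;
  int c = n ? memcmp(a, b, n) : 0;
  (void)ctx;
  if (c != 0) return c < 0 ? -1 : 1;
  return (na > nb) - (na < nb);
}

/* Numeric order for keys stored as a native int64_t (assumption). */
int kv_int64_compare(void *ctx, const void *a, size_t na,
                     const void *b, size_t nb) {
  int64_t x, y;
  (void)ctx; (void)na; (void)nb;
  memcpy(&x, a, sizeof x);
  memcpy(&y, b, sizeof y);
  return (x > y) - (x < y);
}

/* The comparison function is fixed when the cursor is opened.
   Passing NULL selects the bytewise default. */
void kv_cursor_open(kv_cursor *cur, kv_compare_fn cmp, void *ctx) {
  cur->cmp = cmp ? cmp : kv_bytes_compare;
  cur->cmp_ctx = ctx;
}

/* What a back-end would call while searching or inserting. */
int kv_cursor_compare(const kv_cursor *cur,
                      const void *a, size_t na,
                      const void *b, size_t nb) {
  return cur->cmp(cur->cmp_ctx, a, na, b, nb);
}
```

The trade-off is visible even at this size: the back-end stays small, but it cannot exploit any knowledge of the key layout (for example, to build a prefix-compressed index) unless the API exposes more than an opaque callback.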

> - For accessing the data stored in the header of the standard SQLite file format, you could have a method that is given the number which is the offset in the standard SQLite format, for example 68 to read or set the application ID, or 56 for the text encoding. Some of these header fields will not be applicable to custom storage engines.

This could be better. The concerns are (1) flexibility (forward compatibility), (2) usefulness (allowing back-ends to actually use fields if they have reason to), and (3) minimizing code for back-ends that don't care. My hunch is that using byte offsets is "less general", but at this point the SQLite format is well documented and unlikely to change, so... Yeah, either way.
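For reference, the offset-based variant could look roughly like the sketch below. The offsets 56 (text encoding) and 68 (application ID) come from the documented SQLite 3 header, which is 100 bytes long and stores multi-byte integers big-endian. The function and constant names are hypothetical, and the "back-end" here is nothing more than a 100-byte buffer in memory.

```c
#include <stdint.h>

/* Offsets taken from the documented SQLite 3 file header. */
enum {
  KV_HDR_SIZE = 100,
  KV_HDR_TEXT_ENCODING = 56,
  KV_HDR_APPLICATION_ID = 68
};

/* Hypothetical accessor: read a big-endian 32-bit header field.
   Returns 0 on success, -1 if the offset does not fit in the header. */
int kv_header_get_u32(const unsigned char *hdr, int offset, uint32_t *out) {
  if (offset < 0 || offset > KV_HDR_SIZE - 4) return -1;
  *out = ((uint32_t)hdr[offset] << 24)
       | ((uint32_t)hdr[offset + 1] << 16)
       | ((uint32_t)hdr[offset + 2] << 8)
       |  (uint32_t)hdr[offset + 3];
  return 0;
}

/* Hypothetical accessor: write a big-endian 32-bit header field. */
int kv_header_set_u32(unsigned char *hdr, int offset, uint32_t value) {
  if (offset < 0 || offset > KV_HDR_SIZE - 4) return -1;
  hdr[offset]     = (unsigned char)(value >> 24);
  hdr[offset + 1] = (unsigned char)(value >> 16);
  hdr[offset + 2] = (unsigned char)(value >> 8);
  hdr[offset + 3] = (unsigned char)value;
  return 0;
}
```

A back-end with its own file format would presumably map the offsets it cares about onto its own metadata and return an error for the rest, which is where the "fields that are not applicable" question comes in.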

> If multiples are loaded, and one finds that the file is not in its format, it can return SQLITE_NOTADB to indicate that the next one should be tried. If it does care about the filename, its xOpen method can check the filename and return SQLITE_NOTADB based on the filename, without trying to open the file.

This is an interesting idea, but it could make detecting corruption non-trivial (or make automatically "fixing" it destructive, if the back-end uses a different format). SQLite makes incredible guarantees for reliability and long-term support, so I don't want to cut any corners that might introduce problems down the line. Storing yet more info in the filename is bad, but in this case I think it's necessary, given that the entire file content is under the control of the back-end.

I can also imagine back-ends that might wrap the existing btree (e.g. for transparent compression at the KV level), which you wouldn't want the btree back-end to try to open directly, even though the header would be compatible.
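As a rough illustration of selecting by filename instead of probing the file content, the lookup could be as simple as the sketch below. The registry, the back-end names, and the `.zdb` extension are all invented for the example; the only real assumption is that anything unrecognized falls through to the standard btree.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical registry entry. */
typedef struct kv_backend {
  const char *name;
  const char *ext;   /* filename suffix including the dot, or NULL */
} kv_backend;

/* Entry 0 is the default and matches any filename. The second entry
   stands in for a wrapper such as a compressing layer over the btree. */
static const kv_backend kv_backends[] = {
  { "btree",  NULL   },
  { "zbtree", ".zdb" },
};

/* Return nonzero if s ends with suffix. */
static int kv_has_suffix(const char *s, const char *suffix) {
  size_t ns = strlen(s), nx = strlen(suffix);
  return ns >= nx && strcmp(s + ns - nx, suffix) == 0;
}

/* Pick a back-end from the filename alone, without opening the file. */
const kv_backend *kv_choose_backend(const char *filename) {
  size_t i;
  for (i = 1; i < sizeof kv_backends / sizeof kv_backends[0]; i++) {
    if (kv_has_suffix(filename, kv_backends[i].ext)) {
      return &kv_backends[i];
    }
  }
  return &kv_backends[0];
}
```

With this scheme, a corrupt `.zdb` file is reported as corrupt by the one back-end responsible for it, rather than being passed along to the next candidate.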

> It might need to indicate savepoint numbers.

Yes, quite likely. Good catch.

> There are a few other problems with the API that you have specified. (For one thing, some of it is unclear. Some things might also be unnecessary, and some things might be missing.)

Yes, indeed. There would definitely be at least a few minor changes discovered during implementation.

I'm mainly concerned about what the SQLite authors (Dr. Hipp et al.) think of the overall direction, including some of the "policy" questions, such as using filename extensions to identify back-ends and handling keys as opaque buffers, and whether there is any interest in such an invasive and "risky" change at all.

Thanks again for your comments!

--Ben