Key-Value VFS (kvvfs)

kvvfs - the Key/Value VFS - is an SQLite3 VFS which delegates storage of its pages and metadata to a key-value store.

SQLite releases prior to 3.52 include only basic support for this VFS - only APIs on this page specifically mentioning "version 1" are available. As of 3.52, it has been significantly extended.

Kvvfs was conceived in order to support JS's localStorage and sessionStorage objects. Its native implementation uses files as key/value storage (one file per record) but the JS implementation replaces a handful of methods so that it can use the aforementioned JS objects as storage. Version 1 of kvvfs is specifically hard-coded to localStorage and sessionStorage so has historically only been available in the main UI thread. Version 2 lifts that limitation, at the cost of persistence, in Worker threads (it uses transient storage instead). Even so, backing up such kvvfs storage is easy, so those databases can be made persistent with a little extra effort.

Kvvfs encodes each page of a database into a bespoke ASCII encoding so that each can be stored in a JS string. Each page is stored as a distinct key in the underlying storage object, along with well-defined keys for some metadata like the db's unencoded size and its journal, each as individual records.

Kvvfs, because of that encoding, is significantly less efficient than a plain in-memory db but it also, as a side effect of its design, offers a JSON-friendly interchange format for exporting and importing databases without requiring sqlite3_serialize() or an SQL dump.

Kvvfs is probably not appropriate for heavy db loads. It is relatively malloc()-heavy, having to de/allocate frequently, and it spends much of its time converting the raw db pages into and out of an ASCII encoding. See also: #performance

Summary of quirks and limitations:

Using Kvvfs

For version 1 usage, see ./persistence.md#kvvfs. That all applies here but it does not cover version 2 uses.

Version 2

Version 2 of kvvfs extends it to support using Storage-like objects as backing storage, Storage being the JS class which localStorage and sessionStorage both derive from. This essentially means that it uses Storage-like objects to house in-memory databases, each page of the database living in a different property of the storage object.

Version 2 remains compatible with version 1 databases and always writes localStorage/sessionStorage metadata in the v1 format, so such dbs can be manipulated freely by either version. For transient storage objects (new in version 2), the format of the record keys is simplified: it elides info which is redundant in this context, so the keys require less space than version 1 keys. (It's a long story.)

Version 2's new capabilities:

VFS Flags

Kvvfs accepts the following URI-style flag:

Storage Objects

The JavaScript Storage class is the basis of localStorage and sessionStorage. Those objects are implemented in native code, so Storage cannot be subclassed from JS code. Even so, we can use Storage-like objects (same interface) to plug in transient storage. Rather than expose an object interface which we are beholden to support forever, this API exposes a virtual filesystem in which each file is, behind the scenes, a storage object referenced client-side only by name.
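To make "Storage-like" concrete, here is a minimal sketch of an object that offers the same methods as the browser's Storage interface (length, key(), getItem(), setItem(), removeItem(), clear()) on top of a Map. The class name TransientStorage and the record keys used in the demo are hypothetical. This is not kvvfs's own storage class, only an illustration of the interface shape.

```javascript
// Hypothetical Map-backed stand-in for the browser's Storage interface.
// Like Storage, it coerces keys and values to strings.
class TransientStorage {
  #records = new Map();

  get length() {
    return this.#records.size;
  }

  key(index) {
    // Storage.key() returns null for out-of-range indexes.
    return [...this.#records.keys()][index] ?? null;
  }

  getItem(key) {
    const k = String(key);
    return this.#records.has(k) ? this.#records.get(k) : null;
  }

  setItem(key, value) {
    this.#records.set(String(key), String(value));
  }

  removeItem(key) {
    this.#records.delete(String(key));
  }

  clear() {
    this.#records.clear();
  }
}

// Demo with made-up record names: one "page" and one size record.
const store = new TransientStorage();
store.setItem("1", "encoded-page-data");
store.setItem("sz", 4096);
```

Because values are coerced to strings, the numeric 4096 above is read back as the string "4096", just as it would be from localStorage.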

Built-in storage pool names:

Each name, and thus each storage unit, refers to a single database. Kvvfs supports only one database and one journal per storage unit. Kvvfs journals are stored as a single record within the database's own storage object.

In the case of localStorage and sessionStorage, a database may be hosted in them alongside other client-side data, with the caveat that they will contend for space and may outright conflict with database handles from other browser tabs which use that same storage. All storage keys in those two specific storage units are prefixed with kvvfs-(local|session)-, so as to not collide with any client data.
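The key prefix makes it possible to tell kvvfs records apart from other client data in the same storage object. The following sketch lists the keys which carry that prefix. The helper name kvvfsKeys, the stand-in storage object, and the sample record names are all hypothetical; in a browser one would pass localStorage or sessionStorage instead.

```javascript
// List the keys in a Storage-like object which carry the kvvfs prefix
// for the given storage unit ("local" or "session").
function kvvfsKeys(storage, which) {
  const prefix = `kvvfs-${which}-`;
  const keys = [];
  for (let i = 0; i < storage.length; ++i) {
    const k = storage.key(i);
    if (k !== null && k.startsWith(prefix)) keys.push(k);
  }
  return keys;
}

// Minimal stand-in for localStorage so the sketch runs outside a browser.
const fakeLocalStorage = (() => {
  const m = new Map();
  return {
    get length() { return m.size; },
    key: (i) => [...m.keys()][i] ?? null,
    getItem: (k) => (m.has(k) ? m.get(k) : null),
    setItem: (k, v) => void m.set(String(k), String(v)),
  };
})();

fakeLocalStorage.setItem("theme", "dark");              // unrelated client data
fakeLocalStorage.setItem("kvvfs-local-example", "x");   // made-up kvvfs record
const found = kvvfsKeys(fakeLocalStorage, "local");
```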

Managing Kvvfs Storage

sqlite3_js_kvvfs_clear() and sqlite3.kvvfs.clear()

sqlite3_js_kvvfs_clear() (version 1) and sqlite3.kvvfs.clear() (version 2) clear kvvfs storage. They differ only in their argument's default value:

For backwards compatibility, passing an empty string as the which value resolves to both 'local' and 'session' (which will only have an effect in the main UI thread unless a separate thread specifically installs storage units with those names).

These functions normally throw if the storage is currently opened by a db, as no good can come from wiping the storage out from under an opened database, but there is an exception: for backwards compatibility reasons, the local and session storage objects may be wiped while they are in use. Kvvfs version 1 was not equipped to recognize that the db was in use so permitted this operation.

Sidebar: in fact, kvvfs recovers silently from a storage wipe while it's in use, so long as the db is re-initialized, e.g. its schema reinstalled, before any further operations on it. This being a large foot-gun, however, these routines specifically do not permit it except for the aforementioned compatibility cases.

sqlite3_js_kvvfs_size() and sqlite3.kvvfs.size()

sqlite3_js_kvvfs_size() (version 1) and sqlite3.kvvfs.size() (version 2) return an estimate of the number of bytes of storage used for the database, including its keys and values. Their signatures and their arguments' semantics are exactly as documented for clear(); only the operation they perform differs.
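As an illustration of what such an estimate can look like, the sketch below sums the lengths of the matching keys and their values in a Storage-like object and counts two bytes per UTF-16 code unit. The two-bytes-per-unit accounting, the helper name estimateStorageBytes, and the sample records are assumptions made for this example; the figure reported by the library may be computed differently.

```javascript
// Rough size estimate for a Storage-like object. JS strings are
// sequences of 16-bit code units, so count 2 bytes per unit for
// both keys and values. Only keys starting with `prefix` are counted.
function estimateStorageBytes(storage, prefix = "") {
  let units = 0;
  for (let i = 0; i < storage.length; ++i) {
    const k = storage.key(i);
    if (k === null || !k.startsWith(prefix)) continue;
    units += k.length + (storage.getItem(k) ?? "").length;
  }
  return units * 2;
}

// Stand-in storage so the sketch runs outside a browser.
const records = new Map([
  ["theme", "dark"],              // unrelated client data
  ["kvvfs-local-a", "abcd"],      // made-up kvvfs record
]);
const sampleStorage = {
  get length() { return records.size; },
  key: (i) => [...records.keys()][i] ?? null,
  getItem: (k) => (records.has(k) ? records.get(k) : null),
};

const kvvfsBytes = estimateStorageBytes(sampleStorage, "kvvfs-local-");
const totalBytes = estimateStorageBytes(sampleStorage);
```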

sqlite3.kvvfs.export()

Usages:

Exports a kvvfs storage object to an object, optionally JSON-friendly, which can later be passed to import() to restore it. The primary benefits over sqlite3_deserialize() are simplified usage and a JSON-friendly storage format.

Passing it only a name is equivalent to passing it {name:thatName}.

Its options object argument may have the following properties:

Throws if this db is not opened.

The returned object is structured as follows...

The kvvfs-related encoding of the db pages is not part of this interface - it is simply passed on as-is. Interested parties are directed to src/os_kv.c in the SQLite source tree, with the caveat that that code also does not offer a public interface. i.e. the encoding is a private implementation detail of kvvfs. The format may be changed in the future but kvvfs provides strong backwards compatibility guarantees and will continue to support the current format.
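Since the exported pages are passed through as-is, a backup can treat the export as an opaque value. The sketch below round-trips one through a JSON string. It assumes that calling export() with only a name yields a JSON-friendly object; the helper names are hypothetical, and the kvvfs parameter stands for sqlite3.kvvfs in a real application.

```javascript
// Serialize a kvvfs storage unit to a JSON string. Assumes the object
// returned by export(name) is JSON-friendly (an assumption of this sketch).
function backupToJson(kvvfs, storageName) {
  return JSON.stringify(kvvfs.export(storageName));
}

// Restore a backup made by backupToJson(). The second argument to
// import() is its overwrite flag, which defaults to false.
// Errors thrown by import() propagate to the caller.
function restoreFromJson(kvvfs, json, overwrite = false) {
  kvvfs.import(JSON.parse(json), overwrite);
}
```

The resulting string can be stored wherever is convenient (a download, a server upload, IndexedDB) and later handed back to restoreFromJson().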

sqlite3.kvvfs.import()

Usage: sqlite3.kvvfs.import(exportObj [, overwrite=false])

Expects an object from export(). On success, it replaces the storage named by exportObj.name with the given import. Throws on error. Error conditions include:

sqlite3.kvvfs.reserve()

Usage: sqlite3.kvvfs.reserve(storageName)

If no kvvfs storage exists with the given name, one is installed. If one exists, its reference count is increased so that it won't be freed by the closing of a database or journal file which currently has it opened.

Throws if the name is not valid for a new storage object.

The built-in storage objects are all preinstalled and have artificially high reference counts, so they will normally not be freed.

sqlite3.kvvfs.unlink()

Usage: sqlite3.kvvfs.unlink(storageName)

Conditionally "unlinks" a kvvfs storage object, reducing its reference count by 1.

This is a no-op if storageName ends in "-journal" or refers to a built-in storage object.

It will not lower the refcount below the number of currently-opened db/journal files for the storage (so that it cannot delete the storage out from under them).

If the refcount reaches 0 then the storage object is removed.

Returns true if it reduces the refcount, else false. A result of true does not necessarily mean that the storage unit was removed, just that its refcount was lowered.
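The reference-counting rules for reserve() and unlink() can be summarized with a toy model. The class below is not the library's implementation: the name StoragePoolModel, the open() helper, and the assumption that each opened db/journal file holds one reference are all made up for illustration.

```javascript
// Toy model of the reserve()/unlink() reference-counting rules.
class StoragePoolModel {
  #units = new Map(); // name -> { refs, opened, builtin }

  constructor(builtinNames = []) {
    for (const name of builtinNames) {
      // Built-ins start with an artificially high refcount.
      this.#units.set(name, { refs: 1 << 20, opened: 0, builtin: true });
    }
  }

  // Install the storage if it is missing, else add one reference.
  reserve(name) {
    const unit = this.#units.get(name);
    if (unit) unit.refs += 1;
    else this.#units.set(name, { refs: 1, opened: 0, builtin: false });
  }

  // Simulate a db or journal file opening existing storage
  // (assumption: each open handle holds one reference).
  open(name) {
    const unit = this.#units.get(name);
    if (!unit) throw new Error(`No such storage: ${name}`);
    unit.refs += 1;
    unit.opened += 1;
  }

  // Drop one reference if allowed. Returns true only if the
  // refcount was actually lowered.
  unlink(name) {
    const unit = this.#units.get(name);
    if (!unit || unit.builtin || name.endsWith("-journal")) return false;
    if (unit.refs <= unit.opened) return false; // never undercut open handles
    if (--unit.refs === 0) this.#units.delete(name);
    return true;
  }

  has(name) {
    return this.#units.has(name);
  }
}

const pool = new StoragePoolModel(["local"]);
pool.reserve("mydb");                   // installs "mydb" with one reference
pool.reserve("mydb");                   // second reference
const first = pool.unlink("mydb");      // lowers the count; storage remains
const second = pool.unlink("mydb");     // count reaches 0; storage is removed
const builtin = pool.unlink("local");   // no-op for built-in storage
```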

sqlite3.kvvfs.listen()

Usage: sqlite3.kvvfs.listen(listener)

Kvvfs v2 supports asynchronous event listeners for some of its actions. Their intent is to provide a way to back up databases incrementally at the db page level, as opposed to the record level, in the manner commonly used by streaming-based backup solutions.

The argument must be an object with the following properties:

Passing the exact same object to unlisten() will remove the listener.

Each event listener is a callback function with the following interface:

Each event callback gets passed a single object with the following properties:

The key provided to write and delete, and the value provided to write, are in one of the following forms:

For local and session storage, all of those keys have a prefix of 'kvvfs-local-' or 'kvvfs-session-', respectively. This is required both for backwards compatibility and to enable dbs in those storage objects to coexist with client data. Other storage objects do not have a prefix.

Design note: JS StorageEvents are only available in the main thread, which is why the listeners are not based on that.
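To illustrate the incremental-backup idea, the sketch below keeps a mirror of a storage object up to date by applying write and delete changes to a Map. The helper name applyChange and its argument list are hypothetical; how the key and value are obtained from the object passed to a listener callback is deliberately left to the caller, and the sample keys and values are made up.

```javascript
// Apply one page-level change to a mirror of a kvvfs storage object.
// `type` is "write" or "delete"; `value` is only used for writes.
function applyChange(mirror, type, key, value) {
  switch (type) {
    case "write":
      mirror.set(key, value);
      break;
    case "delete":
      mirror.delete(key);
      break;
    default:
      throw new Error(`Unsupported change type: ${type}`);
  }
  return mirror;
}

// Replay a short, made-up sequence of changes.
const mirror = new Map();
applyChange(mirror, "write", "1", "encoded-page-1");
applyChange(mirror, "write", "2", "encoded-page-2");
applyChange(mirror, "delete", "2");
```

A mirror maintained this way only ever needs the changed records, which is what makes the approach suitable for streaming changes to a remote backup.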

sqlite3.kvvfs.unlisten()

Usage: sqlite3.kvvfs.unlisten(listener)

Removes the kvvfs event listeners for the given options object. It must be passed the same object instance which was passed to listen().

Returns true if it unmaps the listener, else false. If a listener is added multiple times, unlisten() removes all of them.

This has no side effects if listener is invalid or is not a match for any listeners.

Misc.

Performance

Benchmark-wise it's difficult to pinpoint because (A) it's highly dependent on build-time optimization levels and (B) JS VM performance can vary wildly across runs. For small/moderate job sizes it's generally roughly 1/2-1/3 the speed of an in-memory db. That delta is well below human-perceptible levels for individual queries and smallish workloads[2]. That gap does, however, grow as the workload sizes increase into the hundreds of thousands of queries.

We can compare overall db performance of in-memory and kvvfs databases with SQLite's "speedtest1" tool. It runs many types of queries and has a configurable "job size" which determines how much work it does. A higher job size directly maps to more db I/O. In its native build, speedtest1's default job size is 100, but that is more than a million queries and takes a long time to run in a browser. The job size is an abstraction: it does not tell us how many queries will be run. See the output of speedtest1 in the links below for the precise counts.

Comparisons of various speedtest1 --size N values:


  1. ^ This is largely a remnant of version 1's limitations. No effort has yet been made to change that behavior in version 2 but it "should" be possible without too much work.
  2. ^ Recall that localStorage and sessionStorage have modest storage limits, so there is no possibility to store monstrous databases in them. Non-persistent kvvfs has no inherent size limits - it will take whatever memory the browser will give it.