high performance mode for sqlite
(1) By Isokaze on 2024-01-16 07:38:17 [link] [source]
Hello,
I am new to sqlite. I was wondering if a fork exists of sqlite that focuses on performance? Going as far as breaking compatibility with the file format.
My understanding is that SQLite is already among the faster SQL databases, but that there would still be plenty of opportunities for optimization in the code if file-format compatibility were not a concern, if the ability to safely use database files from strangers were not either, and if performance were treated as a priority worth investing time in.
(2) By Gunter Hick (gunter_hick) on 2024-01-16 08:34:18 in reply to 1 [link] [source]
There are options to omit features you don't need at compile time, and possibilities to configure SQLite to sacrifice ACID properties for speed. The code itself is already highly optimized for speed; one of the optimizations is compiling all of the sources (aka the amalgamation) in one go to avoid creating external symbols for internal routines and to allow the compiler to apply as much optimization as possible. SQLite is a library, not a client/server database, so there are no network latencies to contend with.
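For illustration, a compile-time trim might look like the following sketch (these are real, documented options from sqlite.org/compile.html; normally they'd be -D flags on the compiler command line rather than a wrapper file):

```c
/* Hypothetical wrapper that builds a slimmed-down amalgamation.
 * All options below are documented SQLite compile-time options. */
#define SQLITE_THREADSAFE 0            /* no mutexes: single-threaded use only */
#define SQLITE_DEFAULT_MEMSTATUS 0     /* skip memory-usage bookkeeping */
#define SQLITE_OMIT_DEPRECATED 1
#define SQLITE_OMIT_SHARED_CACHE 1
#define SQLITE_OMIT_PROGRESS_CALLBACK 1
#include "sqlite3.c"                   /* the amalgamation, compiled in one go */
```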
(3) By Isokaze on 2024-01-16 10:37:38 in reply to 2 [link] [source]
I mean something different. For example:
Replace the varint with another construct that is SIMD-friendly.
Arrange table leaves to separate the rowids from the records for cache-friendly search, and arrange the rowids in a way that is SIMD-friendly, which would allow an intra-page btree search to be replaced by a parallel compare (see the sketch below). This assumes that the rowids in one table leaf will have almost the same bit length, so a varint would not be needed to store a rowid.
If a value in an index key is the same for every index entry in a page, it is still repeated for each index row. That would not be necessary if there were an alternative scheme for index pages that allowed defining spans of rows that share a common property.
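(A concrete sketch of the parallel-compare idea, assuming a hypothetical leaf layout in which all rowids share a 56-bit prefix held in the page header and only the low byte is stored per cell, in one contiguous array:)

```c
#include <emmintrin.h>  /* SSE2; assumes GCC/Clang for __builtin_ctz */
#include <stdint.h>

/* Return the cell index whose low rowid byte equals `key`, or -1.
 * n must be a multiple of 16 (pad the array with a sentinel). */
static int find_rowid_lo8(const uint8_t *lo, int n, uint8_t key)
{
    __m128i needle = _mm_set1_epi8((char)key);
    for (int i = 0; i < n; i += 16) {
        __m128i chunk = _mm_loadu_si128((const __m128i *)(lo + i));
        int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, needle));
        if (mask) return i + __builtin_ctz(mask);  /* first match */
    }
    return -1;  /* rowid not on this page */
}
```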
(4) By Gunter Hick (gunter_hick) on 2024-01-16 11:29:07 in reply to 3 [link] [source]
I guess that would not be SQLite any longer if you replace most of the basics of the internal engine.
(5) By Isokaze on 2024-01-16 12:16:35 in reply to 4 [link] [source]
Not replacing anything, just adding more complexity on top. The user could choose between different file formats.
Not most of the basics, just the page format. Which is one small part, but of course an important part from a performance standpoint.
(6) By Gunter Hick (gunter_hick) on 2024-01-16 12:34:27 in reply to 5 [link] [source]
The file format plays a major role in determining processing speed. It seems to me that you are proposing to trade storage efficiency (varint, variable-length records, ...) for processing speed. Using more space to store records means less fan-out at the index level, which means more levels to traverse and more pages to read from storage. Most queries are already I/O-bound; making them do more I/O for the same information content is, IMHO, likely to make things slower, not faster. Reading 10% more blocks will hurt performance much more than saving 50% of the processing time between waits for I/O will help.
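(A toy calculation of that fan-out effect - illustrative numbers only, nothing to do with SQLite's actual cell sizes:)

```c
#include <math.h>
#include <stdio.h>

/* Rough b-tree depth for nrows entries, given page size and the
 * average bytes consumed per entry (key + payload + overhead). */
static int btree_levels(double nrows, double page, double rec)
{
    double fanout = page / rec;              /* entries per page */
    return (int)ceil(log(nrows) / log(fanout));
}

int main(void)
{
    /* 1e9 rows, 4 KiB pages: 16-byte entries need 4 levels ... */
    printf("%d\n", btree_levels(1e9, 4096, 16));
    /* ... but entries 50% fatter already need 5. */
    printf("%d\n", btree_levels(1e9, 4096, 24));
    return 0;
}
```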
(7.1) By ralf (ralfbertling) on 2024-01-16 12:54:50 edited from 7.0 in reply to 6 [link] [source]
So the answer to the initial question is: "no".
On a theoretical note, the file format could be better optimized for databases with large caches or in-memory-databases.
The file format does also allow for some flags to do such extensions.
If you have the means, you could create upward-compatible extensions and prove they are worth it.
Please note that SQLite is open source, but not open contribution. So IF your proposal is significantly better, the team might take up the idea and re-implement it.
For your own use, you are of course free to do whatever you want with your extension.
Regards, ralf (edited for typo)
(8) By Isokaze on 2024-01-16 12:56:42 in reply to 6 [link] [source]
I would say that the page format can be made more compact and more efficient at the same time. It is, after all, quite simple.
Look at a typical table leaf: the rowids are all sequential. There is no need to compress a rowid as a varint if you only need to store its lower 8 bits and can get the remaining 56 bits from the page header.
The same goes for a typical index page: index values are sorted, and sorted means there is opportunity to compress. Only store the least significant bits that differ per row, and store the most significant bits once in the page header (see the sketch below). This is even more true for non-unique indexes.
For a table interior page, there would no longer be a need for a heap if the rowid and the page id were fixed in length.
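(A sketch of the header-prefix idea for index pages - a hypothetical format, not SQLite's. Because the keys on a page are sorted, every key shares whatever prefix the first and last keys have in common, so that prefix can live once in the page header and only the suffixes need to be stored per cell:)

```c
#include <stddef.h>

/* Length of the common prefix of two keys.  By sortedness, the
 * common prefix of the first and last key on a page is shared by
 * every key in between, so it can be stored once in the header. */
static size_t common_prefix(const unsigned char *a, size_t alen,
                            const unsigned char *b, size_t blen)
{
    size_t n = alen < blen ? alen : blen;
    size_t i = 0;
    while (i < n && a[i] == b[i]) i++;
    return i;
}
```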
(9) By Gunter Hick (gunter_hick) on 2024-01-16 13:44:51 in reply to 8 [link] [source]
"The rowids are all sequential" is only true for an initial load that uses automatically created rowids. There can be arbitrary gaps between the rowids of adjacent records. The caller may provide values for the rowid if he so chooses. Or sufficiently many records "in between" are deleted and VACCUUM is required to keep the rowids, e.g. when an alias for the rowid is declared. Compressing index values sound like a good idea until you realise that you may have to recompute the number of bits to store (and thus need to rewrite the complete page) on an insert or while rebalancing the tree. Plus the number of "common bits" is likely to be smaller (or even zero) the closer to the root page you get.
(10) By Isokaze on 2024-01-16 15:26:09 in reply to 9 [link] [source]
On a typical table leaf the rowids are sequential.
If they are not, you can store the page in the original format.
If a row is inserted into an index page, the index value will be close to all the existing index values. You can store a few extra bits per row redundantly to reduce the chance of having to rebuild the page (see the sketch below). In other words, if there is enough redundancy that the common part of the index key is shared across a few adjacent pages, then an index key that falls outside the range cannot end up being stored in this page.
You can also limit yourself to tables or individual pages that are never or seldom written to, and use the old format otherwise.
Pages at the top of the tree can use the old format. There aren't very many top level pages.
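(A sketch of those "extra bits" - hypothetical, of course: when writing a page, size the per-row suffix a little wider than the current keys strictly need, so that nearby inserts fit without a page rewrite:)

```c
#include <stdint.h>

/* Smallest suffix width (in bits) such that all keys in [lo, hi]
 * share the remaining high-order prefix; `slack` widens the suffix
 * so that keys slightly outside [lo, hi] still fit. */
static int suffix_bits(uint64_t lo, uint64_t hi, int slack)
{
    uint64_t diff = lo ^ hi;
    int bits = 0;
    while (bits < 64 && (diff >> bits) != 0) bits++;  /* highest differing bit + 1 */
    bits += slack;
    return bits > 64 ? 64 : bits;
}
```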
(12) By Bill Wade (billwade) on 2024-01-17 13:45:05 in reply to 10 [link] [source]
I think you are running into the Lite in SQLite. More choices mean more complexity (more testing, ...).
You can come up with reasonable file formats that store an array of four-byte floats in not much more than four bytes per row, with nearly random access to any row. In SQLite (assuming a billion rows), using the "natural" table (index int, value real), you're looking at something above 14 bytes per row, and slightly worse random-access performance.
It would be nice (for some use cases) if SQLite provided smart, clean, optimized support for arrays (without gaps in indices) and for four-byte floats. It would be nice (for some use cases) if SQLite provided column compression. Adding more and more complexity takes you farther and farther from Lite.
(15) By Isokaze on 2024-01-17 16:01:00 in reply to 12 [link] [source]
Adding more and more complexity takes you farther and farther from Lite.
You can say that about any feature. This feature in particular wouldn't result in API changes, beyond letting you select between file formats. I don't think "Lite" means keeping internal sophistication at a low level. It means you don't have to run a network server. It means SQLite does not strive to implement large portions of the SQL spec outside the core requirements. And so on.
(32) By Bill Wade (billwade) on 2024-01-23 14:21:36 in reply to 15 [link] [source]
True. You could write an implementation of the API that supported the current file format and additional file formats.
You are proposing column-compression, but only for the rowid column (and perhaps only using a compression which happens to be SIMD friendly).
I was suggesting a format with column-compression for every column.
Perhaps the current format has a file header that begins with "sqlite 3", your format begins with sqlite 4, mine with sqlite 5, and the next great idea (use an even better technique for b-tree internal pages) uses sqlite 6. Maybe sqlite 7 shards the database across file systems, for parallel reads and writes.
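(For what it's worth, the real sqlite3 magic is the 16-byte string "SQLite format 3\000"; dispatching on hypothetical successor headers could be as simple as:)

```c
#include <stdio.h>
#include <string.h>

enum fmt { FMT_UNKNOWN, FMT_V3, FMT_V4 };

/* Read the 16-byte magic and pick a format handler.  "SQLite
 * format 3" (with its trailing NUL) is the real magic; format 4
 * is the hypothetical successor discussed above. */
static enum fmt detect_format(FILE *f)
{
    char hdr[16];
    if (fread(hdr, 1, 16, f) != 16) return FMT_UNKNOWN;
    if (memcmp(hdr, "SQLite format 3", 16) == 0) return FMT_V3;
    if (memcmp(hdr, "SQLite format 4", 16) == 0) return FMT_V4;
    return FMT_UNKNOWN;
}
```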
If backwards compatibility is to be preserved, the development team (and the delivered code) needs to support all of those formats (and possibly different formats on a per-table basis).
I suspect that the development team (and those supporting them) have different priorities.
They might be wrong. I might be wrong. Certainly when things are mostly I/O bound, compression is good to consider. Show them that it makes sense. You have access to the existing source code.
(11) By Simon Slavin (slavin) on 2024-01-17 13:25:54 in reply to 3 [source]
None of those things will speed up how SQLite works. They are ways to make an alternative database engine which doesn't have anything to do with SQLite.
There are adjustments you can make to SQLite to speed things up. Most of them can be done using PRAGMAs:
https://www.sqlite.org/pragma.html
In each case you are sacrificing one feature (ACID, multiuser simultaneous access, filespace, speed) for another. Some of them give you something close to what you originally asked for: ways to speed up SQLite.
Your question makes me think of premature optimization (https://en.wikipedia.org/wiki/Program_optimization#When_to_optimize). So I ask … what are you doing that requires such fast SQLite operation, and have you already tried the standard things, like batching your INSERTs into transactions and creating indexes after putting the data in?
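(For reference, the batching advice in C-API form - a minimal sketch with a made-up schema and no error checking:)

```c
#include <sqlite3.h>

/* Insert n rows inside one transaction, then build the index. */
static void bulk_load(sqlite3 *db, int n)
{
    sqlite3_stmt *ins;
    sqlite3_exec(db, "CREATE TABLE t(id INTEGER PRIMARY KEY, v REAL);", 0, 0, 0);
    sqlite3_exec(db, "BEGIN;", 0, 0, 0);   /* one commit for all rows */
    sqlite3_prepare_v2(db, "INSERT INTO t(v) VALUES(?1);", -1, &ins, 0);
    for (int i = 0; i < n; i++) {
        sqlite3_bind_double(ins, 1, i * 0.5);
        sqlite3_step(ins);
        sqlite3_reset(ins);
    }
    sqlite3_finalize(ins);
    sqlite3_exec(db, "COMMIT;", 0, 0, 0);
    /* Index built after the load, once, instead of per-insert. */
    sqlite3_exec(db, "CREATE INDEX t_v ON t(v);", 0, 0, 0);
}
```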
(14) By Isokaze on 2024-01-17 15:48:30 in reply to 11 [link] [source]
You don't think sqlite can be sped up using different algorithms?
I listened to a talk by Richard Hipp the other day where he said varint was a mistake.
That would be just a regular optimization, independent of any trade-offs that can be applied to increase performance.
This boils down (I think) to keeping backwards compatibility in the file format, at the cost of not correcting "a mistake". I don't think compatibility is very important for many users (when their use case does not demand it, or when conversion to/from the portable format is easily accomplished when needed), especially if you do not have to give up the portable format.
(16) By Gunter Hick (gunter_hick) on 2024-01-18 14:06:59 in reply to 14 [link] [source]
Yeah, you did not get what he was saying about varints. He only said that the SQLite3 implementation of varints was a mistake, because you need to parse a varint before you can compare it. Just seconds later he proposes a different scheme (used in SQLite4, which is experimental and has no release date) where the first byte tells you the length and magnitude of the varint, so you can compare encodings with memcmp() without decoding the whole thing. Not the "ditch varints" you seem to have heard.
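(An encoder sketch of that SQLite4-style varint, as I read the SQLite4 design notes - the first byte encodes length and magnitude, so encodings compare correctly under memcmp():)

```c
#include <stdint.h>

/* Write x to z, returning the number of bytes used (1..9). */
static int varint4_put(uint8_t *z, uint64_t x)
{
    if (x <= 240) { z[0] = (uint8_t)x; return 1; }
    if (x <= 2287) {                     /* first byte 241..248 */
        x -= 240;
        z[0] = (uint8_t)(241 + x / 256);
        z[1] = (uint8_t)(x % 256);
        return 2;
    }
    if (x <= 67823) {                    /* first byte 249 */
        x -= 2288;
        z[0] = 249;
        z[1] = (uint8_t)(x / 256);
        z[2] = (uint8_t)(x % 256);
        return 3;
    }
    /* First byte 250..255: a 3..8 byte big-endian payload follows. */
    int n = 3;
    while (n < 8 && x >= ((uint64_t)1 << (8 * n))) n++;
    z[0] = (uint8_t)(247 + n);
    for (int i = 0; i < n; i++)
        z[1 + i] = (uint8_t)(x >> (8 * (n - 1 - i)));
    return n + 1;
}
```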
(27.1) By Isokaze on 2024-01-18 19:07:46 edited from 27.0 in reply to 16 [link] [source]
Deleted
(29) By Gunter Hick (gunter_hick) on 2024-01-19 06:53:08 in reply to 27.1 [link] [source]
Please don't delete your posts. It makes the discussion hard to follow.
(30.1) By Isokaze on 2024-01-19 09:58:48 edited from 30.0 in reply to 29 [link] [source]
The forum software should delete the comment entirely.
If I kept up a comment that was entirely wrong and visible for only 2 minutes, the discussion would not become easier to follow.
(28.2) By Isokaze on 2024-01-18 19:26:11 edited from 28.1 in reply to 16 [link] [source]
Another alternative is to use the 3 most significant bits of the first byte as a length code:
000_XXXXX: represents numbers 0-31
001_YYYYY_XXXXXXXX: 32-8191
This can only represent a maximum of 31 with a single byte (down from 240), but all multi-byte variants have a larger range. It remains less-than comparable, and it can easily be converted to a regular integer without branching (set the leading 3 bits to zero, cast to uint64_t, and shift right by 8*(7-length_code)).
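(A decoder sketch of this scheme, assuming big-endian storage and at least 8 readable bytes at p, as inside a page:)

```c
#include <stdint.h>

static uint64_t load_be64(const uint8_t *p)  /* big-endian load */
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) v = (v << 8) | p[i];
    return v;
}

/* Branch-free decode: clear the 3 tag bits, shift the value down. */
static uint64_t varint3_get(const uint8_t *p, int *len)
{
    int code = p[0] >> 5;                       /* 3-bit length code */
    *len = code + 1;                            /* encoded length   */
    uint64_t v = load_be64(p) & ~(7ULL << 61);  /* drop tag bits    */
    return v >> (8 * (7 - code));
}
```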
(31) By Isokaze on 2024-01-19 12:37:02 in reply to 28.2 [link] [source]
PS: I think a varint that can count to 240 is ideal as a length field in a MakeRecord structure, but it is not ideal as a rowid. There should be 2 kinds of varint used.
(33) By Vadim Goncharov (nuclight) on 2024-02-23 15:37:34 in reply to 28.2 [link] [source]
If you are interested in different VarInt formats for different purposes (though there I considered almost always definite-length varints - the memcmp()-able kind as in SQLite 4, as opposed to the current indefinite-length varint), you can see my drafts at https://github.com/nuclight/musctp/ - Appendix I in the main draft, and some more in ipnh.txt and cbar.txt (sometimes with sub-byte resolution).
As for rowids, I think they do not need the full 64 bits per se - 60 at most, given that SQLite's file size is limited to 2^32 pages. However, this conflicts with cases such as INTEGER PRIMARY KEY ...
(17) By Simon Slavin (slavin) on 2024-01-18 14:23:31 in reply to 14 [link] [source]
I think SQLite 3 is the SQLite file format. If you change the file format in a way that is not backward-compatible, you're no longer doing SQLite. Remember, there are literally billions of installations of SQLite and the majority don't run under Windows or macOS.
It's worth knowing that over the years DRH has reflected on a lot of things that are in SQLite only because of backward compatibility. They slightly slow down SQLite, or make programming slightly more awkward. There are many things he'd change if he were creating a new DBMS from scratch. But that's not what he is committed to, and it's not what we discuss here.
Did you take a look at the PRAGMAs I mentioned? Do they solve your problem?
(18) By Isokaze on 2024-01-18 15:24:46 in reply to 17 [link] [source]
I suppose this is an issue where different opinions exist.
But I find it hard to follow the logic of your argument.
20 years ago there existed a billion web pages that used whatever HTML standard was current at the time. Web standards evolved regardless. If I set up a web server, why should I care what billions of other devices use as a file format? My web server is not supposed to run queries against their databases.
In my opinion this universal compatibility is a gimmicky feature. The portability without having to dump the database is useful, but it runs contrary to standard practice. In other words, if 20-year-old web browsers were still the standard today, you'd have an inferior WWW.
BTW, I tried opening a recently made database with version 3.11 and I got:
Error: malformed database schema (town_status) - near "(": syntax error
(town_status is a view.) So the compatibility does not even exist. And is there a simple way to convert the database? I think I would have to dump it, no?
(23) By Spindrift (spindrift) on 2024-01-18 17:09:59 in reply to 18 [link] [source]
That's forward compatibility, not backwards compatibility.
There's nothing gimmicky about the entire purpose of the project being to create databases today that will work as intended and designed with the current version of the library in 2050.
Absolutely no one was trying to achieve that with HTML 20 years ago - we have no such guarantee with the web pages and technology of today. Flash-based websites are a great example of something that was absolutely not backwards compatible. Such a break would be considered an abject failure of SQLite.
And no, it is not expected that an ancient version of the library can open a modern, up-to-date database. That is irrelevant.
What is expected is that a current, up-to-date, latest-bug-fixes version of the library can open any previously created sqlite3 database, and therefore support both access to that data and consistent behaviour of SQL queries into the far (medium!) future - for future hardware, future operating systems, and future programming languages accessing the data and information that we are storing today and have stored in years past.
(24) By Isokaze on 2024-01-18 17:22:08 in reply to 23 [link] [source]
I don't want to reduce backward compatibility. I am talking about an alternative page format. To be used side-by-side with the existing format.
(34) By Vadim Goncharov (nuclight) on 2024-02-23 15:55:31 in reply to 18 [link] [source]
The Web is a notoriously bad example. In fact, it's an example of how NOT to design technologies (and yes, it would be a better world if the Web were torn down and rebuilt from scratch; it's the HTML/HTTP-reliant part of Fossil that makes me sad...).
And one of its biggest problems is indeed the lack of strategic planning, which translates into compatibility problems. Millions of sites, browser extensions, etc. repeatedly need rewriting "to be modern", or even to work at all. And with time it gets worse and worse; e.g. we lost XUL extensions in 2017 (Tree Style Tab became unusable for years, for example)...
So thank all the gods that the developers of things at the foundation of the industry, e.g. C and SQLite, have cared about compatibility for decades.
(19) By Isokaze on 2024-01-18 15:42:01 in reply to 17 [link] [source]
Or to use another analogy: if a man sells you a car today that is 10% less fuel-efficient, but whose diesel injection unit you can take out and put into a 20-year-old model, would you buy the car?
I said above that the feature is useful. I want to add that long-term file format stability is also useful. But if a different long-term file format had been introduced 5 years ago, it would be pretty universal by now.
(20) By Donald Griggs (dfgriggs) on 2024-01-18 15:55:47 in reply to 19 [link] [source]
As the saying goes, "Build a better mousetrap, and the world will beat a path to your door."
Contained in this thread is some excellent advice on improving performance if you're encountering problems. There was also some reasoning as to why some types of changes might end up being disappointing.
But conceivably they are wrong, and sqlite is indeed hampered by its current file format and the changes you envision would be a boon to many users. If you are convinced of that you should feel free to create an implementation of your proposal and benchmark the results against the current sqlite release. The source is not only open and documented, it is unencumbered by a restrictive license. A number of products incorporate the sqlite source in them.
(21) By Isokaze on 2024-01-18 16:21:41 in reply to 20 [link] [source]
Like they say: "There is more than one way to skin a cat".
Since the forum not only listens on HTTP port 80 but is also very nice to use, I thought I'd come here and chat about it.
(22) By Spindrift (spindrift) on 2024-01-18 17:02:50 in reply to 19 [link] [source]
But that is far from the point here, for this project.
The file format is the file format. The underlying principle is of backwards compatible, long term support.
There are many ways to skin a cat, as you say elsewhere. But the strategy here is already clear, and that clarity underlies all of the decisions that you are questioning on this thread.
Forks undertaking what you propose are entirely possible, would no doubt have value and would be of great interest. But you are unlikely to foment such a directional shift in this base project.
(25) By Isokaze on 2024-01-18 17:32:35 in reply to 22 [link] [source]
An alternative page format does not break backward compatibility.
It would be another feature that would have to be supported till 2050.
(26) By Spindrift (spindrift) on 2024-01-18 18:30:30 in reply to 25 [link] [source]
Understood.
(35) By Vadim Goncharov (nuclight) on 2024-02-23 15:57:25 in reply to 25 [link] [source]
Then you'd better suggest it at SQLite 4 threads.
(36) By ralf (ralfbertling) on 2024-02-26 08:11:56 in reply to 35 [link] [source]
I guess that would have to be SQLite 5, as SQLite 4 turned out to be a dead-end branch.
But seriously: when the support until 2050 was originally announced, that was a far more distant future than it is now.
Some quirks are in the design, as noted on this very page. Is there any roadmap that reaches beyond 2050?
(I am aware that we are all getting older, and maybe drh wants to retire at some point - fair enough.)
Cheers, ralf
(13) By Donal Fellows (dkfellows) on 2024-01-17 15:00:00 in reply to 1 [link] [source]
I was wondering if a fork exists of sqlite that focuses on performance?
The primary method of speeding SQLite up for writing data is simple: disable sync() calls. This is because SQLite is strongly I/O-bound in all important use cases, and the key cost there is transactions (especially commits). Disabling sync calls (by pragma) gets you a write speed that's pretty much the same as any direct write code you might write yourself. (There is, of course, a downside: you're relying on the OS not to crash before data hits its final location on disk.)
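(Concretely, via the C API - these are real pragmas, though note that journal_mode=MEMORY is my addition here, and both trade crash safety for speed:)

```c
#include <sqlite3.h>

static void go_fast(sqlite3 *db)
{
    /* Stop syncing; an OS crash or power loss can now corrupt the
     * database, not merely lose the last commit. */
    sqlite3_exec(db, "PRAGMA synchronous=OFF;", 0, 0, 0);
    /* Optionally keep the rollback journal in RAM as well. */
    sqlite3_exec(db, "PRAGMA journal_mode=MEMORY;", 0, 0, 0);
}
```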
For reading, by far the most important thing (at least in ordinary use cases) tends to be getting your indices right. Virtually everything else is beside the point by comparison; indices let you have efficient algorithms, especially in terms of the number of disk reads required to serve up a result set for a query. There might be cases where an index could be used and isn't, and those are candidate areas for future improvements to SQLite; any suggestions on that will be welcome.
Both of the above cases are ones where the SQLite library is doing the right thing already. It's the task of the user of the library (typically the application programmer) to do the right thing with the tools. Also, if you have something where you need to add a bit of intense computation, you can add a custom function, or maybe do that work outside the context of the DB.
If there are other cases that you can think of, please list them. (Using a more expansive data format isn't necessarily a win, FWIW, as it potentially forces more I/O to be done and I/O is absolutely a real cost.)