SQLite Forum

Native support for pluggable backends in SQLite
phpstatic on 2020-11-18 15:33:26 wrote:

> I check your benchmark report from this link: https://lumosql.org/benchmarking/

Thanks, it's really helpful to have people look at what we've done so far.

The concept is that you can run the [benchmarking tool](https://lumosql.org/src/lumosql/doc/tip/doc/lumo-test-build.md) for yourself. That is the tool that produced the example results you saw; they were run on old hardware, so you will get much better numbers.

The reason for this design is that benchmarking typically involves either one person running many tests and publishing their results, or many people posting their own results to a forum. Results are very hard to replicate, and almost impossible to compare. Soon you will also be able to use this tool to upload your results database, creating **a public database of reproducible measurements for comparison**.

In this case, while you have much better hardware and get much bigger numbers, the comparisons should be equally valid. 

Please tell me if I have not been clear in this explanation so far :-)

> The number seems not good(100TPS for sqlite, 300 TPS for sqlite+LMDB).

I recommend you look at the [benchmark filter tool](https://lumosql.org/src/lumosql/doc/trunk/doc/lumo-benchmark-filter.md) which illustrates how you can:

1. Get your own numbers for 3.33.0 (or some other version you specify). I'm sure your numbers will be much higher than these example numbers.
2. Compare ***on your exact machine*** 3.33.0 vs 3.31.0, or 3.31.0 vs 3.8.1+LMDB, etc.

After that, when your benchmarking database is uploaded and combined with the example benchmarking database you saw at lumosql.org, we can answer some more questions:

3. Does SQLite version X vs SQLite version Y behave proportionately the same on your modern hardware as on my old hardware? Perhaps it doesn't, because of memory pressures or disk I/O bottlenecks. That's very useful information.
4. Same question as (3), except for different backends. We do not necessarily expect uniform linear improvements when moving from old hardware to new hardware.
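As an illustration of the kind of comparison this enables, here is a toy Python sketch against an invented results table (the table name, columns, and numbers below are made up for the example; the real LumoSQL results database has its own schema). If the version-to-version TPS ratio is similar on both machines, the two versions behave proportionately the same, which is exactly question (3):

```python
import sqlite3

# Build a toy results table; the schema and numbers are invented
# purely for illustration, not taken from LumoSQL.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE runs(machine TEXT, version TEXT, tps REAL)")
con.executemany("INSERT INTO runs VALUES(?,?,?)", [
    ("old-laptop", "3.31.0", 100.0),
    ("old-laptop", "3.33.0", 110.0),
    ("new-server", "3.31.0", 900.0),
    ("new-server", "3.33.0", 980.0),
])

# Per-machine ratio of the newer version's TPS to the older one's:
# similar ratios mean the versions scale proportionately across machines.
rows = con.execute("""
    SELECT a.machine, a.tps / b.tps AS ratio
    FROM runs a JOIN runs b ON a.machine = b.machine
    WHERE a.version = '3.33.0' AND b.version = '3.31.0'
    ORDER BY a.machine
""").fetchall()
for machine, ratio in rows:
    print(machine, round(ratio, 2))
```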

> Is the test enable MMAP + WAL ?

That is an excellent point, because it is about the dimensions of benchmarking. Now that we have a working benchmarking tool that saves information to an SQLite database, we are improving the data it collects. The dimensions we have so far include:

* SQLite version
* Build options
* Backend, and backend version
* Major environment characteristics, e.g. Linux/Windows and their versions
* Environment details - whatever we can get in a portable way: hardware details, detailed OS version numbers, disk media info.
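To show the kind of environment details that can be captured portably, here is a small Python sketch using the standard `platform` module (this illustrates the idea only; it is not the actual LumoSQL collection code):

```python
import json
import platform

def capture_environment():
    """Collect portable environment details of the kind a benchmark
    run might record alongside its results, with no OS-specific code."""
    return {
        "os": platform.system(),          # e.g. "Linux", "Windows"
        "os_release": platform.release(), # kernel / OS release string
        "machine": platform.machine(),    # e.g. "x86_64", "aarch64"
        "python": platform.python_version(),
    }

print(json.dumps(capture_environment(), indent=2))
```

The same dictionary could be stored as extra columns (or a JSON blob) in the results database, so every measurement carries its environment with it.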

While we are making the schema as future-proof as we can, we are also trying to make it comprehensive now. We are very aware that we haven't done any benchmarking on Windows, and have never even run LumoSQL on Android... but these are both important deployment targets. So the benchmarking has to try very hard to run on these systems and give results that can be meaningfully compared.

Here is what is still to come in benchmarking, very soon:

* Capturing system hardware
* Results upload system
* Verification of results, by testing that benchmark results rows are internally consistent
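As a sketch of what "internally consistent" might mean in practice, here is a minimal check that a row's reported total matches the sum of its per-test times (the field names and tolerance are hypothetical, not the real LumoSQL schema or verification rules):

```python
def row_is_consistent(row, tolerance=0.05):
    """Return True if a benchmark row's reported total time matches
    the sum of its per-test times within a relative tolerance.
    Field names here are invented for illustration."""
    total = row["total_time"]
    times = row["test_times"]
    # Reject obviously impossible values outright.
    if total <= 0 or any(t < 0 for t in times):
        return False
    return abs(total - sum(times)) <= tolerance * total

good = {"total_time": 10.0, "test_times": [4.0, 3.5, 2.4]}
bad = {"total_time": 10.0, "test_times": [1.0, 1.0]}
print(row_is_consistent(good), row_is_consistent(bad))  # True False
```

Checks like this make fabricated or corrupted uploads stand out before they pollute the shared results database.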

> This is not the case any more, use a VFS shim can handle the encryption and corruption detection. Use VFS no need call internal functions.

Unfortunately that is not quite the case. In the [LumoSQL Corruption Design](https://lumosql.org/src/lumodoc/doc/trunk/doc/lumo-corruption-detection-and-magic.md) document the heading **"Design of the SQLite Checksum VFS Loadable Extension"** lists positives and negatives. The negatives listed include:

* No information about the logical location of the error, e.g. which row(s) it affects. The application knows nothing about how rows map to pages; all it knows is that SQLITE_IOERR_DATA was returned during a read operation. That's a lot better than silent corruption, but not as helpful as it could be.
* Brittle implementation due to requirements of the file format. The "bytes of reserved space on each page" value at offset 20 in the SQLite database header must be exactly 8. What if we want a better or different checksum?
* No facility for isolation or recovery of data. If we know with certainty that only row number 1234 is corrupted, then the application can take some action that is a lot less drastic than suspecting the entire file. This comes down to the design goal: the checksum VFS is aimed at the "random storage bitflips" problem. This is a very real issue, but there are many other causes of corruption in a database, including application bugs, partial restore of a backup file, an SQLite crash, and more.
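To make the file-format brittleness concrete, here is a small Python sketch that creates an ordinary SQLite database and reads the byte at offset 20 of its header. A stock database has 0 bytes of reserved space per page, which is why the checksum VFS must change this value to exactly 8 before it has anywhere to store its per-page checksums:

```python
import os
import sqlite3
import tempfile

# Create a small, ordinary SQLite database.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
con = sqlite3.connect(path)
con.execute("CREATE TABLE t(x)")
con.commit()
con.close()

# The first 100 bytes of the file are the database header.
with open(path, "rb") as f:
    header = f.read(100)

magic = header[:16]    # the "SQLite format 3\000" header string
reserved = header[20]  # "bytes of reserved space on each page"
print(magic, reserved)
```

On a stock build `reserved` is 0; the checksum VFS needs it to be 8, and any future checksum needing a different size would conflict with that hard-coded requirement.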