SQLite Forum

varint documentation is misleading

(1) By anonymous on 2020-06-10 14:06:34

https://sqlite.org/src4/doc/trunk/www/varint.wiki

> # Decode
> - If A0 is between 0 and 240 inclusive, then the result is the value of A0.
> - If A0 is between 241 and 248 inclusive, then the result is 240+256*(A0-241)+A1.

> - ...
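As a rough sketch, the two decode rules quoted above could be written in C as follows. The function name `varint4_decode_prefix` is hypothetical and not part of SQLite, and the cases hidden behind the `...` (A0 above 248) are deliberately left unhandled rather than guessed at.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of SQLite: decodes only the two cases
** quoted above. Returns the number of bytes consumed, or 0 if the input
** is too short or A0 falls outside the ranges covered by the quoted rules. */
static int varint4_decode_prefix(const unsigned char *a, size_t n, uint64_t *v){
  if( n<1 ) return 0;
  if( a[0]<=240 ){
    /* A0 in 0..240: the value is A0 itself. */
    *v = a[0];
    return 1;
  }
  if( a[0]<=248 ){
    /* A0 in 241..248: 240+256*(A0-241)+A1. */
    if( n<2 ) return 0;
    *v = 240 + 256*(uint64_t)(a[0]-241) + a[1];
    return 2;
  }
  return 0;  /* Cases elided in the quote are not handled here. */
}
```

Under these two rules alone, the bytes `241, 5` decode to 245, and the largest two-byte value is 240+256*7+255 = 2287.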

However:

https://www3.sqlite.org/src/file?name=src/util.c

```C
/*
** Read a 64-bit variable-length integer from memory starting at p[0].
** Return the number of bytes read.  The value is stored in *v.
*/
u8 sqlite3GetVarint(const unsigned char *p, u64 *v){
  u32 a,b,s;

  if( ((signed char*)p)[0]>=0 ){
    *v = *p;
    return 1;
  }
  if( ((signed char*)p)[1]>=0 ){
    *v = ((u32)(p[0]&0x7f)<<7) | p[1];
    return 2;
  }

  ...
```
\- which doesn't look like the description above **at all**.

https://www.sqlite.org/fileformat2.html#varint seems to be correct-ish:

> The varint consists of either zero or more bytes which have the high-order bit set followed by a single byte with the high-order bit clear, or nine bytes, whichever is shorter. The lower seven bits of each of the first eight bytes and all 8 bits of the ninth byte are used to reconstruct the 64-bit twos-complement integer.

\- but the wording could be improved/simplified, e.g.:

> The varint consists of 0-8 bytes which have the high-order bit set followed by a single byte with the high-order bit clear. The lower seven bits of each byte are used to reconstruct the 64-bit twos-complement integer.
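For comparison, the fileformat2 description quoted above can be sketched in C roughly as follows. This is a plain loop written from the documentation text, not SQLite's own `sqlite3GetVarint()`; the name `varint3_decode` is hypothetical, and the caller is assumed to guarantee that enough bytes are readable.

```c
#include <stdint.h>

/* Hypothetical helper written from the fileformat2 wording. Assumes p
** points at a complete varint (a byte with the high bit clear within the
** first eight bytes, or nine readable bytes). Returns bytes consumed. */
static int varint3_decode(const unsigned char *p, uint64_t *v){
  uint64_t r = 0;
  int i;
  for(i=0; i<8; i++){
    /* Each of the first eight bytes contributes its lower seven bits. */
    r = (r<<7) | (p[i] & 0x7f);
    if( (p[i] & 0x80)==0 ){
      *v = r;
      return i+1;
    }
  }
  /* Eight bytes with the high bit set: the ninth contributes all 8 bits. */
  *v = (r<<8) | p[8];
  return 9;
}
```

The ninth-byte case is the one detail in the original documentation wording that a "seven bits of each byte" summary does not capture: 8*7+8 gives the full 64 bits.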

(2) By Richard Hipp (drh) on 2020-06-10 14:35:42 in reply to 1

You seem to be comparing the definition of Varint for SQLite4 against
the implementation of Varint for SQLite3.

The Varint definition for SQLite3 is here:  <https://www.sqlite.org/fileformat2.html#varint>

(3) By anonymous on 2020-06-10 15:17:41 in reply to 2

I see, thanks - https://sqlite.org/src4/doc/trunk/www/varint.wiki is the first Google result for "sqlite varint", and I didn't notice the "4" there.

(4) By anonymous on 2023-06-18 16:24:24 in reply to 2

Sorry to revive this old thread.
I'm doing some research on the different types of `varuints` at the moment and stumbled upon SQLite3's and SQLite4's approaches.

SQLite3's approach is pretty standard and self-evident, but, as stated in the wiki, SQLite4's approach has three very interesting properties. This, of course, comes at the cost of additional complexity and a potential performance penalty.

I would imagine both the advantages and drawbacks were discussed and benchmarked at some point, but I can't seem to find any mailing list discussions regarding this. Can someone point me in the right direction here?

(5) By Stephan Beal (stephan) on 2023-06-18 16:48:06 in reply to 4

> SQLite3's approach is pretty standard and self-evident, but, as stated in the wiki, SQLite4's approach has three very interesting properties. 

Note that sqlite4 was a short-lived experiment about a decade ago, with no development since then. Any documentation specific to sqlite4 is essentially of purely historical interest. The benchmarks comparing sqlite3/4 were focused on finding out whether v4's different storage model was significantly faster than v3's. It was expected/hoped to be but it turned out that (IIRC) sqlite's workloads simply weren't a great fit for that particular storage model. AFAIK there were no benchmarks specifically comparing the data type encodings.

(6) By anonymous on 2023-06-18 17:01:32 in reply to 5

Wow, I was not expecting to get an answer so quickly! Thanks a lot!

> AFAIK there were no benchmarks specifically comparing the data type encodings.

I see, I was beginning to think that this was the case. I'm designing a binary encoding for a hobby project and I'm more inclined to use v3's encoding as well.

Still, encoding up to 240 instead of 127 in a single byte is pretty hard to pass up.
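To put rough numbers on that trade-off, here is a small sketch that computes the largest value reachable in one and two bytes under each scheme, using only the rules quoted earlier in this thread. The helper names `v3_max` and `v4_max` are hypothetical, and both are only meaningful for n equal to 1 or 2.

```c
#include <stdint.h>

/* Hypothetical helpers, valid for n = 1 or 2 only. */

/* v3: each of the first bytes carries 7 payload bits. */
static uint64_t v3_max(int n){
  return ((uint64_t)1 << (7*n)) - 1;
}

/* v4: one byte covers 0..240; two bytes use 240+256*(A0-241)+A1
** with A0 at most 248 and A1 at most 255. */
static uint64_t v4_max(int n){
  return n==1 ? 240 : 240 + 256*(uint64_t)(248-241) + 255;
}
```

By these rules, one byte holds up to 127 (v3) versus 240 (v4), while two bytes hold up to 16383 (v3) versus 2287 (v4), so which scheme is more compact depends on the distribution of values being stored.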

If I ever get around to benchmarking this, I'll post it somewhere.

(7) By Vadim Goncharov (nuclight) on 2023-08-31 21:34:39 in reply to 6

What is your goal (desired properties)? I have several varint formats in my hobby spec, including sub-byte ones...