SQLite Forum

autovacuum in sqlite VS autovacuum in PG
Login
nitpick about Keith's final paragraph.

Paraphrasing

"create table x(x integer not null check (x in (0,1))"

" ... about 1.8 trillion rows [is at about 2 terabytes]"

Implying sqlite can hold each row of x in slightly more than one byte of file space.

I think that ignores too much (roughly a factor of 10) of the overhead associated each row.

For each row of table x, SQLite knows x.rowid and x.x.

From https://sqlite.org/fileformat2.html

I think I see that the leaf pages need, per row
A 2-byte cell pointer, pointing at the location of the cell holding the row on the page.

The b-tree "cell" for the table x row has
- A varint giving the cell size. This will be 1 byte.
- A varint for the rowid. For a million rows, most of these will be three bytes. For 1.8e12 rows, I believe the majority of these would be six bytes.

The payload uses the record format. It seems to need a varint (1 byte) for the header size, and a varint for each column (apparently other than the rowid) which hold the type (and for some values, including 0,1, the value) of the column. For table x, I believe the payload is 2 bytes.

For a million rows, I expect 2 byte cell pointer, 1 byte cell size, 3-byte rowid, and 2 byte payload, or 8 bytes total. I see 8MB file size when I insert a million records into table x, so that seems consistent.

For 1.8e12 rows, I think the rowid grows to ~six bytes, so the cost per row is around 11 bytes.