Column ordering and space saving
(1.2) Originally by Andrea Aime (andrea.aime) with edits by Richard Hipp (drh) on 2020-10-02 16:00:02 from 1.1 [source]
I've been reading a few articles regarding PostgreSQL about re-ordering columns in order to save space, for example:
However, the SQLite type system is different, which may make the considerations above unapplicable.
So I'm wondering if there would be any space saving by re-ordering columns, is there any sort of padding/alignment in SQLite, or is the record packed with no spaces between columns?
Edit by drh to activate hyperlinks
(2) By Richard Hipp (drh) on 2020-10-02 16:07:34 in reply to 1.2 [link] [source]
SQLite does not pad or align columns within a row. Everything is tightly packed together using minimal space. Two consequences of this design:
SQLite has to work harder (use more CPU cycles) to access data within a row once it has that row in memory.
SQLite uses fewer bytes on disk, less memory, and spends less time moving content around because there are fewer bytes to move.
We believe that the benefit of #2 outweighs the disadvantage of #1, especially on modern hardware in which #1 is operating out of L1 cache whereas #2 is using main-memory transfers. But your mileage may vary depending on what you store in the database file.
(3) By David Raymond (dvdraymond) on 2020-10-02 16:14:14 in reply to 1.1 [link] [source]
You can see the file format here
For a record of 5 columns, you basically have 5 numbers that give the value types, and then the 5 values. No padding, spacing, etc. And nothing that explicitly says what value is what column. You get 5 values in order with no info about what is what. It just knows from the table definition in sqlite_master what the columns are and their names. For a number of values like NULL there isn't even any data in the value section. The type for the column specifys NULL, and then there is no payload. So it would be kind of hard to make it more compact.
(4) By Simon Slavin (slavin) on 2020-10-03 12:21:44 in reply to 1.2 [link] [source]
Also note that SQLite has almost no fixed-length datatypes. It does not support fixed string sizes like 'CHAR (20)'. The integer '1' takes less space for storage than the integer '4', which in turn takes less space for storage than the integer '4444444'.
SQLite is designed to be used inside things like GPS units and drill controllers, with tiny amounts of storage space. It has to be far more space-efficient than PostgreSQL. Which is not saying anything bad about PostgreSQL in comparison with other client/server systems.
(5) By Richard Damon (RichardDamon) on 2020-10-03 12:58:19 in reply to 4 [link] [source]
I think the only fixed length type is Float, all floats will be 8 bytes long, unless it gets stored as another type (but then it isn't stored as a float). If it happens to be an integral value, It might get compresses that way.
(6) By Andrea Aime (andrea.aime) on 2020-10-03 17:06:06 in reply to 1.2 [link] [source]
Thanks everyone for the detailed replies, much appreciated! :-D
(7) By Gunter Hick (gunter_hick) on 2020-10-05 05:40:17 in reply to 1.2 [link] [source]
From the documentation the following rules of thumb may be gleaned: x) keep heavily acessed fields (key and index fields) at the front of the record x) keep large, variable sized fields (clob and blob) at the end of the record x) sometimes it pays to split off clob/blob information from the main record The above does not save space, but saves tons of effort when decoding the compressed records, as acccessing field n requires decoding of fields 1..n-1 even if they are not used.