SQLite Forum

binary .dump format
Login
I'm gonna risk showing my low knowledge and ask something.

For a binary dump of the data, why not just make a straight copy of all the leaf table b-tree pages? (I'm sticking my fingers in my ears and ignoring WITHOUT ROWID tables for now)

Go through each of the tables, and dump out the leaf pages in btree order (not original physical page order). If a leaf has overflow pages, maybe put them in order right after the leaf they overflow from.

We'll be storing sqlite_master, so a loader tool will have the whole schema. All the interior btree pages and all index pages (again, ignoring WITHOUT ROWID for now) are just there for quick search and wouldn't have had any data that needed storing. And when restoring a dump we're just gonna visit each record once and don't need quick searches.

Re-creating the tree part shouldn't be too bad if you have all the leaves, right? You could either decode them one at a time and do a bunch of inserts to the new file, or in a fancier move copy them all at once to the new file, then re-create the interior pages with something akin to create index. The only copied pages you'd have to make updates to would be for new root page numbers in sqlite_master, and overflow page pointers. Right? You'd probably just need a new header to list the start and end pages of each table in the dump.

Just storing the original leaf pages as-is would also let you re-use a bunch of already written code I'd bet.

Admittedly it wouldn't really work for comparing dumps of equivalent data, as how things are distributed between pages and on each page could be different. But as far as a "binary .dump format", why not?

Anyway, let me know how insane I'm talking. And apologies if it really is insane.