DBSTAT docs example potentially misleading
The DBSTAT docs conclude with this cool example to measure contiguity of the database file.
CREATE TEMP TABLE s(rowid INTEGER PRIMARY KEY, pageno INT); INSERT INTO s(pageno) SELECT pageno FROM dbstat ORDER BY path; SELECT sum(s1.pageno+1==s2.pageno)*1.0/count(*) FROM s AS s1, s AS s2 WHERE s1.rowid+1=s2.rowid; DROP TABLE s;
I don't think this is a good measurement if the database has multiple large btrees, because
path is not prefixed by the btree name/id. So the
s table ends up with colliding paths interleaved from different btrees, which most all then count against the contiguity, even after a fresh vacuum. The measurement as written is therefore pessimistic (possibly, extremely so).
I have a big database that gets reported as only 60% contiguous by the above algorithm, but if I modify it to look at one btree at a time, actually each one is almost perfect!