DBSTAT docs example potentially misleading
(1) By mlin (dnamlin) on 2021-06-02 13:05:24 [link] [source]
The DBSTAT docs conclude with this cool example to measure contiguity of the database file.
CREATE TEMP TABLE s(rowid INTEGER PRIMARY KEY, pageno INT); INSERT INTO s(pageno) SELECT pageno FROM dbstat ORDER BY path; SELECT sum(s1.pageno+1==s2.pageno)*1.0/count(*) FROM s AS s1, s AS s2 WHERE s1.rowid+1=s2.rowid; DROP TABLE s;
I don't think this is a good measurement if the database has multiple large btrees, because
path is not prefixed by the btree name/id. So the
s table ends up with colliding paths interleaved from different btrees, which most all then count against the contiguity, even after a fresh vacuum. The measurement as written is therefore pessimistic (possibly, extremely so).
I have a big database that gets reported as only 60% contiguous by the above algorithm, but if I modify it to look at one btree at a time, actually each one is almost perfect!
(2) By mlin (dnamlin) on 2021-06-03 00:35:44 in reply to 1 [source]
tl;dr the example should say
ORDER BY name, path instead of just
ORDER BY path