Automatic covering index for virtual tables
(1) By Marcin Wisnicki (marwis) on 2020-12-03 01:14:45 [link] [source]
I'm working on ESENT module for SQLite. So far I have implemented basic reading without index support.
I have noticed that joining two virtual tables that don't support indexing (xBestIndex
sets estimatedCost=100000 and returns OK) will perform full scan on both, resulting in extremely slow queries for medium sized tables.
However if the same were to happen to unindexed temporary tables SQLite will create automatic covering index and use that to greatly speedup query.
CREATE VIRTUAL TABLE namespace USING esentvtab("F:\FileHistory\mwisn\MWSURFACE4\Configuration\Catalog1.edb", namespace); CREATE VIRTUAL TABLE string USING esentvtab("F:\FileHistory\mwisn\MWSURFACE4\Configuration\Catalog1.edb", string); CREATE TEMPORARY TABLE temp.namespace1 AS SELECT * FROM namespace; CREATE TEMPORARY TABLE temp.string1 AS SELECT * FROM string; EXPLAIN QUERY PLAN SELECT namespace.id, string.string FROM namespace LEFT JOIN string ON parentId = string.id LIMIT 10; EXPLAIN QUERY PLAN SELECT namespace1.id, string1.string FROM namespace1 LEFT JOIN string1 ON parentId = string1.id LIMIT 10;
Resulting query plans are as follows:
QUERY PLAN |--SCAN TABLE namespace VIRTUAL TABLE INDEX 0: `--SCAN TABLE string VIRTUAL TABLE INDEX 0: QUERY PLAN |--SCAN TABLE namespace1 `--SEARCH TABLE string1 USING AUTOMATIC COVERING INDEX (id=?)
Is there an easy way to add automatic index to virtual table?
I plan on implementing ESENT indexing later but this would be quick win.
(2) By Gunter Hick (gunter_hick) on 2020-12-03 07:33:52 in reply to 1 [link] [source]
No.
Your implementation of the xBestIndex function precludes such a feature, even if SQLite would support it. As you neglect to provide a schema, I will assume the following definitions returned to SQLite:
CREATE TABLE namespace (id INTEGER, parentid INTEGER); CREATE TABLE string (id INTEGER, string TEXT);
Firstly, your xBestIndex function should return the estimated number of disk accesses (the number of rows in the table can serve as a stand in value) as estimatedCost. Also try to fill in the other return fields, this will make a large difference in query performance.
So lets now assume that SQLite would attempt to create an automatic index on the string table by executing the equivalent of
CREATE TEMP TABLE string_id AS SELECT rowid,id FROM string; CREATE TEMP INDEX string_id_index on string_id(id);
Your query thus becomes
SELECT n.id, s.string FROM namespace n JOIN string_id i ON n.parentId = i.id JOIN string s on s.rowid = i.rowid;
Secondly, your xBestIndex function needs to support access via rowid for this to work; i.e. setting the estimatedCost and estimatedRows to 1, setting the UNIQUE flag and setting the argvIndex and omit fields for the rowid constraint. This will allow SQLite to call your xFilter function with the desired rowid to retrieve the string.
Since you already need to implement direct access via rowid to work, it would be cheaper and faster to expose the rowid as the id field.
(3) By Dan Kennedy (dan) on 2020-12-03 11:31:54 in reply to 2 [source]
It wouldn't be quite that tricky in SQLite, as automatic indexes are always covering indexes. But automatic indexes on virtual tables are, as you say, currently unsupported.
If you can fool SQLite into materializing a sub-select that queries the virtual table, it materializes into an automatic index. e.g.
CREATE VIRTUAL TABLE t1 USING fts5(a, b); CREATE VIRTUAL TABLE t2 USING fts5(c, d); EXPLAIN QUERY PLAN SELECT * FROM t1, (SELECT * FROM t2('abc') LIMIT -1) WHERE c=a;
such a trick could easily break next release though. And it really requires you to plan the query yourself, instead of having the planner do it.