SQLite Forum

Virtual table ORDER BY and GT/LT (GE/LE) constraints expected behavior - full table scan when not needed?
Login

Virtual table ORDER BY and GT/LT (GE/LE) constraints expected behavior - full table scan when not needed?

(1) By Patrick DeVivo (patrickdevivo) on 2021-09-02 21:36:37 [link] [source]

Hi all!

I maintain a project that heavily uses the SQLite virtual table mechanism, https://github.com/askgitdev/askgit (via a golang driver).

I wanted to ask about the expected behavior of a query on a virtual table that has an ORDER BY clause (which is consumed by the index) AND a where constraint on the column in that ORDER BY. So for instance:

SELECT field FROM some_v_table WHERE field > 10 ORDER BY field DESC

Let's say some_v_table without constraints has 20 rows, where each field is a number from 1 to 20.

With the ORDER BY consumed in the BestIndex, the table should return 20, 19, 18 ...

My assumption would be that SQLite would know then to stop iteration when the virtual table reaches 10, since the BestIndex indicates that the results are sorted by this field.

Instead, the behavior I've observed is that SQLite does a "full scan" regardless, iterating down to 1, even though it ultimatelt discards those results less than 10 when returning results.

I'm wondering if this is expected behavior and I would need to implement a "stop" on the virtual table iteration to avoid the full scan, myself. If not, it may indicate something I'm doing incorrectly in my BestIndex/Filter implementation.

Please let me know if this question is not clear, I'm happy to provide some more examples/code pointers if necessary.

Thank you! Patrick

(2) By Gunter Hick (gunter_hick) on 2021-09-03 06:58:31 in reply to 1 [source]

There are two separate things going on here.

SQLite is asking about a constraint AND about an ordering.

I guess you are already correctly setting an index number and the orderByComsumed flag (the query plan should reflect this) so that your xFilter and xNext functions return rows in the correct order.

What you are observing is consistent with telling SQLite that you can NOT handle the constraint; SQLite codes a check and discards any returned rows that do not match. It will not stop until you tell it that the EOF has been reached (xEOF returns true).

You can speed things up by handling the constraint (field,>) yourself. You need to assign an argv index value to receive the value to compare the field against in the xFilter function and code to return EOF when the constraint is no longer met.

You should also change your estimated cost and estimated rows return values to reflect that you will be retrieving, on average, half as many rows for a greater/less than scan.

If you set the omit flag, SQLite will trust your VT table implementation and not code a constraint check itself.

(3) By Dan Kennedy (dan) on 2021-09-03 11:32:57 in reply to 1 [link] [source]

My assumption would be that SQLite would know then to stop iteration when the virtual table reaches 10, since the BestIndex indicates that the results are sorted by this field.

Instead, the behavior I've observed is that SQLite does a "full scan" regardless, iterating down to 1, even though it ultimatelt discards those results less than 10 when returning results.

I'm wondering if this is expected behavior and I would need to implement a "stop" on the virtual table iteration to avoid the full scan, myself. If not, it may indicate something I'm doing incorrectly in my BestIndex/Filter implementation.

SQLite isn't quite smart enough to do that. It will take all the rows you can give it and test each one for (field>10), returning those that match to the user.

You'll need to handle the constraint in the usual manner - by setting the argvIndex member so that the RHS value is passed to the xFilter callback and so on.

Dan.