Window Functions and Views: Missing Optimization?
(1.1) By miasuji on 2021-02-20 20:28:04 edited from 1.0 [link] [source]
Let's say I have a table like this:
CREATE TABLE ListItem ( id INTEGER PRIMARY KEY , list_id INTEGER REFERENCES List(id) ); CREATE INDEX fk ON ListItem(list_id);
Let's get an item's position in its list:
SELECT ListItem.* , row_number() OVER () AS position FROM ListItem WHERE list_id = 99;
This is fast, and its speed is dependent on the number of items in the single list (not the total number of ListItems) since the WHERE clause filters the input rows before they get to the window function.
Now let's generalize this into a view:
CREATE VIEW ListItem_Annotated AS SELECT ListItem.* , row_number() OVER (PARTITION BY list_id) AS position FROM ListItem;
SELECT * FROM ListItem_Annotated WHERE list_id = 99;
Unfortunately, the speed of this query is dependent on the total number of ListItems across all Lists. It seems that the optimizer isn't using the WHERE clause to filter the input rows before they get to the window function, even though we know that we're partitioning on the same column as is in the WHERE clause.
EXPLAIN QUERY PLAN confirms a less optimal query as well. In the first
case, we do a SEARCH TABLE using the
fk index. However in the second
case, we do a SCAN TABLE instead, as well as an additional
CO-ROUTINE and SCAN SUBQUERY.
Is there some reason the optimizer can't turn these into the same query plan?
ps: Here's some SQL to create, populate, and test the above example.
pps: Thanks for SQLite! It's really phenomenal. I'm consistently amazed by what it can do, and inspired by the project's values. Cheers!
(2) By Dan Kennedy (dan) on 2021-02-23 15:54:58 in reply to 1.1 [link] [source]
Seems like a good optimization. Thanks!
(3) By miasuji on 2021-02-28 20:37:38 in reply to 2 [source]
Beautiful! Thanks so much for the quick follow-up.