Window Functions and Views: Missing Optimization?

(1.1) By miasuji on 2021-02-20 20:28:04 edited from 1.0 [link] [source]

Let's say I have a table like this:

CREATE TABLE ListItem (
    id        INTEGER PRIMARY KEY
  , list_id   INTEGER REFERENCES List(id)
);
CREATE INDEX fk ON ListItem(list_id);

Let's get an item's position in its list:

SELECT
    ListItem.*
  , row_number() OVER () AS position
FROM ListItem
WHERE list_id = 99;

This is fast, and its speed is dependent on the number of items in the single list (not the total number of ListItems) since the WHERE clause filters the input rows before they get to the window function.

Now let's generalize this into a view:

CREATE VIEW ListItem_Annotated AS
SELECT
    ListItem.*
  , row_number() OVER (PARTITION BY list_id) AS position
FROM ListItem;

SELECT *
FROM ListItem_Annotated
WHERE list_id = 99;

Unfortunately, the speed of this query is dependent on the total number of ListItems across all Lists. It seems that the optimizer isn't using the WHERE clause to filter the input rows before they get to the window function, even though we know that we're partitioning on the same column as is in the WHERE clause.

EXPLAIN QUERY PLAN confirms a less optimal query as well. In the first case, we do a SEARCH TABLE using the fk index. However in the second case, we do a SCAN TABLE instead, as well as an additional CO-ROUTINE and SCAN SUBQUERY.

Is there some reason the optimizer can't turn these into the same query plan?

Thanks!

ps: Here's some SQL to create, populate, and test the above example.

pps: Thanks for SQLite! It's really phenomenal. I'm consistently amazed by what it can do, and inspired by the project's values. Cheers!

(2) By Dan Kennedy (dan) on 2021-02-23 15:54:58 in reply to 1.1 [link] [source]

Seems like a good optimization. Thanks!

https://sqlite.org/src/info/20689468100aed26

Dan.

(3) By miasuji on 2021-02-28 20:37:38 in reply to 2 [source]

Beautiful! Thanks so much for the quick follow-up.