Unexpected results from query - Order of operations LIMIT within subquery and ORDER BY in outer query

(1) By anonymous on 2022-11-04 16:17:23 [link] [source]

I ran across a scenario where a query is not returning the output I would expect, so I am wondering if this is by design or if something may be wrong with the query planner.

In this example:

select * from (select x, y from data a limit 100) b order by b.x asc;

The query plan shows as

QUERY PLAN
|--SCAN a
`--USE TEMP B-TREE FOR ORDER BY

And the result set appears to show that the order of operations were 1) SORT by x then 2) LIMIT the output. This is also apparent in the execution time.

The intention with the query is to first LIMIT the output to 100 and then SORT by x the limited result set so as to get different x values from the sample (which do exists when running the subquery alone).

I tried this same query in other DB engines and the result there is what I am expecting.

Can someone confirm if this is working as designed?

Thanks!

(2) By anonymous on 2022-11-04 16:30:45 in reply to 1 [link] [source]

By the way, forgot to add the version I am using. I was originally on 3.39.1 when I came across this, then upgraded to SQLite 3.39.4 2022-09-29 and still seeing the same behavior.

(3) By David Raymond (dvdraymond) on 2022-11-04 16:46:12 in reply to 1 [link] [source]

The short version is that it's not violating anything you told it to do. Your "limit 100" sub query is free to pick any 100 it wants since you're not ordering it. If you care at all about which 100 it picks you have to order it by... something. Whether that's another field, the rowid, random(), or anything.

But without an ORDER BY with the LIMIT it can pick whichever 100 it wants.

(4) By anonymous on 2022-11-04 16:59:03 in reply to 3 [link] [source]

Thanks David. I agree that the limit clause can pick whichever rows it wants (although I would assume it just reads the rows in order top to bottom, I may be wrong though), but the issue I see is that it is unnecessarily sorting the entire table first (in my case it's a 45M-row table, so it adds considerable overhead), then returning the first 100 rows from that result, instead of randomly picking 100 rows, then sorting that set.

(5) By David Raymond (dvdraymond) on 2022-11-04 17:11:48 in reply to 4 [link] [source]

Have you done an ANALYZE on the table before running the query?

(6) By anonymous on 2022-11-04 17:25:20 in reply to 5 [link] [source]

I had not, but just did now and same behavior.

(11) By David Raymond (dvdraymond) on 2022-11-04 19:03:11 in reply to 6 [link] [source]

Ok, so I have to admit, something weird does seem to be going on here.

I tried this on an old jumbo table I have with 47 million records. Doing this sort of select with the LIMIT 100 inside the subquery, and the ORDER BY outside takes 12 seconds. Even if I run it back to back so it's as cached as can be.

That seemed a bit long to me. So I did .stats on in the CLI, and it's giving me > Fullscan Steps: 47027655

I've got a 47 million record table, I'm limiting the query to 100, and it's doing a full scan of 47 million?

If I'm reading the EXPLAIN plan correctly, it is indeed doing the 100 sample before sorting. What feels like is happening is that rather than getting 100, stopping the scan, and then sorting them, it's going through the whole 47 million records, constantly keeping the 100 lowest records when sorted by the sort column. Which would make sense if it was "ORDER BY x LIMIT 100" outside of the sub-query. But because the LIMIT is inside the sub-query and separate from the ORDER BY, it's going through the entire table when it only needs to grab 100.

Could someone look at this EXPLAIN output and let me know if that is indeed what's happening?

My version is 3.39.2

sqlite> select * from (select addrprilo, crroute from zip4base limit 100) order by addrprilo;
QUERY PLAN
|--SCAN zip4base
`--USE TEMP B-TREE FOR ORDER BY

addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     21    0                    0   Start at 21
1     OpenEphemeral  2     4     0     k(1,NOCASE)    0   nColumn=4
2     Integer        100   1     0                    0   r[1]=100; LIMIT counter
3     OpenRead       1     2     0     9              0   root=2 iDb=0; zip4base
4     Rewind         1     15    0                    0
5       Column         1     8     2                    0   r[2]=zip4base.addrprilo
6       Sequence       2     3     0                    0   r[3]=cursor[2].ctr++
7       IfNotZero      1     11    0                    0   if r[1]!=0 then r[1]--, goto 11
8       Last           2     0     0                    0
9       IdxLE          2     14    2     1              0   key=r[2]
10      Delete         2     0     0                    0
11      Column         1     3     4                    0   r[4]=zip4base.crroute
12      MakeRecord     2     3     6                    0   r[6]=mkrec(r[2..4])
13      IdxInsert      2     6     2     3              0   key=r[6]
14    Next           1     5     0                    1
15    Sort           2     20    0                    0
16      Column         2     2     5                    0   r[5]=crroute
17      Column         2     0     4                    0   r[4]=addrprilo
18      ResultRow      4     2     0                    0   output=r[4..5]
19    Next           2     16    0                    0
20    Halt           0     0     0                    0
21    Transaction    0     0     56    0              1   usesStmtJournal=0
22    Goto           0     1     0                    0

100 rows of output here

Memory Used:                         69559440 (max 69579416) bytes
Number of Outstanding Allocations:   18951 (max 18964)
Number of Pcache Overflow Bytes:     69136264 (max 69153184) bytes
Largest Allocation:                  131072 bytes
Largest Pcache Allocation:           4360 bytes
Lookaside Slots Used:                61 (max 118)
Successful lookaside attempts:       1759
Lookaside failures due to size:      10
Lookaside failures due to OOM:       0
Pager Heap Usage:                    68883360 bytes
Page cache hits:                     1
Page cache misses:                   1179544
Page cache writes:                   0
Page cache spills:                   0
Schema Heap Usage:                   235856 bytes
Statement Heap/Lookaside Usage:      7104 bytes
Fullscan Steps:                      47027655
Sort Operations:                     1
Autoindex Inserts:                   0
Virtual Machine Steps:               282167649
Reprepare operations:                0
Number of times run:                 1
Memory used by prepared stmt:        7104
Run Time: real 12.189 user 8.328125 sys 3.796875

(12) By anonymous on 2022-11-04 19:54:37 in reply to 11 [link] [source]

Thank you for taking the time to test it out on your own dataset and adding more context, David.

(15.1) By Keith Medcalf (kmedcalf) on 2022-11-04 20:41:55 edited from 15.0 in reply to 11 [link] [source]

Yes, that is what is happening.

The reason is because your query select * from (select addrprilo, crroute from zip4base limit 100) order by addrprilo; is being flattened and being executed as if you had said: select addrprilo, crroute from zip4base order by addrprilo limit 100;

You need to phrase the query so that it is not flattenable.

Try:

  select * 
    from (
            select addrprillo,
                   crroute 
              from zip4base 
          order by rowid 
             limit 100
         ) 
order by addrprilo;

This selects the "lowest 100 rowid" and then does the sort.

(16) By Keith Medcalf (kmedcalf) on 2022-11-04 20:51:11 in reply to 15.1 [link] [source]

Note that either of the following will also work:

with data as materialized
     (
      select addrprilo,
             crroute
        from zip4base
       limit 100
     )
select *
  from data
order by 1
;

with data as
     (
        select addrprilo,
               crroute
          from zip4base
      order by rowid
         limit 100
     )
select *
  from data
order by 1
;

The lesson to learn is that if you want an order applied, you must specify an order-by clause. The first example above may not actually retrieve the first 100 records in order by rowid if some other index is used to traverse the table.

(7) By Chris Locke (chrisjlocke1) on 2022-11-04 17:58:57 in reply to 4 [link] [source]

I would assume it just reads the rows in order top to bottom

If I told you to go to my kitchen and get four mugs, would you get the four in order that I purchased them? In the order you saw them? If I put those mugs back and asked again, would you get the same ones?

The first rule of assumptions is don't assume anything. ;)

(9) By anonymous on 2022-11-04 18:29:49 in reply to 7 [link] [source]

don't assume anything

… which is why I said I may be wrong, and because that assumption is not relevant to the issue at hand.

(8) By Chris Locke (chrisjlocke1) on 2022-11-04 18:00:26 in reply to 4 [link] [source]

Are the rows it's picking sorted? (ie, the first 100 that would be returned had you sorted on the full table)

If you amend the sort does that affect the result?

(10) By anonymous on 2022-11-04 18:30:41 in reply to 8 [link] [source]

Are the rows it's picking sorted? (ie, the first 100 that would be returned had you sorted on the full table)

Yes.

(13) By anonymous on 2022-11-04 20:09:09 in reply to 1 [link] [source]

when you replace the select * with select b.* specific fields do you get different results?

(14) By anonymous on 2022-11-04 20:29:09 in reply to 13 [source]

No. Same behavior.

(17) By anonymous on 2022-11-05 00:13:26 in reply to 1 [link] [source]

Looking through the conditions that need to be met for the query optimizer to use flattening here, this simple query meets all conditions and therefore is being flattened, as pointed out by Keith.

In the case of my original query, I didn't care about ordering of the subquery's result set, so that's why I didn't specify order by rowid as suggested above, which does make the optimizer skip the flattening.

From the same docs:

When a subquery occurs in the FROM clause of a SELECT, the simplest behavior is to evaluate the subquery into a transient table, then run the outer SELECT against the transient table. Such a plan can be suboptimal since the transient table will not have any indexes and the outer query (which is likely a join) will be forced to do a full table scan on the transient table.

This simplest behavior is actually what was expected in this query and, at least in this instance, was the most optimal.

I'm not suggesting this behavior should change in Sqlite, especially if it is well documented as above. So the answer to my original question is then that it is working as designed. However, it might be worth considering if another condition should be added for flattening to occur to prevent queries like the above from being flattened, if deemed safe.

For instance, this specific condition:

13 The subquery and outer query do not both use LIMIT.

could potentially be complemented with:

The subquery uses LIMIT and outer query has other sources besides the transient table produced by the subquery.

Attempting to trick the optimizer was as simple as adding a WHERE clause, thus invalidating condition

19 If the subquery uses LIMIT then the outer query may not have a dummy WHERE clause.

sqlite> select b.* from (select match, name from data a limit 100) b WHERE 1 order by b.match asc;
QUERY PLAN
|--CO-ROUTINE b
|  `--SCAN a
|--SCAN b
`--USE TEMP B-TREE FOR ORDER BY

Thanks all for the input and discussion.

(20) By anonymous on 2022-11-05 18:06:13 in reply to 17 [link] [source]

to add to annon user OP above:

from the same doc on flattening, adding OFFSET 0 to the subquery is worth checking out, since it already employs the LIMIT clause.

It appears to me that all of these are not exactly stable in that there does not seem to be a canonical " do not flatten" method. Given that any of the special "do not flatten" rules could vanish, it leaves queries that could start performing poorly because an optimization rule changed(obsolete).

If the above is correct, then the issue could be addressed by choosing one or more rules which sql query authors could rely on. If such a rule(s) is/are known, stating so in the docs would be useful without boxing in the sqlite developers.

(21) By anonymous on 2022-11-05 21:43:59 in reply to 20 [link] [source]

Specifically, your original query modified as:

explain query plan select b.* from (select x, y from a limit 100,0 ) b order by b.x asc;

gives the same query plan as your WHERE clause 'do not flatten" modification, but it doesn't make you reach outside the sub-query to do it.

QUERY PLAN |--CO-ROUTINE b | --SCAN a |--SCAN b--USE TEMP B-TREE FOR ORDER BY

Where the WHERE clause could be advantageous is if your OFFSET isn't 0, yet the clarity of LIMIT 100,0 (or LIMIT 100 OFFSET 0) preserves your original query of not caring which about which or what order the 100 rows are returned.

(22) By anonymous on 2022-11-06 00:20:11 in reply to 21 [link] [source]

Indeed. This is a cleaner way of re-writing the original query to prevent flattening. Thanks for sharing.

And I also agree with your suggestion of having a way to specify a “do not flatten” option/clause in the query.

Alternatively, maybe just show that this is what’s happening in the query plan. That way, query authors know that the query was rewritten by the optimizer and maybe even learn the tricks that the optimizer is doing under the hood and implement it directly rather than relying on it being done for them. This would help when porting the queries to other db engines which don’t have this flattening feature.

(18) By Simon Slavin (slavin) on 2022-11-05 12:09:00 in reply to 1 [link] [source]

In addition to the answers which have already been posted, note that anything undocumented may change in a future version of SQLite. In other words, you may have figured out which 100 rows the current version of SQLite selects, but the next update to SQLite may change the way it works internally, which may change which 100 rows you get.

If you want SQLite to select 100 rows based on a criterion, tell it what you want.

(19) By anonymous on 2022-11-05 14:57:54 in reply to 18 [link] [source]

you may have figured out which 100 rows the current version of SQLite selects

I haven’t and, as stated in previous answers, it’s not relevant.

If you want SQLite to select 100 rows based on a criterion, tell it what you want.

Quite obviously.