SQLite Forum

Query planner fails to use obvious index
Login
This does make a difference.  At least in the case where STAT4 statistics are being collected.

```
sqlite> create table test (id integer primary key, key_id integer);
sqlite> create index test_key_idx on test(key_id);
sqlite> select min(id) from test where key_id >= 1;
QUERY PLAN
`--SEARCH TABLE test (~983040 rows)
┌─────────┐
│ min(id) │
├─────────┤
│         │
└─────────┘
sqlite> insert into test (key_id) select randomv(100)-98 from wholenumber where value between 1 and 100000;
QUERY PLAN
`--SCAN TABLE wholenumber VIRTUAL TABLE INDEX 10: (~24 rows)
sqlite> select min(id) from test where key_id >= 1;
QUERY PLAN
`--SEARCH TABLE test (~896 rows)
┌─────────┐
│ min(id) │
├─────────┤
│ 132     │
└─────────┘
sqlite> analyze;
sqlite> select min(id) from test where key_id >= 1;
QUERY PLAN
`--SEARCH TABLE test USING COVERING INDEX test_key_idx (key_id>?) (~1024 rows)
┌─────────┐
│ min(id) │
├─────────┤
│ 132     │
└─────────┘
```

You will note that no matter what you do, you will have to scan either the whole table or the whole index.  Without histogram data there is no way to determine whether it is more efficient to search the id's in order looking for a key_id greater or equal to 1, or whether to search the entire index starting from the first key_id >= 1 looking for the minimum id.

If you place a constraint which makes one or the other the obvious default solution, then that is the method used (ie, the constraint BETWEEN on key_id) places a constraint on the number of rows scanned, so therefore it is chosen as being the quickest method of culling candidates because it will (in the worst case) *always* result in less rows being scanned than the worst case for a full table scan.