SQLite Forum

Usage of suboptimum index


(6.1) By Keith Medcalf (kmedcalf) on 2020-07-31 17:28:50 edited from 6.0 in reply to 5

> You could argue that this heuristic is counter-productive - if both indexes will match a single row, why not use the shorter one and save a little CPU doing key-comparisons (also, in other cases but not this one, you might save some IO as shorter indexes take up less space on disk)? I think the answer is that (a) those stats are only averages, not hard guarantees and (b) the stats might be out of date - they were current when ANALYZE was run but things might have changed since. So SQLite figures using the index that covers more fields is a better bet.

Well, actually, no, you could not argue that.

Those conditions must still be checked and passed before you can descend into the next table.  If the values are not in the index, then additional IO must be done to seek and fetch them from the table row.

Therefore, consuming more where clause conditions in the index lookup will always be lower cost than consuming fewer, because those conditions *must still be enforced* either way.
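
As a minimal sketch of the kind of case being discussed (table and index names here are made up for illustration, not taken from the original thread), suppose two indexes could both serve a three-term where clause, one consuming two of the terms and one consuming all three:

    CREATE TABLE t (a INT, b INT, c INT, d INT);
    CREATE INDEX idx_ab  ON t(a, b);
    CREATE INDEX idx_abc ON t(a, b, c);

    -- With idx_ab, SQLite seeks on (a, b) but must then fetch each
    -- candidate table row just to test c = 3 and discard mismatches.
    -- With idx_abc, all three terms are resolved inside the index seek,
    -- so rows failing c = 3 are never fetched at all.
    EXPLAIN QUERY PLAN
    SELECT d FROM t WHERE a = 1 AND b = 2 AND c = 3;
    -- expected output, something like (exact wording varies by version):
    --   SEARCH TABLE t USING INDEX idx_abc (a=? AND b=? AND c=?)

Either way the c = 3 test happens; the only question is whether it happens inside the index b-tree or via an extra fetch of the table row.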

This is why a "covering index" is more efficient than a "non-covering" index.  That applies to where conditions, select list outputs, and descent keys as well.
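
Continuing the same hypothetical sketch: if the index also carries the select-list column, the table row never needs to be visited at all.

    -- Adding d makes the index cover everything the query needs.
    CREATE INDEX idx_abcd ON t(a, b, c, d);

    EXPLAIN QUERY PLAN
    SELECT d FROM t WHERE a = 1 AND b = 2 AND c = 3;
    -- expected output, something like:
    --   SEARCH TABLE t USING COVERING INDEX idx_abcd (a=? AND b=? AND c=?)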