index to help with selection and ordering
(1) By Mark Wagner (markxwagner) on 2021-05-26 23:19:10 [link] [source]
I'm sure this must be faq and perhaps obvious but I appear to not see it.
It seems difficult to create an index which helps with both selection and ordering. For example, imagine a table with a timestamp column and a name column. Now imagine wanting to select all rows whose timestamp is greater than X and then order that result by name.
From what I understand, only one index can be used per table on a given query. And a multi column index won't help because in this example both columns would need to be in the first position of the index.
Is there an approach that solves this goal of making both the selection and the ordering use an index?
The job of the query planner is to choose the appropriate index.
For example, if you have a table of 1 billion rows, a where condition which selects 5 rows, and a choice of whether to use (a) an index for the WHERE clause then sort the result or (b) scan the index in the result order and only output matching rows, then clearly one should choose the former (a) because is is more efficient to quickly locate the 5 rows and then sort them than to scan all 1 billion rows in the right order then only output 5 of them.
However, if the where clause selected all rows except for 5, then clearly the method chosen should be the latter as it is faster to output the rows in order after scanning them all, than it is to sort them.
However, there is a point at which it makes no difference which method is chosen.
So the question of how to do the unnecessary is of zero utility.
There is no such approach, and it is not feasible in any RDBMS.
It's a first-principle problem. If I told you to go through a phone book and find all the names starting with "wagner" and then sort those by phone number. Finding the names would be easy, but what kind of an index would help you sort that extracted list then by phone number? A main index on phone number, or even on name+phone-number, or even covering index would all offer Zero help for an extracted data set.
If however you have two single indexes on name and on phone number and you gather all names except the "wagner"s then scanning the table in row-order for phone number to produce a list ordered so would be helpful, but it's a sliding scale that is much more often than not heavily outweighed on the side of locating the requested rows by index rather than walking the index for sorting purposes.
I know of no RDBMS that even bothers with an optimization that suggests walking the index in sort order rather than lookup. Perhaps if no lookup index exists but one exists for the ordering? But by then your DB design and speed problems are much worse than it should be.
(5) By Mark Wagner (markxwagner) on 2021-05-27 07:14:21 in reply to 4 [link] [source]
Thinking through my scenario a bit more I conclude that the temp b=tree for ordor by is far less of an issue compared to the table scan. In other words, if the index is used to retrieve a reasonable number of rows (using table search) then the sorting is a pretty much a non-issue. If the select is returning a large number of rows then the order by is a problem but the whole query is problematic.