Getting a date range using "like"
I have a table with a timestamp column, and I'd like to query it for date ranges such as "all of 2010", "all of May 2020", "25th December 1991" etc. i.e. things that are prefixes of the date-time when expressed as a string. So naively I thought I would try to use "like" patterns:
timestamp like "2010-%" timestamp like "2020-05-%" timestamp like "1991-12-25%"
That gives the right answers. But I think it's always doing sequential scans. Is there any way that it can do this using an index? In principle a "like" query that just tests a prefix should be implementable using an index, shouldn't it?
An index is used if I rewrite the query as e.g. timestamp >= '2020-05' and timestamp < '2020-06', but forming that sort of query is a bit tedious in general due to the complexities of dates.
How would you do this?
(2) By TripeHound on 2021-08-08 15:30:46 in reply to 1 [link] [source]
Can't test at the moment, but I seem to remember that
LIKE uses a
NOCASE search, and a default index is case sensitive. You might try making the timestamp column, or the index on it,
(3) By TripeHound on 2021-08-08 20:38:30 in reply to 1 [link] [source]
Can confirm my hunch. Using default collation for field and index:
create table t ( timestamp text, data ) ; insert into t values ( '2021-08-02', 'date1' ), ( '2021-08-23', 'date2' ), ( '2021-09-01', 'date3' ) ; create index idx on t( timestamp ) ; .eqp on
produces a table-scan:
-- Using default index select * from t where timestamp like '2021-08-%' ; QUERY PLAN `--SCAN TABLE t 2021-08-02|date1 2021-08-23|date2
Replacing the index with a
COLLATE NOCASE one:
drop index idx ; create index idx on t( timestamp collate nocase ) ;
uses the index:
-- Using NOCASE index select * from t where timestamp like '2021-08-%' ; QUERY PLAN `--SEARCH TABLE t USING INDEX idx (timestamp>? AND timestamp<?) 2021-08-02|date1 2021-08-23|date2
Alternatively, creating the original field as
COLLATE NOCASE and using a default index:
drop table t ; create table t ( timestamp text collate nocase, data ) ; insert into t values ( '2021-08-02', 'date1' ), ( '2021-08-23', 'date2' ), ( '2021-09-01', 'date3' ) ; QUERY PLAN `--SCAN 3 CONSTANT ROWS create index idx on t( timestamp ) ;
also uses the index:
-- Using NOCASE collation select * from t where timestamp like '2021-08-%' ; QUERY PLAN `--SEARCH TABLE t USING INDEX idx (timestamp>? AND timestamp<?) 2021-08-02|date1 2021-08-23|date2
This should do the wanted thing with a case sensitive column:
timestamp GLOB '2010-*'
Thanks everyone. I have this working now. I was unaware that like is case-insensitive; in PostgreSQL there is a separate "ilike".
As a further complication, I just realised that I had created the column with type "datetime" - which gives it integer affinity. I've now changed it to text.
A declared type of
datetime gives that attribute NUMERIC affinity, not INTEGER affinity.
See https://sqlite.org/datatype3.html in particular section 3.1 and the last row in the table at 3.1.1
What Keith said, plus the fact that NUMERIC is perfect for a column with date and time values and should work perfectly for your case.
Also, in SQLite there is a pragma "case_sensitive_like" if you fancy changing that behaviour.