SQLite Forum

How to speed up this query?
Login

How to speed up this query?

(1) By anonymous on 2020-06-18 08:25:05 [link] [source]

Hello,

I have this query:

SELECT 
  (SELECT id FROM meta2 WHERE (meta2.parameter, meta2.value)=(a.parameter, a.value)) AS metaid, 
  t.id AS traceid, 
  NULL AS itemid, 
  a.id AS oldid 
FROM meta a 
LEFT JOIN trace t ON a.datasetid=t.datasetid 
WHERE a.itemid IS NULL AND a.traceid IS NULL AND a.datasetid IS NOT NULL;

It takes about 45 seconds and I'm wondering why.

An EXPLAIN QUERY plan gives the following result:

QUERY PLAN
|--SEARCH TABLE meta AS a USING INDEX metadata_idx_05 (traceid=? AND itemid=?)
|--SEARCH TABLE trace AS t USING COVERING INDEX trace_idx_03 (datasetid=?)
`--CORRELATED SCALAR SUBQUERY 1
   `--SCAN TABLE meta2

So why is it using "SCAN TABLE meta2" without an index?

Trying to find the problem, I have created almost any index:

CREATE INDEX idx_meta2_parameter_value on meta2 (parameter, value);
CREATE INDEX idx_meta2_value on meta2 (value);
CREATE INDEX idx_meta2_parameter on meta2 (parameter);
CREATE INDEX idxnc_meta2_parameter on meta2 (parameter collate nocase);
CREATE INDEX idxnc_meta2_value on meta2 (value collate nocase);
CREATE INDEX idxnc_meta2_parameter_value on meta2 (parameter collate nocase, value collate nocase);
CREATE UNIQUE INDEX uidxnc_meta2_parameter_value on meta2 (parameter collate nocase, value collate nocase);
CREATE INDEX idxnc_meta2_id_parameter_value on meta2 (id, parameter collate nocase, value collate nocase);

Why does the query not use any index? 

The table declaration is:

CREATE TABLE meta2 (
    id INTEGER PRIMARY KEY AUTOINCREMENT UNIQUE NOT NULL,
    parameter TEXT NOT NULL COLLATE NOCASE,
    value TEXT NOT NULL COLLATE NOCASE,
    UNIQUE (parameter, value)
    ON CONFLICT IGNORE
);


Results are the same in versions 3.31.1 and 3.30.1.

(2) By David Raymond (dvdraymond) on 2020-06-18 11:32:53 in reply to 1 [link] [source]

Well, in 3.32.2 and in 3.31.1, when I make up tables to match meta and trace, and run an EQP, it gives me what you're looking for:

QUERY PLAN
|--SEARCH TABLE meta AS a USING INDEX metadata_idx_05 (traceid=? AND itemid=?)
|--SEARCH TABLE trace AS t USING COVERING INDEX trace_idx_03 (datasetid=?)
`--CORRELATED SCALAR SUBQUERY 1
   `--SEARCH TABLE meta2 USING COVERING INDEX sqlite_autoindex_meta2_2 (parameter=? AND value=?)

So some comments and questions:
What are the full create statements for data and trace?
Have you run analyze?
Were there any changed compile-time options if you compiled it yourself?

"primary key unique" is redundant, and in older versions would actually create 2 indexes, one for primary key, and another for unique.

The unique constraint in the create table makes an index on (parameter, value), so you don't need to create a separate one. (Though adding it may have just been from your experimenting)

(3) By Clemens Ladisch (cladisch) on 2020-06-18 13:10:14 in reply to 1 [source]

If I had to guess, I'd suspect that the parameter/value comparison cannot be done with the index because the columns in the meta table use a different collation.

(4) By anonymous on 2020-06-19 10:27:05 in reply to 3 [link] [source]

Dear all and especially Clemens, thanks for your comments.

I have found the problem in an incorrect table declaration of table "meta". I had used (which happens quite frequently to me)

... parameter STRING NOT NULL, value STRING NOT NULL ...

instead of "TEXT" 🙈

So with that having been changed, it works as expected:

QUERY PLAN |--SCAN TABLE meta AS a --CORRELATED SCALAR SUBQUERY 1 --SEARCH TABLE meta2 USING COVERING INDEX uidxnc_meta2_parameter_value (parameter=? AND value=?)

It is, however, interesting that both partial queries (i.e. matching only on "parameter" or on "value", but not both) had always (even with the wrong table declaration) used the appropriate indexes:

QUERY PLAN |--SCAN TABLE meta AS a USING COVERING INDEX metadata_idx_09 --CORRELATED SCALAR SUBQUERY 1 --SEARCH TABLE meta2 USING COVERING INDEX idxnc_meta2_parameter (parameter=?)

QUERY PLAN |--SCAN TABLE meta AS a --CORRELATED SCALAR SUBQUERY 1 --SEARCH TABLE meta2 USING COVERING INDEX idxnc_meta2_value (value=?)

So I'm not sure whether the behavior I observed is a bug or a feature. But thanks for helping!