SQLite Forum

Performance issue for selecting by "index on function" in some cases.
Login

Performance issue for selecting by "index on function" in some cases.

(1.1) By Vyacheslav (slavasw8) on 2023-02-10 14:11:38 edited from 1.0 [source]

Hello!
It seems that there is a performance issue in SQLite. In some cases (index by function + "and" in 'where' condition) index is not used in a 'where' clause.
Initially the issue was reproduced only for queries that consist of more than 1 tables (join, etc..).
However, starting from version 3.36 (to be exact from rev. b775c976822b1951506bf01c8cc6b223ba079627) the issue spread to selects from single table.

Simple example, "abs()" can be replaced with any function:

CREATE TABLE "test1" (
    "Id"    INTEGER NOT NULL,
    "value"    INTEGER,
    PRIMARY KEY("Id" AUTOINCREMENT)
);
CREATE INDEX "test1_abs_value_index" ON "test1" ( abs(value) );

Lets compare results of different SQLite versions:
1. version 3.35
   EXPLAIN QUERY PLAN select * from test1 where abs(value) = abs(1) and value = 1;
   `--SEARCH TABLE test1 USING INDEX test1_abs_value_index (<expr>=?)

2. version 3.36
   EXPLAIN QUERY PLAN select * from test1 where abs(value) = abs(1) and value = 1;
   `--SCAN test1 

Using similar example with join clause shows that bug exists on any SQLite version:

CREATE TABLE IF NOT EXISTS "test2" ("Id"  INTEGER NOT NULL,  PRIMARY KEY("Id" AUTOINCREMENT));
CREATE TABLE IF NOT EXISTS "test1" (
    "Id"    INTEGER NOT NULL,
    "value"    INTEGER,
    PRIMARY KEY("Id" AUTOINCREMENT)
);
CREATE INDEX IF NOT EXISTS "test1_abs_value_index" ON "test1" ( abs(value) );

1. EXPLAIN QUERY PLAN select * from test1 join test2 on test1.id=test2.id where abs(value) = abs(1);
  |--SEARCH TABLE test1 USING INDEX test1_abs_value_index (<expr>=?)
  `--SEARCH TABLE test2 USING INTEGER PRIMARY KEY (rowid=?)

2. EXPLAIN QUERY PLAN select * from test1 join test2 on test1.id=test2.id where abs(value) = abs(1) and value=1;
  |--SCAN TABLE test1
  `--SEARCH TABLE test2 USING INTEGER PRIMARY KEY (rowid=?)

The issue appeared in propagateConstants() method, that is one of optimizations, but seems to change expression tree in some "incorrect" way.
There were changes on the revision b775c976822b1951506bf01c8cc6b223ba079627 that weaker condition (removed pTabList->nSrc>1)
so the propagateConstants() called for selects from single table.

(2) By Stephan Beal (stephan) on 2023-02-10 13:33:35 in reply to 1.0 [link] [source]

However, starting from version 2.36 (to be exact from rev. b775c976822b1951506bf01c8cc6b223ba079627) the issue spread to selects from single table.

There is neither an sqlite3 2.3x (2.8.x was the last of the 2.x series) nor does that hash resolve to any checkin in the canonical sqlite source tree. Presumably you're referring to github hashes, which aren't valid here: github has a read-only mirror of the canonical tree, provided for the benefit of folks who insist on git, but is otherwise unused by the sqlite devs.

(3) By Vyacheslav (slavasw8) on 2023-02-10 14:10:12 in reply to 2 [link] [source]

Sorry, version 3.35 and 3.36
Commit from github
https://github.com/sqlite/sqlite/commit/b775c976822b1951506bf01c8cc6b223ba079627

(4) By jose isaias cabrera (jicman) on 2023-02-10 14:39:50 in reply to 1.1 [link] [source]

3.41.0 behaves the same...

.version
SQLite 3.41.0 2023-02-01 15:41:07 66f29c403d28630bfaea9124bd63ee4a047b1fe4a7e27dc5d10d67d1601b15e0
gcc-11.3.0
CREATE TABLE "test1" (
    "Id"    INTEGER NOT NULL,
    "value"    INTEGER,
    PRIMARY KEY("Id" AUTOINCREMENT)
);
VM-steps: 54
Run Time: real 0.000 user 0.000000 sys 0.000000
CREATE INDEX "test1_abs_value_index" ON "test1" ( abs(value) );
VM-steps: 26
Run Time: real 0.000 user 0.000000 sys 0.000000
EXPLAIN QUERY PLAN select * from test1 where abs(value) = abs(1) and value = 1;
QUERY PLAN
`--SCAN test1
VM-steps: 0
Run Time: real 0.000 user 0.000000 sys 0.000000

(5) By Richard Hipp (drh) on 2023-02-10 18:24:55 in reply to 1.1 [link] [source]

The problem arises due to the introduction of the constant propagation optimization.

Here is a simplification of your problem:

CREATE TABLE t1(x INT);
INSERT INTO t1 VALUES(1);
CREATE INDEX idx ON t1(abs(x));
SELECT * FROM t1 WHERE abs(x)=1 AND x=1;

You are concerned that the query does a full table scan and does not use the index on abs(a). It does this becaue the constant propagation optimization has observed that x always have a value of 1, so it can replace "x" with "1" in most places of the query. So the query is transformed into:

SELECT * FROM t1 WHERE abs(1)=1 AND x=1;
      /* "x" becomes "1" --^   */

But now there are no terms in the WHERE clause that match the index and so the index cannot be used.

One simple work-around is to replace "x=1" in your original query with "+x=1". In other words, put a unary "+" in front of the "x". This defeats the constant propagation optimization, and causes the index to be used again.

(6) By Richard Hipp (drh) on 2023-02-10 21:56:29 in reply to 5 [link] [source]

The 44200596aa943963 check-in works around the problem and gives the same query plan with the constant propagation optimization and without.

(7) By Vyacheslav (slavasw8) on 2023-02-13 14:26:54 in reply to 6 [link] [source]

It works as expected after this commit.
Thanks.