SQLite User Forum

Performance regression min/max in 3.33.0?
Login

Performance regression min/max in 3.33.0?

(1) By Dave Brent (davebrent) on 2021-01-12 21:04:06 [source]

Hello, I have the following query that would execute almost instantly for various datasets (primarily using sqlite 3.28.0 on MacOS). But when switching to 3.33.0 or greater there is a very noticeable performance drop when running this query. Any ideas?

SELECT events.timestamp
FROM events
WHERE events.id IN (
  SELECT MIN(events.id) id
  FROM events
  INNER JOIN categories on (events.channel = categories.id)
  WHERE categories.name not in ("Lorem", "Ipsum")
UNION
  SELECT MAX(events.id) id
  FROM events
  INNER JOIN categories on (events.channel = categories.id)
  WHERE categories.name not in ("Lorem", "Ipsum")
);

For various reasons, there are no extra indexes on the tables in question.

(2) By Richard Hipp (drh) on 2021-01-12 21:38:50 in reply to 1 [link] [source]

Do you have a schema? Maybe some sample data?

(3.1) By Dave Brent (davebrent) on 2021-01-12 21:56:31 edited from 3.0 in reply to 2 [link] [source]

My schema is

sqlite> .schema
CREATE TABLE IF NOT EXISTS "categories" (
        "id"    integer,
        "name"  text NOT NULL,
        PRIMARY KEY("id"),
        UNIQUE("name")
);
CREATE TABLE IF NOT EXISTS "labels" (
        "id"    integer,
        "name"  text NOT NULL,
        PRIMARY KEY("id"),
        UNIQUE("name")
);
CREATE TABLE IF NOT EXISTS "events" (
        "id"    integer,
        "label" integer NOT NULL,
        "channel"       integer NOT NULL,
        "process"       integer NOT NULL,
        "thread"        integer NOT NULL,
        "timestamp"     integer NOT NULL,
        "phase" text NOT NULL CHECK("phase" IN ('B', 'E', 'I', 'P')),
        PRIMARY KEY("id"),
        FOREIGN KEY("label") REFERENCES "labels"("id"),
        FOREIGN KEY("channel") REFERENCES "categories"("id"),
        FOREIGN KEY("process") REFERENCES "process"("id"),
        FOREIGN KEY("thread") REFERENCES "thread"("id")
);
CREATE TABLE IF NOT EXISTS "process" (
        "id"    integer,
        "pid"   integer NOT NULL,
        PRIMARY KEY("id"),
        UNIQUE("pid")
);
CREATE TABLE IF NOT EXISTS "thread" (
        "id"    integer,
        "tid"   integer NOT NULL,
        PRIMARY KEY("id"),
        UNIQUE("tid")
);

And Ive uploaded an example (trimmed down) database here https://filebin.net/ncwkgasmyau99waa

(4) By Richard Hipp (drh) on 2021-01-13 00:27:53 in reply to 3.1 [link] [source]

Thank you for the data. I can now repro the problem. It appears to be caused by check-in b8ba2f17f938c035 which, ironically, was intended to make things run faster, not slower. The change is "safe" in the sense that it still gets the correct answer. But it appears to have lost one optimization as it picked up another.

It will probably take a few hours (and some sleep and some strong tea and dark chocolate) to fix this. So it might be a day or two before a solution is on trunk.

(5) By Dave Brent (davebrent) on 2021-01-13 08:40:08 in reply to 4 [link] [source]

No worries, thank you for even looking into it!

(6) By Richard Hipp (drh) on 2021-01-13 15:32:37 in reply to 1 [link] [source]

A possible fix to this problem can be seen in check-in 188772a1dbaf066f. Please try out the new code and report back whether or not it clears your problem.

The fix has not yet landed on trunk, because I want to spend more time testing and verifying it's correctness first.

(7) By Dave Brent (davebrent) on 2021-01-13 20:34:23 in reply to 6 [link] [source]

I can confirm it does indeed fix my problem, thats absolutely fair enough, thank you for such a quick patch!