Regression report from 3.46.1
(1) By Balaji Ramanathan (balaji) on 2024-12-21 01:39:15 [link] [source]
I have a hobby database in SQLite whose performance has gone down drastically in 3.47.2 compared to its performance in 3.46.1.
I have removed all identifying information in my database to anonymize it before running the following queries in 3.46.1.
sqlite> select count(*) from TripDetailsPlusCumulative;
4048
Run Time: real 1.254 user 0.140625 sys 0.031250
sqlite> select count(*) from TripSummary;
113
Run Time: real 108.010 user 40.875000 sys 7.437500
sqlite>
When I run the same queries in 3.47.2, I had to kill the queries after an hour, they never completed to provide me a timing.
I am happy to share this anonymized database if it will help with debugging this regression. You can find test.db at this URL: https://drive.google.com/file/d/1mEEK6tBC9UkbQYJ89p5tJo3NktpR1CkZ/view?usp=sharing
Thank you.
(2) By Roger Binns (rogerbinns) on 2024-12-21 15:18:52 in reply to 1 [link] [source]
That is a truly impressive schema you have there. I'm surprised it works at all! For those that haven't looked, it contains large numbers of views, union all, joins, text formatting, distinct, avg, count, sum, group by, order by, and case.
You can run EXPLAIN QUERY PLAN to see what work SQLite does to execute the queries above. The first has 449 nodes and the second 28,618! Turn logging on in the shell and you'll see SQLite also creates automatic indexes on the views. The first query is 32 co-routines deep while the second does that over and over again. This will make your queries very sensitive to exponential explosion in query time. I suggest reading about the query optimizer to get an idea what SQLite can do faster. For example removing unneeded ORDER BY will improve things.
I'd recommend using the query plans to reduce the complexity of what is going on.
(3) By Balaji Ramanathan (balaji) on 2024-12-21 15:48:03 in reply to 2 [link] [source]
Thank you, Roger. Yes, it is a pretty involved schema, but the tables are designed in 3NF (or close), and the views are quite performant for my needs. As I posted originally, the first view (TripDetailsPlusCumulative) usually runs in about a second on SQLite versions before 3.47. The TripSummary view involves a bunch of percentile calculations using window functions, so it is a little slower, but still runs in under 2 minutes. 3.47 breaks both the queries completely, and I think some query optimization between 3.46 and 3.47 had the unintended effect of creating an infinite loop somewhere since it is not clear if the views will ever produce results in 3.47.
Perhaps the queries can be simplified and their performance can be improved, but I would be happy to just have the previous performance back without this significant regression.
(4) By anonymous on 2024-12-23 08:39:48 in reply to 1 [link] [source]
Have you run analyze after upgrade to 3.47.2?
(5) By Roger Binns (rogerbinns) on 2024-12-23 16:23:27 in reply to 4 [link] [source]
I ran analyze on the database and it made no difference. The underlying issue here is not the usual SQLite query optimization - the schema is of staggering complexity combining (union all) many different views, joins, text formatting etc. The query plans are ~500 and ~28k steps long! I did check everything was indexed, but views don't have indexes, even though the query planner does then create automatic indexes on some of them. The combinatorial explosion in the query is what has to be addressed.
(6) By Balaji Ramanathan (balaji) on 2024-12-23 23:52:55 in reply to 5 [link] [source]
The schema, indexes, data in the tables, nothing has changed between 3.46.1 and 3.47. Only the code that is being executed has changed. So, it requires a bisection of all the check-ins between 3.46.1 and 3.47 to figure out which check-in broke it, and what the code change was that broke it.
(7) By Balaji Ramanathan (balaji) on 2024-12-23 23:56:42 in reply to 4 [link] [source]
Yes, I analyze and vacuum the database every time before closing the CLI.
(8) By Roger Binns (rogerbinns) on 2024-12-24 14:47:44 in reply to 6 [source]
So, it requires ...
You have choices.
Purchase professional support
Stick to the version of SQLite that works for you
Wait however long it takes for a future SQLite version that may restore your prior observed performance without any changes by you
Help to reduce the size and complexity of reproducing the problem, finding alternate query formulations that don't have the problem, query tweaks, and similar.
Note that number 3 may never happen. The code changes may have been for correctness (and so wouldn't be undone), or they may have made many other queries faster. The resolution may be having to give hints to the query planner.
Number 4 is the most constructive. Some examples of what you could do:
- Figure out which version the change occurred in (3.47.0, .1, .2)
- Use fossil bisesct on the SQLite source
- Compare query plans between versions
- Find a smaller reproducer - for example you can use a minimizer (cvise is my favourite) with the timeout command feeding a dump of the database and the query into the shell
- Explain some of your query complexity and ask for suggestions of simplification, or alternate approaches
- Ask for suggestions of alternate approaches - for example maintaining information via triggers, or materializing views in a temp table