SQLite User Forum

Performance regression in version 3.40.2 using views on indexed tables
Login

Performance regression in version 3.40.2 using views on indexed tables

(1) By Jiba (Jibalamy) on 2023-03-23 14:30:43 [link] [source]

Hello, 

I encounter a performance regression with version 3.40.2 (compared to 3.40.1), when querying a view that is the union of two tables, both having a similar index.

The entire database can be downloaded here:

https://filesender.renater.fr/?s=download&token=4e4f85a6-0790-47a9-8139-eba61944c029

The "quads" view is defined as follows:

CREATE VIEW quads AS
SELECT c,s,p,o,NULL AS d FROM objs UNION ALL
SELECT c,s,p,o,d         FROM datas

where both tables "objs" and "datas" have similar columns (data have "d" in addition), and both have an index on column "o":

CREATE INDEX index_objs_sp  ON objs(s,p);
CREATE INDEX index_datas_sp ON datas(s,p);


My query is :
SELECT o,d FROM quads WHERE s=303 AND p=9;

On 3.40.1, the query plan shows that indexes are used, and performances are good:

Run Time: real 0.009 user 0.002006 sys 0.000000
QUERY PLAN
|--CO-ROUTINE quads
|  `--COMPOUND QUERY
|     |--LEFT-MOST SUBQUERY
|     |  `--SEARCH objs USING INDEX index_objs_sp (s=? AND p=?)
|     `--UNION ALL
|        `--SEARCH datas USING INDEX index_datas_sp (s=? AND p=?)
`--SCAN quads

But with 3.40.2, indexes are not used and performances are much lower:

Run Time: real 0.120 user 0.120538 sys 0.000000
QUERY PLAN
|--CO-ROUTINE quads
|  `--COMPOUND QUERY
|     |--LEFT-MOST SUBQUERY
|     |  `--SCAN objs
|     `--UNION ALL
|        `--SCAN datas
`--SCAN quads

(2) By Richard Hipp (drh) on 2023-03-23 15:26:38 in reply to 1 [link] [source]

This is the same problem as was described by the prior forum post bcc4375032 on 2023-02-22. The issue has been resolved on trunk by check-in aa6bd6dff751223e (2023-02-22). The fix was not added to the 3.41.1 and 3.41.2 patches because it is only a performance concern, not an incorrect answer. The fix will appear in 3.42.0 release.

Please test this fix by downloading and using the latest trunk version of SQLite. You can always see the latest trunk version at https://sqlite.org/src/info/trunk. There are download lines to the right of the "Downloads:" label near the top of the page. The trunk version is almost always stable. Your beta-testing is appreciated.

You can also get a graphical view showing the most recent changes to SQLite and capture the latest trunk version from there: https://sqlite.org/src/timeline?n=30.

See the README file for additional instructions on how to obtain and compile the latest check-ins of SQLite.

(3) By Jiba (Jibalamy) on 2023-03-23 17:15:08 in reply to 2 [link] [source]

I confirm that it works with the trunk.

(4) By anonymous on 2023-03-23 20:30:34 in reply to 2 [source]

FWIW, this performance concern has been observed to change a key query on a 300MB database from O(10ms) to O(3s), so a 3-order-of-magnitude slowdown.

(5) By jmc (jmc123) on 2023-04-04 18:27:12 in reply to 2 [link] [source]

SQLite 3.41.2 with Tcl interface on WIN 7 or on WIN 10 Pro

Not sure if is the same pb here, but we are fighting since 2 days with what seems an infinite loop resulting from a heavy (8 way join) INSERT query in a stable application, involved tables' shema unchanged etc.
This specific query did its duty without pb on SQLite 3.40 in 58 ms and has'nt finished running 3 hours later in 3.41.2 version (had to kill the process afer that delay)

From our obervations :

if the query is modified with "CROSS JOIN" operator instead of ", " join operator, the "infinite loop" disapear

if the original query is simplified to be an only 7 way join (suppressing one table in the FROM clause, in various combinations) the "infinite loop" disapear

We wrote a simplified demo script with a 8 way join using ", " operator but unable to reproduce

When quitting the application, "PRAGMA analysis_limit = 400" and "PRAGMA dbfile.optimize" are run in the exit process but change nothing (involved tables are in "dbfile") : after re-launching the application, the "infinite loop" is still there

However, if "ANALYZE" is run, then the "infinite loop" disapear

To give a more precise idea of this sole query suddenly having a pb among hundreds queries, below is its copy & paste verbatim (not nice, sorry for this) :

db1 eval
"INSERT INTO Bulletins0 ( $TotChampsCibles )
 SELECT $TotChampsSources
 FROM SalariesHORAIRESREGRPT AS Tsal,
   LiaisonRubriquesAssiettesCotis, RecapMens2, Etablissements, Etablissements AS Etablissements_1, Societes, NomenclatCONTRATS, $tblmem
 WHERE Tsal.CodPlan   = '$res(CodPlan)'
   AND Tsal.RegimeCOT = '$res(RegimeCOT)'
   AND Tsal.AgeSalarie <= ifnull(nullif(Tsal.Sexe,2) * $res(AgeMax1), $res(AgeMax2))
   AND abs(min(Tsal.SousPeriode * $res(Rattach),0)) = $res(RattachGroupIndex)
   AND LiaisonRubriquesAssiettesCotis.CodPlan = '$res(CodPlan)'
   AND LiaisonRubriquesAssiettesCotis.CodDefSetAssiettes = Tsal.CodDefSetAssiettes
   AND LiaisonRubriquesAssiettesCotis.NumAssiette = $res(NumAssiette)
   AND '$::tclGlobalAnnee-$::tclGlobalMdec' BETWEEN LiaisonRubriquesAssiettesCotis.DatEffet And LiaisonRubriquesAssiettesCotis.FinValidite
   AND RecapMens2.Matricule = Tsal.Matricule AND RecapMens2.AnneeMois >= Tsal.DebutExercice
   AND Etablissements_1.CodEtabliss = RecapMens2.CodEtabliss
   AND CASE LiaisonRubriquesAssiettesCotis.NivCalc WHEN 1 THEN Etablissements_1.CodPlan = '$res(CodPlan)' AND Etablissements_1.CodSoc = Tsal.CodSoc
                                                   WHEN 2 THEN Etablissements_1.CodPlan = '$res(CodPlan)' AND Etablissements_1.CodSoc = Tsal.CodSoc AND Etablissements_1.Area = Tsal.Area
                                                   WHEN 3 THEN RecapMens2.CodEtabliss = Tsal.CodEtabliss
                                                   WHEN 4 THEN RecapMens2.RefContrat  = Tsal.RefContrat
       END
   AND Etablissements.CodEtabliss = Tsal.CodEtabliss
   AND Etablissements.RegCotEtab = '$res(RegCotEtab)'
   AND Societes.CodSoc = Tsal.CodSoc AND Societes.CodPlan = '$res(CodPlan)'
   AND $res(ZoneEffectif) BETWEEN $res(EffMin_1) AND $res(EffMax_1)
   AND NomenclatCONTRATS.CodPlan = '$res(CodPlan)' AND NomenclatCONTRATS.CodContrat = Tsal.CodContrat
   AND NomenclatCONTRATS.Droit IS NOT NULL
   AND ${tblmem}.IDPayeMens = Tsal.IDPayeMens AND ${tblmem}.NumAssiette = $res(NumAssiette)
 GROUP BY Tsal.IDPayeMens
 HAVING $Condition
 ON CONFLICT (IDPayeMens, RubrUser) DO UPDATE
    SET Taux   = $res(TxSalarial) * NOT $res(CotisForfaitSal),
        TxPatr = $res(TxPatr)     * NOT $res(CotisForfaitPat),
        CodCot = '$res(CodCot)',
        CotRow = $res(UID)
    WHERE Bulletins0.NPS = 2"


When stwitched from SQLite version 3.41.2 back to 3.40, the problem disapear


Below are the query plans generated on each version :

Version 3.40.0

average execution time : 58ms

19 0 0 {SCAN Tsal USING INDEX sqlite_autoindex_SalariesHORAIRESREGRPT_1}
42 0 0 {BLOOM FILTER ON LiaisonRubriquesAssiettesCotis (CodPlan=? AND CodDefSetAssiettes=? AND NumAssiette=?)}
58 0 0 {BLOOM FILTER ON Societes (CodSoc=? AND CodPlan=?)}
71 0 0 {BLOOM FILTER ON NomenclatCONTRATS (CodPlan=? AND CodContrat=?)}
82 0 0 {BLOOM FILTER ON Etablissements (CodEtabliss=?)}
90 0 0 {SEARCH LiaisonRubriquesAssiettesCotis USING PRIMARY KEY (CodPlan=? AND CodDefSetAssiettes=? AND NumAssiette=? AND FinValidite>?)}
120 0 0 {SEARCH Societes USING INDEX sqlite_autoindex_Societes_1 (CodSoc=? AND CodPlan=?)}
131 0 0 {SEARCH NomenclatCONTRATS USING PRIMARY KEY (CodPlan=? AND CodContrat=?)}
143 0 0 {SEARCH _Tranches USING INDEX sqlite_autoindex__Tranches_1 (IDPayeMens=? AND NumAssiette=?)}
165 0 0 {SEARCH RecapMens2 USING AUTOMATIC COVERING INDEX (Matricule=?)}
174 0 0 {SEARCH Etablissements USING INTEGER PRIMARY KEY (rowid=?)}
179 0 0 {SEARCH Etablissements_1 USING INTEGER PRIMARY KEY (rowid=?)}

#############################

Version 3.41.2 before ANALYZE (infinite loop ?)

average execution time : unknown (> 3 hours)

15 0 0 {SCAN Etablissements}
19 0 0 {SEARCH NomenclatCONTRATS USING PRIMARY KEY (CodPlan=?)}
25 0 0 {SCAN Etablissements_1}
43 0 0 {SEARCH RecapMens2 USING AUTOMATIC COVERING INDEX (CodEtabliss=?)}
48 0 0 {SCAN Societes}
73 0 0 {SEARCH _Tranches USING AUTOMATIC PARTIAL COVERING INDEX (NumAssiette=?)}
78 0 0 {SEARCH Tsal USING INDEX sqlite_autoindex_SalariesHORAIRESREGRPT_1 (IDPayeMens=?)}
118 0 0 {SEARCH LiaisonRubriquesAssiettesCotis USING PRIMARY KEY (CodPlan=? AND CodDefSetAssiettes=? AND NumAssiette=? AND FinValidite>?)}
177 0 0 {USE TEMP B-TREE FOR GROUP BY}

#############################

Version 3.41.2 after ANALYZE (ok)

average execution time : 60ms

19 0 0 {SCAN Tsal USING INDEX sqlite_autoindex_SalariesHORAIRESREGRPT_1}
42 0 0 {BLOOM FILTER ON LiaisonRubriquesAssiettesCotis (CodPlan=? AND CodDefSetAssiettes=? AND NumAssiette=?)}
58 0 0 {BLOOM FILTER ON Societes (CodSoc=? AND CodPlan=?)}
71 0 0 {BLOOM FILTER ON NomenclatCONTRATS (CodPlan=? AND CodContrat=?)}
82 0 0 {SEARCH LiaisonRubriquesAssiettesCotis USING PRIMARY KEY (CodPlan=? AND CodDefSetAssiettes=? AND NumAssiette=? AND FinValidite>?)}
109 0 0 {SEARCH Societes USING INDEX sqlite_autoindex_Societes_1 (CodSoc=? AND CodPlan=?)}
120 0 0 {SEARCH NomenclatCONTRATS USING PRIMARY KEY (CodPlan=? AND CodContrat=?)}
132 0 0 {SEARCH _Tranches USING INDEX sqlite_autoindex__Tranches_1 (IDPayeMens=? AND NumAssiette=?)}
154 0 0 {SEARCH RecapMens2 USING AUTOMATIC COVERING INDEX (Matricule=?)}
163 0 0 {SEARCH Etablissements USING INTEGER PRIMARY KEY (rowid=?)}
168 0 0 {SEARCH Etablissements_1 USING INTEGER PRIMARY KEY (rowid=?)}