v3.40.0 very slow with UNIONs (or so I think)
(1) By jose isaias cabrera (jicman) on 2022-11-22 21:51:04 [source]
Greetings!
I have a query that Mr. Medcalf helped me create, which returns all changes to a field for a given project. This query,
SELECT ProjID,
Updated_By,
InsertDate,
var,
oldv,
newv
FROM
(
SELECT ProjID,
Updated_By,
InsertDate,
'Finish_Date' as var,
(
SELECT
coalesce(Finish_Date,'') FROM Project_List
WHERE ProjID = o.ProjID
AND InsertDate < o.InsertDate
ORDER BY InsertDate DESC
LIMIT 1
) AS oldv,
coalesce(Finish_Date,'') as newv
FROM Project_List as o
UNION
SELECT ProjID,
Updated_By,
InsertDate,
'Ann_CapexP' as var,
(
SELECT
replace(round(Ann_CapexP),'.0','') FROM Project_List WHERE
ProjID = o.ProjID
AND InsertDate < o.InsertDate
ORDER BY InsertDate DESC
LIMIT 1
) AS oldv,
replace(round(Ann_CapexP),'.0','') as newv
FROM Project_List as o
)
WHERE oldv <> newv
AND ProjID = 'PR0000020614'
ORDER BY InsertDate ASC;
grabs 8 records and takes 0.01 secs to execute on v3.39.4. On v3.40.0, the same query grabs 8 records, but it takes 4.44 secs. This query covers just two fields. When I add all the desired fields for a record, v3.39.4 grabs 125 records and takes 0.02 secs to execute, while v3.40.0 grabs 125 records and takes 379.41 secs. I ran ANALYZE, thinking that it would fix things, but it does not. If you give me a spot, I will try to get you the db, but everything is so locked up here that sometimes I have to do 'unorthodox' maneuvering to get things out. I had to revert to 3.39.4. The SQLite versions being used are the vanilla SQLite Windows DLLs from the download site.
SQLiteVer: 3.39.4 2022-09-29 15:55:41
SQLiteVer: 3.40.0 2022-11-16 12:10:08
Thanks for the support.
josé
(2) By anonymous on 2022-11-22 22:30:34 in reply to 1 [link] [source]
Can you check the query plan for each, and then compare them? Seems like the query planner is choosing something less than optimal (obviously).
EXPLAIN ....
and
EXPLAIN QUERY PLAN ...
on the two versions.
(also, making sure the 3.40 version compile options are similar to your earlier configuration is worthwhile).
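One way to capture the version, the compile options, and the plan in one go is a small script that is run once under each library version. This is only a minimal sketch using Python's standard sqlite3 module; the describe helper and the stand-in table t are illustrative assumptions, not anything from the original database:

```python
import sqlite3

def describe(con, sql, params=()):
    """Return the SQLite version, its compile options, and the plan for sql."""
    version = con.execute("SELECT sqlite_version()").fetchone()[0]
    options = [row[0] for row in con.execute("PRAGMA compile_options")]
    # The human-readable plan step is the fourth column of each
    # EXPLAIN QUERY PLAN row.
    plan = [row[3] for row in con.execute("EXPLAIN QUERY PLAN " + sql, params)]
    return version, options, plan

# Hypothetical stand-in table; substitute the real database file and query.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (a TEXT, b TEXT)")
con.execute("CREATE INDEX t_a ON t (a)")

version, options, plan = describe(con, "SELECT b FROM t WHERE a = ?", ("x",))
print("SQLiteVer:", version)
print("options:", ", ".join(options))
for step in plan:
    print("plan:", step)
```

The version printed is that of the SQLite library the Python build is linked against, so the script has to be run under each of the two versions being compared.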
(3) By jose isaias cabrera (jicman) on 2022-11-23 02:53:29 in reply to 2 [link] [source]
Explain with SQLite version 3.39.4 2022-09-29 15:55:41
sqlite> explain
...> SELECT ProjID,
...> Updated_By,
...> InsertDate,
...> var,
...> oldv,
...> newv
...> FROM
...> (
...>
...> SELECT ProjID,
...> Updated_By,
...> InsertDate,
...> 'Finish_Date' as var,
...> (
...> SELECT
...> coalesce(Finish_Date,'') FROM Project_List
...> WHERE ProjID = o.ProjID
...> AND InsertDate < o.InsertDate
...> ORDER BY InsertDate DESC
...> LIMIT 1
...> ) AS oldv,
...> coalesce(Finish_Date,'') as newv
...> FROM Project_List as o
...> UNION
...>
...> SELECT ProjID,
...> Updated_By,
...> InsertDate,
...> 'Ann_CapexP' as var,
...> (
...> SELECT
...> replace(round(Ann_CapexP),'.0','') FROM Project_List WHERE
...> ProjID = o.ProjID
...> AND InsertDate < o.InsertDate
...> ORDER BY InsertDate DESC
...> LIMIT 1
...> ) AS oldv,
...> replace(round(Ann_CapexP),'.0','') as newv
...> FROM Project_List as o
...>
...> )
...> WHERE oldv <> newv
...> AND ProjID = 'PR0000020614'
...> ORDER BY InsertDate ASC;
addr opcode p1 p2 p3 p4 p5 comment
---- ------------- ---- ---- ---- ------------- -- -------------
0 Init 0 181 0 0 Start at 181
1 InitCoroutine 1 152 2 0 (subquery-4)
2 OpenEphemeral 5 6 0 k(6,B,B,B,B,B,B) 0 nColumn=6
3 OpenRead 3 164394 0 56 0 root=164394 iDb=0; Project_List
4 OpenRead 6 110918 0 k(3,,,) 2 root=110918 iDb=0; PL_ProjID_BL_Start
5 String8 0 2 0 PR0000020614 0 r[2]='PR0000020614'
6 SeekGE 6 72 2 1 0 key=r[2]
7 IdxGT 6 72 2 1 0 key=r[2]
8 DeferredSeek 6 0 3 0 Move 3 to 6.rowid if needed
9 BeginSubrtn 0 4 0 0 r[4]=NULL
10 Null 0 5 5 0 r[5..5]=NULL; Init subquery result
11 Noop 7 3 0 0
12 Integer 1 6 0 0 r[6]=1
13 Ne 8 15 7 67 if r[7]!=r[8] goto 15
14 ZeroOrNull 7 6 8 0 r[6] = 0 OR NULL
15 MustBeInt 6 0 0 0 LIMIT counter
16 IfNot 6 32 0 0
17 OpenRead 4 164394 0 56 0 root=164394 iDb=0; Project_List
18 OpenRead 8 207716 0 k(3,,,) 0 root=207716 iDb=0; PL_ProjID_InsertDate_New
19 String8 0 9 0 PR0000020614 0 r[9]='PR0000020614'
20 IsNull 9 32 0 0 if r[9]==NULL goto 32
21 Column 3 55 10 0 r[10]=Project_List.InsertDate
22 IsNull 10 32 0 0 if r[10]==NULL goto 32
23 SeekLT 8 32 9 2 0 key=r[9..10]
24 Null 0 10 0 0 r[10]=NULL
25 IdxLE 8 32 9 2 0 key=r[9..10]
26 DeferredSeek 8 0 4 0 Move 4 to 8.rowid if needed
27 Column 4 7 5 0 r[5]=Project_List.Finish_Date
28 NotNull 5 30 0 0 if r[5]!=NULL goto 30
29 String8 0 5 0 0 r[5]=''
30 DecrJumpZero 6 32 0 0 if (--r[6])==0 goto 32
31 Prev 8 25 0 0
32 Return 4 10 1 0
33 Column 3 7 3 0 r[3]=Project_List.Finish_Date
34 NotNull 3 36 0 0 if r[3]!=NULL goto 36
35 String8 0 3 0 0 r[3]=''
36 Eq 3 71 5 BINARY-8 80 if r[5]==r[3] goto 71
37 Column 6 0 11 0 r[11]=Project_List.ProjID
38 Column 3 25 12 0 r[12]=Project_List.Updated_By
39 Column 3 55 13 0 r[13]=Project_List.InsertDate
40 String8 0 14 0 Finish_Date 0 r[14]='Finish_Date'
41 BeginSubrtn 0 17 0 0 r[17]=NULL
42 Null 0 18 18 0 r[18..18]=NULL; Init subquery result
43 Noop 9 3 0 0
44 Integer 1 19 0 0 r[19]=1
45 Ne 8 47 7 67 if r[7]!=r[8] goto 47
46 ZeroOrNull 7 19 8 0 r[19] = 0 OR NULL
47 MustBeInt 19 0 0 0 LIMIT counter
48 IfNot 19 64 0 0
49 OpenRead 4 164394 0 56 0 root=164394 iDb=0; Project_List
50 OpenRead 10 207716 0 k(3,,,) 0 root=207716 iDb=0; PL_ProjID_InsertDate_New
51 Column 6 0 20 0 r[20]=Project_List.ProjID
52 IsNull 20 64 0 0 if r[20]==NULL goto 64
53 Column 3 55 21 0 r[21]=Project_List.InsertDate
54 IsNull 21 64 0 0 if r[21]==NULL goto 64
55 SeekLT 10 64 20 2 0 key=r[20..21]
56 Null 0 21 0 0 r[21]=NULL
57 IdxLE 10 64 20 2 0 key=r[20..21]
58 DeferredSeek 10 0 4 0 Move 4 to 10.rowid if needed
59 Column 4 7 18 0 r[18]=Project_List.Finish_Date
60 NotNull 18 62 0 0 if r[18]!=NULL goto 62
61 String8 0 18 0 0 r[18]=''
62 DecrJumpZero 19 64 0 0 if (--r[19])==0 goto 64
63 Prev 10 57 0 0
64 Return 17 42 1 0
65 SCopy 18 15 0 0 r[15]=r[18]
66 Column 3 7 16 0 r[16]=Project_List.Finish_Date
67 NotNull 16 69 0 0 if r[16]!=NULL goto 69
68 String8 0 16 0 0 r[16]=''
69 MakeRecord 11 6 22 0 r[22]=mkrec(r[11..16])
70 IdxInsert 5 22 11 6 0 key=r[22]
71 Next 6 7 0 0
72 OpenRead 1 164394 0 56 0 root=164394 iDb=0; Project_List
73 OpenRead 11 110918 0 k(3,,,) 2 root=110918 iDb=0; PL_ProjID_BL_Start
74 String8 0 23 0 PR0000020614 0 r[23]='PR0000020614'
75 SeekGE 11 141 23 1 0 key=r[23]
76 IdxGT 11 141 23 1 0 key=r[23]
77 DeferredSeek 11 0 1 0 Move 1 to 11.rowid if needed
78 BeginSubrtn 0 24 0 0 r[24]=NULL
79 Null 0 25 25 0 r[25..25]=NULL; Init subquery result
80 Noop 12 3 0 0
81 Integer 1 26 0 0 r[26]=1
82 Ne 8 84 7 67 if r[7]!=r[8] goto 84
83 ZeroOrNull 7 26 8 0 r[26] = 0 OR NULL
84 MustBeInt 26 0 0 0 LIMIT counter
85 IfNot 26 101 0 0
86 OpenRead 2 164394 0 56 0 root=164394 iDb=0; Project_List
87 OpenRead 13 207716 0 k(3,,,) 0 root=207716 iDb=0; PL_ProjID_InsertDate_New
88 String8 0 27 0 PR0000020614 0 r[27]='PR0000020614'
89 IsNull 27 101 0 0 if r[27]==NULL goto 101
90 Column 1 55 28 0 r[28]=Project_List.InsertDate
91 IsNull 28 101 0 0 if r[28]==NULL goto 101
92 SeekLT 13 101 27 2 0 key=r[27..28]
93 Null 0 28 0 0 r[28]=NULL
94 IdxLE 13 101 27 2 0 key=r[27..28]
95 DeferredSeek 13 0 2 0 Move 2 to 13.rowid if needed
96 Column 2 35 32 0 r[32]=Project_List.Ann_CapexP
97 Function 0 32 29 round(1) 0 r[29]=func(r[32])
98 Function 6 29 25 replace(3) 0 r[25]=func(r[29..31])
99 DecrJumpZero 26 101 0 0 if (--r[26])==0 goto 101
100 Prev 13 94 0 0
101 Return 24 79 1 0
102 Column 1 35 36 0 r[36]=Project_List.Ann_CapexP
103 Function 0 36 33 round(1) 0 r[33]=func(r[36])
104 Function 6 33 22 replace(3) 0 r[22]=func(r[33..35])
105 Eq 22 140 25 BINARY-8 80 if r[25]==r[22] goto 140
106 Column 11 0 11 0 r[11]=Project_List.ProjID
107 Column 1 25 12 0 r[12]=Project_List.Updated_By
108 Column 1 55 13 0 r[13]=Project_List.InsertDate
109 String8 0 14 0 Ann_CapexP 0 r[14]='Ann_CapexP'
110 BeginSubrtn 0 37 0 0 r[37]=NULL
111 Null 0 38 38 0 r[38..38]=NULL; Init subquery result
112 Noop 14 3 0 0
113 Integer 1 39 0 0 r[39]=1
114 Ne 8 116 7 67 if r[7]!=r[8] goto 116
115 ZeroOrNull 7 39 8 0 r[39] = 0 OR NULL
116 MustBeInt 39 0 0 0 LIMIT counter
117 IfNot 39 133 0 0
118 OpenRead 2 164394 0 56 0 root=164394 iDb=0; Project_List
119 OpenRead 15 207716 0 k(3,,,) 0 root=207716 iDb=0; PL_ProjID_InsertDate_New
120 Column 11 0 40 0 r[40]=Project_List.ProjID
121 IsNull 40 133 0 0 if r[40]==NULL goto 133
122 Column 1 55 41 0 r[41]=Project_List.InsertDate
123 IsNull 41 133 0 0 if r[41]==NULL goto 133
124 SeekLT 15 133 40 2 0 key=r[40..41]
125 Null 0 41 0 0 r[41]=NULL
126 IdxLE 15 133 40 2 0 key=r[40..41]
127 DeferredSeek 15 0 2 0 Move 2 to 15.rowid if needed
128 Column 2 35 22 0 r[22]=Project_List.Ann_CapexP
129 Function 0 22 42 round(1) 0 r[42]=func(r[22])
130 Function 6 42 38 replace(3) 0 r[38]=func(r[42..44])
131 DecrJumpZero 39 133 0 0 if (--r[39])==0 goto 133
132 Prev 15 126 0 0
133 Return 37 111 1 0
134 SCopy 38 15 0 0 r[15]=r[38]
135 Column 1 35 48 0 r[48]=Project_List.Ann_CapexP
136 Function 0 48 45 round(1) 0 r[45]=func(r[48])
137 Function 6 45 16 replace(3) 0 r[16]=func(r[45..47])
138 MakeRecord 11 6 48 0 r[48]=mkrec(r[11..16])
139 IdxInsert 5 48 11 6 0 key=r[48]
140 Next 11 76 0 0
141 Rewind 5 150 0 0
142 Column 5 0 49 0 r[49]=ProjID
143 Column 5 1 50 0 r[50]=Updated_By
144 Column 5 2 51 0 r[51]=InsertDate
145 Column 5 3 52 0 r[52]=var
146 Column 5 4 53 0 r[53]=oldv
147 Column 5 5 54 0 r[54]=newv
148 Yield 1 0 0 0
149 Next 5 142 0 0
150 Close 5 0 0 0
151 EndCoroutine 1 0 0 0
152 SorterOpen 16 8 0 k(1,B) 0
153 InitCoroutine 1 0 2 0
154 Yield 1 169 0 0 next row of (subquery-4)
155 Copy 53 55 0 2 r[55]=r[53]; (subquery-4).oldv
156 Copy 54 56 0 2 r[56]=r[54]; (subquery-4).newv
157 Eq 56 168 55 BINARY-8 80 if r[55]==r[56] goto 168
158 Copy 49 56 0 2 r[56]=r[49]; (subquery-4).ProjID
159 Ne 57 168 56 BINARY-8 81 if r[56]!=r[57] goto 168
160 Copy 49 59 0 2 r[59]=r[49]; (subquery-4).ProjID
161 Copy 50 60 0 2 r[60]=r[50]; (subquery-4).Updated_By
162 Copy 52 61 0 2 r[61]=r[52]; (subquery-4).var
163 Copy 53 62 0 2 r[62]=r[53]; (subquery-4).oldv
164 Copy 54 63 0 2 r[63]=r[54]; (subquery-4).newv
165 Copy 51 58 0 2 r[58]=r[51]; (subquery-4).InsertDate
166 MakeRecord 58 6 65 0 r[65]=mkrec(r[58..63])
167 SorterInsert 16 65 58 6 0 key=r[65]
168 Goto 0 154 0 0
169 OpenPseudo 17 66 8 0 8 columns in r[66]
170 SorterSort 16 180 0 0
171 SorterData 16 66 17 0 r[66]=data
172 Column 17 5 64 0 r[64]=newv
173 Column 17 4 63 0 r[63]=oldv
174 Column 17 3 62 0 r[62]=var
175 Column 17 0 61 0 r[61]=InsertDate
176 Column 17 2 60 0 r[60]=Updated_By
177 Column 17 1 59 0 r[59]=ProjID
178 ResultRow 59 6 0 0 output=r[59..64]
179 SorterNext 16 171 0 0
180 Halt 0 0 0 0
181 Transaction 0 0 1155 0 1 usesStmtJournal=0
182 Integer 1 7 0 0 r[7]=1
183 Integer 0 8 0 0 r[8]=0
184 String8 0 30 0 .0 0 r[30]='.0'
185 String8 0 31 0 0 r[31]=''
186 String8 0 34 0 .0 0 r[34]='.0'
187 String8 0 35 0 0 r[35]=''
188 String8 0 43 0 .0 0 r[43]='.0'
189 String8 0 44 0 0 r[44]=''
190 String8 0 46 0 .0 0 r[46]='.0'
191 String8 0 47 0 0 r[47]=''
192 String8 0 57 0 PR0000020614 0 r[57]='PR0000020614'
193 Goto 0 1 0 0
Run Time: real 0.316 user 0.031250 sys 0.078125
Explain with SQLite version 3.40.0 2022-11-16 12:10:08
sqlite> explain
...> SELECT ProjID,
...> Updated_By,
...> InsertDate,
...> var,
...> oldv,
...> newv
...> FROM
...> (
...>
...> SELECT ProjID,
...> Updated_By,
...> InsertDate,
...> 'Finish_Date' as var,
...> (
...> SELECT
...> coalesce(Finish_Date,'') FROM Project_List
...> WHERE ProjID = o.ProjID
...> AND InsertDate < o.InsertDate
...> ORDER BY InsertDate DESC
...> LIMIT 1
...> ) AS oldv,
...> coalesce(Finish_Date,'') as newv
...> FROM Project_List as o
...> UNION
...>
...> SELECT ProjID,
...> Updated_By,
...> InsertDate,
...> 'Ann_CapexP' as var,
...> (
...> SELECT
...> replace(round(Ann_CapexP),'.0','') FROM Project_List WHERE
...> ProjID = o.ProjID
...> AND InsertDate < o.InsertDate
...> ORDER BY InsertDate DESC
...> LIMIT 1
...> ) AS oldv,
...> replace(round(Ann_CapexP),'.0','') as newv
...> FROM Project_List as o
...>
...> )
...> WHERE oldv <> newv
...> AND ProjID = 'PR0000020614'
...> ORDER BY InsertDate ASC;
addr opcode p1 p2 p3 p4 p5 comment
---- ------------- ---- ---- ---- ------------- -- -------------
0 Init 0 117 0 0 Start at 117
1 InitCoroutine 1 88 2 0 (subquery-4)
2 OpenEphemeral 5 6 0 k(6,B,B,B,B,B,B) 0 nColumn=6
3 OpenRead 3 164394 0 56 0 root=164394 iDb=0; Project_List
4 Rewind 3 40 0 0
5 Column 3 0 2 0 r[2]= cursor 3 column 0
6 Column 3 25 3 0 r[3]= cursor 3 column 25
7 Column 3 55 4 0 r[4]= cursor 3 column 55
8 String8 0 5 0 Finish_Date 0 r[5]='Finish_Date'
9 BeginSubrtn 0 8 0 0 r[8]=NULL
10 Null 0 9 9 0 r[9..9]=NULL; Init subquery result
11 Noop 6 3 0 0
12 Integer 1 10 0 0 r[10]=1
13 Ne 12 15 11 67 if r[11]!=r[12] goto 15
14 ZeroOrNull 11 10 12 0 r[10] = 0 OR NULL
15 MustBeInt 10 0 0 0 LIMIT counter
16 IfNot 10 32 0 0
17 OpenRead 4 164394 0 56 0 root=164394 iDb=0; Project_List
18 OpenRead 7 207716 0 k(3,,,) 0 root=207716 iDb=0; PL_ProjID_InsertDate_New
19 Column 3 0 13 0 r[13]= cursor 3 column 0
20 IsNull 13 32 0 0 if r[13]==NULL goto 32
21 Column 3 55 14 0 r[14]= cursor 3 column 55
22 IsNull 14 32 0 0 if r[14]==NULL goto 32
23 SeekLT 7 32 13 2 0 key=r[13..14]
24 Null 0 14 0 0 r[14]=NULL
25 IdxLE 7 32 13 2 0 key=r[13..14]
26 DeferredSeek 7 0 4 0 Move 4 to 7.rowid if needed
27 Column 4 7 9 0 r[9]= cursor 4 column 7
28 NotNull 9 30 0 0 if r[9]!=NULL goto 30
29 String8 0 9 0 0 r[9]=''
30 DecrJumpZero 10 32 0 0 if (--r[10])==0 goto 32
31 Prev 7 25 0 0
32 Return 8 10 1 0
33 SCopy 9 6 0 0 r[6]=r[9]
34 Column 3 7 7 0 r[7]= cursor 3 column 7
35 NotNull 7 37 0 0 if r[7]!=NULL goto 37
36 String8 0 7 0 0 r[7]=''
37 MakeRecord 2 6 15 0 r[15]=mkrec(r[2..7])
38 IdxInsert 5 15 2 6 0 key=r[15]
39 Next 3 5 0 1
40 OpenRead 1 164394 0 56 0 root=164394 iDb=0; Project_List
41 Rewind 1 77 0 0
42 Column 1 0 2 0 r[2]= cursor 1 column 0
43 Column 1 25 3 0 r[3]= cursor 1 column 25
44 Column 1 55 4 0 r[4]= cursor 1 column 55
45 String8 0 5 0 Ann_CapexP 0 r[5]='Ann_CapexP'
46 BeginSubrtn 0 16 0 0 r[16]=NULL
47 Null 0 17 17 0 r[17..17]=NULL; Init subquery result
48 Noop 8 3 0 0
49 Integer 1 18 0 0 r[18]=1
50 Ne 12 52 11 67 if r[11]!=r[12] goto 52
51 ZeroOrNull 11 18 12 0 r[18] = 0 OR NULL
52 MustBeInt 18 0 0 0 LIMIT counter
53 IfNot 18 69 0 0
54 OpenRead 2 164394 0 56 0 root=164394 iDb=0; Project_List
55 OpenRead 9 207716 0 k(3,,,) 0 root=207716 iDb=0; PL_ProjID_InsertDate_New
56 Column 1 0 19 0 r[19]= cursor 1 column 0
57 IsNull 19 69 0 0 if r[19]==NULL goto 69
58 Column 1 55 20 0 r[20]= cursor 1 column 55
59 IsNull 20 69 0 0 if r[20]==NULL goto 69
60 SeekLT 9 69 19 2 0 key=r[19..20]
61 Null 0 20 0 0 r[20]=NULL
62 IdxLE 9 69 19 2 0 key=r[19..20]
63 DeferredSeek 9 0 2 0 Move 2 to 9.rowid if needed
64 Column 2 35 15 0 r[15]= cursor 2 column 35
65 Function 0 15 21 round(1) 0 r[21]=func(r[15])
66 Function 6 21 17 replace(3) 0 r[17]=func(r[21..23])
67 DecrJumpZero 18 69 0 0 if (--r[18])==0 goto 69
68 Prev 9 62 0 0
69 Return 16 47 1 0
70 SCopy 17 6 0 0 r[6]=r[17]
71 Column 1 35 27 0 r[27]= cursor 1 column 35
72 Function 0 27 24 round(1) 0 r[24]=func(r[27])
73 Function 6 24 7 replace(3) 0 r[7]=func(r[24..26])
74 MakeRecord 2 6 27 0 r[27]=mkrec(r[2..7])
75 IdxInsert 5 27 2 6 0 key=r[27]
76 Next 1 42 0 1
77 Rewind 5 86 0 0
78 Column 5 0 28 0 r[28]=ProjID
79 Column 5 1 29 0 r[29]=Updated_By
80 Column 5 2 30 0 r[30]=InsertDate
81 Column 5 3 31 0 r[31]=var
82 Column 5 4 32 0 r[32]=oldv
83 Column 5 5 33 0 r[33]=newv
84 Yield 1 0 0 0
85 Next 5 78 0 0
86 Close 5 0 0 0
87 EndCoroutine 1 0 0 0
88 SorterOpen 10 8 0 k(1,B) 0
89 InitCoroutine 1 0 2 0
90 Yield 1 105 0 0 next row of (subquery-4)
91 Copy 32 34 0 2 r[34]=r[32]
92 Copy 33 35 0 2 r[35]=r[33]
93 Eq 35 104 34 BINARY-8 80 if r[34]==r[35] goto 104
94 Copy 28 35 0 2 r[35]=r[28]
95 Ne 36 104 35 BINARY-8 81 if r[35]!=r[36] goto 104
96 Copy 28 38 0 2 r[38]=r[28]
97 Copy 29 39 0 2 r[39]=r[29]
98 Copy 31 40 0 2 r[40]=r[31]
99 Copy 32 41 0 2 r[41]=r[32]
100 Copy 33 42 0 2 r[42]=r[33]
101 Copy 30 37 0 2 r[37]=r[30]
102 MakeRecord 37 6 44 0 r[44]=mkrec(r[37..42])
103 SorterInsert 10 44 37 6 0 key=r[44]
104 Goto 0 90 0 0
105 OpenPseudo 11 45 8 0 8 columns in r[45]
106 SorterSort 10 116 0 0
107 SorterData 10 45 11 0 r[45]=data
108 Column 11 5 43 0 r[43]=newv
109 Column 11 4 42 0 r[42]=oldv
110 Column 11 3 41 0 r[41]=var
111 Column 11 0 40 0 r[40]=InsertDate
112 Column 11 2 39 0 r[39]=Updated_By
113 Column 11 1 38 0 r[38]=ProjID
114 ResultRow 38 6 0 0 output=r[38..43]
115 SorterNext 10 107 0 0
116 Halt 0 0 0 0
117 Transaction 0 0 1155 0 1 usesStmtJournal=0
118 Integer 1 11 0 0 r[11]=1
119 Integer 0 12 0 0 r[12]=0
120 String8 0 22 0 .0 0 r[22]='.0'
121 String8 0 23 0 0 r[23]=''
122 String8 0 25 0 .0 0 r[25]='.0'
123 String8 0 26 0 0 r[26]=''
124 String8 0 36 0 PR0000020614 0 r[36]='PR0000020614'
125 Goto 0 1 0 0
Run Time: real 0.197 user 0.031250 sys 0.078125
Thanks for the support.
(4) By jose isaias cabrera (jicman) on 2022-11-23 03:02:18 in reply to 2 [link] [source]
Explain query plan with SQLite version 3.39.4 2022-09-29 15:55:41
sqlite> explain query plan
...> SELECT ProjID,
...> Updated_By,
...> InsertDate,
...> var,
...> oldv,
...> newv
...> FROM
...> (
...>
...> SELECT ProjID,
...> Updated_By,
...> InsertDate,
...> 'Finish_Date' as var,
...> (
...> SELECT
...> coalesce(Finish_Date,'') FROM Project_List
...> WHERE ProjID = o.ProjID
...> AND InsertDate < o.InsertDate
...> ORDER BY InsertDate DESC
...> LIMIT 1
...> ) AS oldv,
...> coalesce(Finish_Date,'') as newv
...> FROM Project_List as o
...> UNION
...>
...> SELECT ProjID,
...> Updated_By,
...> InsertDate,
...> 'Ann_CapexP' as var,
...> (
...> SELECT
...> replace(round(Ann_CapexP),'.0','') FROM Project_List WHERE
...> ProjID = o.ProjID
...> AND InsertDate < o.InsertDate
...> ORDER BY InsertDate DESC
...> LIMIT 1
...> ) AS oldv,
...> replace(round(Ann_CapexP),'.0','') as newv
...> FROM Project_List as o
...>
...> )
...> WHERE oldv <> newv
...> AND ProjID = 'PR0000020614'
...> ORDER BY InsertDate ASC;
QUERY PLAN
|--CO-ROUTINE (subquery-4)
| `--COMPOUND QUERY
| |--LEFT-MOST SUBQUERY
| | |--SEARCH o USING INDEX PL_ProjID_BL_Start (ProjID=?)
| | |--CORRELATED SCALAR SUBQUERY 1
| | | `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
| | `--CORRELATED SCALAR SUBQUERY 1
| | `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
| `--UNION USING TEMP B-TREE
| |--SEARCH o USING INDEX PL_ProjID_BL_Start (ProjID=?)
| |--CORRELATED SCALAR SUBQUERY 3
| | `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
| `--CORRELATED SCALAR SUBQUERY 3
| `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
|--SCAN (subquery-4)
`--USE TEMP B-TREE FOR ORDER BY
Run Time: real 0.012 user 0.000000 sys 0.000000
sqlite>
Explain query plan with SQLite version 3.40.0 2022-11-16 12:10:08
sqlite> explain query plan
...> SELECT ProjID,
...> Updated_By,
...> InsertDate,
...> var,
...> oldv,
...> newv
...> FROM
...> (
...>
...> SELECT ProjID,
...> Updated_By,
...> InsertDate,
...> 'Finish_Date' as var,
...> (
...> SELECT
...> coalesce(Finish_Date,'') FROM Project_List
...> WHERE ProjID = o.ProjID
...> AND InsertDate < o.InsertDate
...> ORDER BY InsertDate DESC
...> LIMIT 1
...> ) AS oldv,
...> coalesce(Finish_Date,'') as newv
...> FROM Project_List as o
...> UNION
...>
...> SELECT ProjID,
...> Updated_By,
...> InsertDate,
...> 'Ann_CapexP' as var,
...> (
...> SELECT
...> replace(round(Ann_CapexP),'.0','') FROM Project_List WHERE
...> ProjID = o.ProjID
...> AND InsertDate < o.InsertDate
...> ORDER BY InsertDate DESC
...> LIMIT 1
...> ) AS oldv,
...> replace(round(Ann_CapexP),'.0','') as newv
...> FROM Project_List as o
...>
...> )
...> WHERE oldv <> newv
...> AND ProjID = 'PR0000020614'
...> ORDER BY InsertDate ASC;
QUERY PLAN
|--CO-ROUTINE (subquery-4)
| `--COMPOUND QUERY
| |--LEFT-MOST SUBQUERY
| | |--SCAN o
| | `--CORRELATED SCALAR SUBQUERY 1
| | `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
| `--UNION USING TEMP B-TREE
| |--SCAN o
| `--CORRELATED SCALAR SUBQUERY 3
| `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
|--SCAN (subquery-4)
`--USE TEMP B-TREE FOR ORDER BY
Run Time: real 0.009 user 0.000000 sys 0.000000
sqlite>
Thanks for the support.
(5) By anonymous on 2022-11-23 03:54:24 in reply to 4 [link] [source]
Not an expert, yet obviously, 3.39.4 is using indexes (SEARCH o USING INDEX PL_ProjID_BL_Start (ProjID=?)) that 3.40 is not (scan o).
Assume you have read https://www.sqlite.org/queryplanner-ng.html#howtofix
IMO:
You might try inserting INDEXED BY clauses on table o sub-queries as in https://www.sqlite.org/lang_indexedby.html
As per the first doc, if that works, then figure out how to write the query without the INDEXED BY clauses. In other words, using it as a debug tool isn't bad if you use it to find out why the query planner is getting confused (and scanning instead).
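For reference, this is roughly what an INDEXED BY clause looks like on a single branch. It is a sketch only: the table below is a hypothetical cut-down Project_List (the real one has many more columns), and the WHERE ProjID = ? term is added here so that the named index has something to work with:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical cut-down schema; only the index name is taken from the thread.
con.executescript("""
CREATE TABLE Project_List (ProjID TEXT, Updated_By TEXT, InsertDate TEXT,
                           Finish_Date TEXT);
CREATE INDEX PL_ProjID_InsertDate_New ON Project_List (ProjID, InsertDate);
""")

sql = """
SELECT ProjID, Updated_By, InsertDate, coalesce(Finish_Date,'') AS newv
  FROM Project_List AS o INDEXED BY PL_ProjID_InsertDate_New
 WHERE ProjID = ?
"""
plan = [row[3] for row in
        con.execute("EXPLAIN QUERY PLAN " + sql, ("PR0000020614",))]
for step in plan:
    print(step)

# Naming an index that does not exist makes statement preparation fail,
# which is why INDEXED BY is better suited to debugging than to tuning.
try:
    con.execute("SELECT * FROM Project_List INDEXED BY no_such_index")
    failed = False
except sqlite3.OperationalError as exc:
    failed = True
    print("error:", exc)
```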
(6) By anonymous on 2022-11-23 04:44:31 in reply to 4 [link] [source]
You might also experiment with OFFSET 0 (added to the LIMIT 1) to avoid having the query flattened and see if that helps.
Also check out the doc https://www.sqlite.org/optoverview.html, particularly section 11. An OFFSET 0 (even when only in the sub-query) will avoid flattening, whereas a LIMIT must be in both the inner and outer queries to avoid flattening.
If adding OFFSET 0 to those LIMIT 1 clauses helps then I suspect the slowdown has to do with query flattening and then the query planner choosing a scan rather than the previous behaviour.
It will be interesting to see what you come up with.
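The OFFSET 0 experiment can be tried on a small stand-in table before touching the real database. The sketch below assumes a hypothetical three-column Project_List with made-up rows; it only checks that adding OFFSET 0 leaves the result rows unchanged and prints both plans, since whether the plan differs on a given SQLite version is exactly what the experiment is meant to find out:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical cut-down table with invented sample rows.
con.executescript("""
CREATE TABLE Project_List (ProjID TEXT, InsertDate TEXT, Finish_Date TEXT);
CREATE INDEX PL_ProjID_InsertDate_New ON Project_List (ProjID, InsertDate);
INSERT INTO Project_List VALUES
  ('PR0000020614', '2022-01-01', NULL),
  ('PR0000020614', '2022-02-01', '2022-12-31'),
  ('PR0000020614', '2022-03-01', '2022-12-31'),
  ('PR0000020614', '2022-04-01', '2023-06-30');
""")

TEMPLATE = """
SELECT InsertDate,
       (SELECT coalesce(Finish_Date,'') FROM Project_List
         WHERE ProjID = o.ProjID AND InsertDate < o.InsertDate
         ORDER BY InsertDate DESC
         LIMIT 1 {offset}) AS oldv,
       coalesce(Finish_Date,'') AS newv
  FROM Project_List AS o
 WHERE ProjID = 'PR0000020614'
 ORDER BY InsertDate
"""
plain = TEMPLATE.format(offset="")
with_offset = TEMPLATE.format(offset="OFFSET 0")

rows_plain = con.execute(plain).fetchall()
rows_offset = con.execute(with_offset).fetchall()

for label, sql in (("LIMIT 1", plain), ("LIMIT 1 OFFSET 0", with_offset)):
    print(label)
    for row in con.execute("EXPLAIN QUERY PLAN " + sql):
        print("   ", row[3])
```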
(7) By anonymous on 2022-11-23 05:03:43 in reply to 4 [link] [source]
One last item (as if you didn't have enough to check already!):
Refer to https://www.sqlite.org/changes.html, particularly the query planner changes in change log items 3-d and 3-e.
My hunch, presuming analyze was run properly, is that it's within those changes.
It's just a guess though.. so I'm interested what you find.
(9) By jose isaias cabrera (jicman) on 2022-11-23 13:40:35 in reply to 7 [link] [source]
Thanks for all the explanations, but what I expect from a library or software is that if something is correctly written, syntax-wise, and it was working well in a previous version, the next version should perform very close to the same. I would even expect, as in this situation, to lose a few hundredths of a second because of a fix, but to go from 0.08 seconds to 379.41 seconds is a huge difference. I will keep using 3.39.4 until a new version gives me close to what I have now. Thanks for your input.
(10) By Richard Hipp (drh) on 2022-11-23 13:51:54 in reply to 9 [link] [source]
It is unlikely to be "fixed" unless you provide us with a test case. A suitable test case might be either of these:
A database file together with a query that runs significantly slower in 3.40.
An SQL script that first constructs a database from scratch, and then runs a query that is significantly slower in 3.40.
The second option is preferred.
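A from-scratch test case along those lines could be generated with synthetic data when the real database cannot be shared. The following is a sketch under stated assumptions: the five-column Project_List, the build helper, and the generated values are all invented for illustration, and only the query shape and the index name come from the first post:

```python
import sqlite3
import time

def build(con, projects=200, revisions=5):
    """Fill a hypothetical, cut-down Project_List with synthetic revisions."""
    con.executescript("""
        CREATE TABLE Project_List (ProjID TEXT, Updated_By TEXT, InsertDate TEXT,
                                   Finish_Date TEXT, Ann_CapexP REAL);
        CREATE INDEX PL_ProjID_InsertDate_New
            ON Project_List (ProjID, InsertDate);
    """)
    rows = [("PR%010d" % p,                  # project id
             "user%d" % (r % 3),             # who made the revision
             "2022-01-%02d" % (r + 1),       # one revision per day
             "2022-12-%02d" % (r // 2 + 1),  # changes every second revision
             1000.0 * (r // 2))              # changes every second revision
            for p in range(projects) for r in range(revisions)]
    con.executemany("INSERT INTO Project_List VALUES (?,?,?,?,?)", rows)
    con.execute("ANALYZE")
    con.commit()

# Same shape as the query in the first post, with the project id as a parameter.
QUERY = """
SELECT ProjID, Updated_By, InsertDate, var, oldv, newv
  FROM (
        SELECT ProjID, Updated_By, InsertDate, 'Finish_Date' AS var,
               (SELECT coalesce(Finish_Date,'') FROM Project_List
                 WHERE ProjID = o.ProjID AND InsertDate < o.InsertDate
                 ORDER BY InsertDate DESC LIMIT 1) AS oldv,
               coalesce(Finish_Date,'') AS newv
          FROM Project_List AS o
        UNION
        SELECT ProjID, Updated_By, InsertDate, 'Ann_CapexP' AS var,
               (SELECT replace(round(Ann_CapexP),'.0','') FROM Project_List
                 WHERE ProjID = o.ProjID AND InsertDate < o.InsertDate
                 ORDER BY InsertDate DESC LIMIT 1) AS oldv,
               replace(round(Ann_CapexP),'.0','') AS newv
          FROM Project_List AS o
       )
 WHERE oldv <> newv
   AND ProjID = ?
 ORDER BY InsertDate ASC
"""

con = sqlite3.connect(":memory:")
build(con)
target = "PR%010d" % 7
start = time.perf_counter()
rows = con.execute(QUERY, (target,)).fetchall()
elapsed = time.perf_counter() - start
print(sqlite3.sqlite_version, len(rows), "rows in %.3f secs" % elapsed)
```

Raising projects and revisions is the obvious knob if a small table does not reproduce the slowdown, and con.iterdump() yields the SQL statements that rebuild the database, which can be saved as a script.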
(11) By jose isaias cabrera (jicman) on 2022-11-23 16:05:09 in reply to 10 [link] [source]
I tried creating a script, but, it's not working. Probably because the . Also, there are a lot of records, and I was only trying to use 100 records, and that may not be sufficient to cause the slow down for the query. Option one is the choice that I can provide. Where can I place the DB? It's 827MB. Thanks.
josé
(14) By anonymous on 2022-11-23 18:16:52 in reply to 11 [link] [source]
While I well understand your frustration, following the advice of Dr Hipp (and the other well known forum members) is recommended. That extra effort will not only increase the chance of others putting in effort, but it is also one of the ways open source can actually work.
For your own purposes, you will want the ability to generate a db from scratch, preferably with test data. If you do not already have that, now (or yesterday) is a great time to set it up.
Keith noted that the outer WHERE was not being pushed down into the branches of the UNION (I presume that is what previously used an index and now causes a scan in 3.40). You might look at placing one there as a debugging effort.
In any event, Dr Hipp and Keith are the best of the best, so my advice is to follow their advice.
(16) By jose isaias cabrera (jicman) on 2022-11-23 19:20:23 in reply to 14 [link] [source]
Thanks for the understanding. :-)
I have been using SQLite since 2006, and I had never had a regression. Actually, I've seen queries respond faster. But when I ran this report and it just hung, I thought something was really wrong. :-) I was not frustrated. I was just trying to get to the bottom of it. I even downsized the DB (from 847MB to 243) to provide a sample DB that has the problem to Dr. Hipp and the developers. So, if the developers are still interested, I can provide the DB with the scripts that will show the problem right away.
Right now, I am happy with the UNION ALL change. I had to compile a new version of the front-end program to replace UNION with UNION ALL, but that is a piece of cake. :-) Thanks for hanging in there with your support. :-)
josé
(8.1) By Keith Medcalf (kmedcalf) on 2022-11-23 07:50:43 edited from 8.0 in reply to 4 [link] [source]
The later version is not pushing the outer WHERE clause into the branches of the UNION.
(12) By Keith Medcalf (kmedcalf) on 2022-11-23 18:10:15 in reply to 1 [link] [source]
Try this one, which pushes the WHERE clause down manually:
SELECT ProjID,
Updated_By,
InsertDate,
var,
oldv,
newv
FROM (
SELECT ProjID,
Updated_By,
InsertDate,
'Finish_Date' as var,
(
SELECT coalesce(Finish_Date,'')
FROM Project_List
WHERE ProjID = o.ProjID
AND InsertDate < o.InsertDate
ORDER BY InsertDate DESC
LIMIT 1
) AS oldv,
coalesce(Finish_Date,'') as newv
FROM Project_List as o
WHERE ProjID = 'PR0000020614'
UNION
SELECT ProjID,
Updated_By,
InsertDate,
'Ann_CapexP' as var,
(
SELECT replace(round(Ann_CapexP),'.0','')
FROM Project_List
WHERE ProjID = o.ProjID
AND InsertDate < o.InsertDate
ORDER BY InsertDate DESC
LIMIT 1
) AS oldv,
replace(round(Ann_CapexP),'.0','') as newv
FROM Project_List as o
WHERE ProjID = 'PR0000020614'
)
WHERE oldv <> newv
ORDER BY InsertDate ASC;
You could also push down the WHERE oldv <> newv into each branch as AND oldv <> newv.
See what happens.
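To check that pushing both terms into the branches does not change the answer, the two forms can be compared on a small stand-in table. This is a sketch only: the five-column Project_List, its sample rows, and the branch helper are assumptions made up for the comparison:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical cut-down table with invented sample rows.
con.executescript("""
CREATE TABLE Project_List (ProjID TEXT, Updated_By TEXT, InsertDate TEXT,
                           Finish_Date TEXT, Ann_CapexP REAL);
CREATE INDEX PL_ProjID_InsertDate_New ON Project_List (ProjID, InsertDate);
INSERT INTO Project_List VALUES
  ('PR0000020614', 'ann', '2022-01-01', NULL,         1000.0),
  ('PR0000020614', 'bob', '2022-02-01', '2022-12-31', 1000.0),
  ('PR0000020614', 'ann', '2022-03-01', '2022-12-31', 2500.0),
  ('PR0000000001', 'cat', '2022-01-15', '2022-06-30', 10.0),
  ('PR0000000001', 'cat', '2022-02-15', '2022-07-31', 10.0);
""")

def branch(var, expr, where=""):
    """One UNION branch: previous and current value of expr per revision."""
    return f"""
    SELECT ProjID, Updated_By, InsertDate, '{var}' AS var,
           (SELECT {expr} FROM Project_List
             WHERE ProjID = o.ProjID AND InsertDate < o.InsertDate
             ORDER BY InsertDate DESC LIMIT 1) AS oldv,
           {expr} AS newv
      FROM Project_List AS o {where}"""

FIN = "coalesce(Finish_Date,'')"
CAP = "replace(round(Ann_CapexP),'.0','')"
PUSHED = "WHERE ProjID = 'PR0000020614' AND oldv <> newv"

# Filter applied outside the UNION, as in the first post.
outer_filter = f"""SELECT * FROM ({branch('Finish_Date', FIN)}
    UNION {branch('Ann_CapexP', CAP)})
 WHERE oldv <> newv AND ProjID = 'PR0000020614' ORDER BY InsertDate, var"""
# Both terms pushed into each branch by hand.
pushed_down = f"""SELECT * FROM ({branch('Finish_Date', FIN, PUSHED)}
    UNION {branch('Ann_CapexP', CAP, PUSHED)})
 ORDER BY InsertDate, var"""

rows_outer = con.execute(outer_filter).fetchall()
rows_pushed = con.execute(pushed_down).fetchall()
for label, sql in (("outer WHERE", outer_filter), ("pushed down", pushed_down)):
    print(label)
    for row in con.execute("EXPLAIN QUERY PLAN " + sql):
        print("   ", row[3])
```

The first revision of a project has no earlier row, so its oldv is NULL and the oldv <> newv test drops it in both forms.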
(13) By Keith Medcalf (kmedcalf) on 2022-11-23 18:15:15 in reply to 12 [link] [source]
Or this:
SELECT ProjID,
Updated_By,
InsertDate,
var,
oldv,
newv
FROM (
(SELECT * FROM (
SELECT ProjID,
Updated_By,
InsertDate,
'Finish_Date' as var,
(
SELECT coalesce(Finish_Date,'')
FROM Project_List
WHERE ProjID = o.ProjID
AND InsertDate < o.InsertDate
ORDER BY InsertDate DESC
LIMIT 1
) AS oldv,
coalesce(Finish_Date,'') as newv
FROM Project_List as o
WHERE ProjID = 'PR0000020614'
)
WHERE newv <> oldv
)
UNION ALL
(SELECT * FROM (
SELECT ProjID,
Updated_By,
InsertDate,
'Ann_CapexP' as var,
(
SELECT replace(round(Ann_CapexP),'.0','')
FROM Project_List
WHERE ProjID = o.ProjID
AND InsertDate < o.InsertDate
ORDER BY InsertDate DESC
LIMIT 1
) AS oldv,
replace(round(Ann_CapexP),'.0','') as newv
FROM Project_List as o
WHERE ProjID = 'PR0000020614'
)
WHERE newv <> oldv
)
)
ORDER BY InsertDate ASC;
(17) By anonymous on 2022-11-24 00:40:34 in reply to 13 [link] [source]
Keith, is the reason that the union all (vs union) does the trick in this case due to transient indices not being used on UNION ALL? ( section 2.8 of https://www.sqlite.org/tempfiles.html#transient_indices ).
If it's not too much to explain, could you describe how your UNION ALL version accomplishes the pushing out of the WHERE, and if/why is that solution preferable?
Also, any guidance on when to reach for such a solution would be educational as well.
thanks!
(18) By Keith Medcalf (kmedcalf) on 2022-11-24 01:20:09 in reply to 17 [link] [source]
A UNION B merges A and B returning only DISTINCT rows.
A UNION ALL B merges A and B returning all rows.
UNION uses a B-Tree so that it can tell if the row is a duplicate.
UNION ALL does not.
I don't know why (A UNION B) acts as an optimization barrier while (A UNION ALL B) does not.
Richard may be able to answer that one.
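The difference in results is easy to see on two tiny tables. A minimal sketch, with tables A and B and their contents invented for the illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE A (x INTEGER);
CREATE TABLE B (x INTEGER);
INSERT INTO A VALUES (1), (2), (2);
INSERT INTO B VALUES (2), (3);
""")

union_sql = "SELECT x FROM A UNION SELECT x FROM B"
union_all_sql = "SELECT x FROM A UNION ALL SELECT x FROM B"

union_rows = sorted(con.execute(union_sql).fetchall())
union_all_rows = sorted(con.execute(union_all_sql).fetchall())

print(union_rows)      # [(1,), (2,), (3,)]
print(union_all_rows)  # [(1,), (2,), (2,), (2,), (3,)]

# The plan for UNION should mention the temporary B-tree used to detect
# duplicates; the exact wording varies between SQLite versions.
for label, sql in (("UNION", union_sql), ("UNION ALL", union_all_sql)):
    print(label)
    for row in con.execute("EXPLAIN QUERY PLAN " + sql):
        print("   ", row[3])
```

Note that UNION also removes the duplicate 2 that comes from A alone, so switching a query to UNION ALL gives the same rows only when no branch produces duplicates on its own and the branches do not overlap.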
(19) By jose isaias cabrera (jicman) on 2022-11-24 02:53:52 in reply to 18 [link] [source]
This is the explain for the UNION ALL on 3.40.0
addr opcode p1 p2 p3 p4 p5 comment
---- ------------- ---- ---- ---- ------------- -- -------------
0 Init 0 173 0 0 Start at 173
1 InitCoroutine 1 73 2 0 left SELECT
2 Noop 5 8 0 0
3 OpenRead 3 164394 0 56 0 root=164394 iDb=0; Project_List
4 OpenRead 6 207716 0 k(3,,,) 2 root=207716 iDb=0; PL_ProjID_InsertDate_New
5 String8 0 5 0 PR0000020614 0 r[5]='PR0000020614'
6 SeekGE 6 72 5 1 0 key=r[5]
7 IdxGT 6 72 5 1 0 key=r[5]
8 DeferredSeek 6 0 3 0 Move 3 to 6.rowid if needed
9 BeginSubrtn 0 7 0 0 r[7]=NULL
10 Null 0 8 8 0 r[8..8]=NULL; Init subquery result
11 Noop 7 3 0 0
12 Integer 1 9 0 0 r[9]=1
13 Ne 11 15 10 67 if r[10]!=r[11] goto 15
14 ZeroOrNull 10 9 11 0 r[9] = 0 OR NULL
15 MustBeInt 9 0 0 0 LIMIT counter
16 IfNot 9 32 0 0
17 OpenRead 4 164394 0 56 0 root=164394 iDb=0; Project_List
18 OpenRead 8 207716 0 k(3,,,) 0 root=207716 iDb=0; PL_ProjID_InsertDate_New
19 String8 0 12 0 PR0000020614 0 r[12]='PR0000020614'
20 IsNull 12 32 0 0 if r[12]==NULL goto 32
21 Column 6 1 13 0 r[13]= cursor 6 column 1
22 IsNull 13 32 0 0 if r[13]==NULL goto 32
23 SeekLT 8 32 12 2 0 key=r[12..13]
24 Null 0 13 0 0 r[13]=NULL
25 IdxLE 8 32 12 2 0 key=r[12..13]
26 DeferredSeek 8 0 4 0 Move 4 to 8.rowid if needed
27 Column 4 7 8 0 r[8]= cursor 4 column 7
28 NotNull 8 30 0 0 if r[8]!=NULL goto 30
29 String8 0 8 0 0 r[8]=''
30 DecrJumpZero 9 32 0 0 if (--r[9])==0 goto 32
31 Prev 8 25 0 0
32 Return 7 10 1 0
33 Column 3 7 6 0 r[6]= cursor 3 column 7
34 NotNull 6 36 0 0 if r[6]!=NULL goto 36
35 String8 0 6 0 0 r[6]=''
36 Eq 6 71 8 BINARY-8 80 if r[8]==r[6] goto 71
37 Column 6 0 14 0 r[14]= cursor 6 column 0
38 Column 3 25 15 0 r[15]= cursor 3 column 25
39 Column 6 1 16 0 r[16]= cursor 6 column 1
40 String8 0 17 0 Finish_Date 0 r[17]='Finish_Date'
41 BeginSubrtn 0 20 0 0 r[20]=NULL
42 Null 0 21 21 0 r[21..21]=NULL; Init subquery result
43 Noop 9 3 0 0
44 Integer 1 22 0 0 r[22]=1
45 Ne 11 47 10 67 if r[10]!=r[11] goto 47
46 ZeroOrNull 10 22 11 0 r[22] = 0 OR NULL
47 MustBeInt 22 0 0 0 LIMIT counter
48 IfNot 22 64 0 0
49 OpenRead 4 164394 0 56 0 root=164394 iDb=0; Project_List
50 OpenRead 10 207716 0 k(3,,,) 0 root=207716 iDb=0; PL_ProjID_InsertDate_New
51 Column 6 0 23 0 r[23]= cursor 6 column 0
52 IsNull 23 64 0 0 if r[23]==NULL goto 64
53 Column 6 1 24 0 r[24]= cursor 6 column 1
54 IsNull 24 64 0 0 if r[24]==NULL goto 64
55 SeekLT 10 64 23 2 0 key=r[23..24]
56 Null 0 24 0 0 r[24]=NULL
57 IdxLE 10 64 23 2 0 key=r[23..24]
58 DeferredSeek 10 0 4 0 Move 4 to 10.rowid if needed
59 Column 4 7 21 0 r[21]= cursor 4 column 7
60 NotNull 21 62 0 0 if r[21]!=NULL goto 62
61 String8 0 21 0 0 r[21]=''
62 DecrJumpZero 22 64 0 0 if (--r[22])==0 goto 64
63 Prev 10 57 0 0
64 Return 20 42 1 0
65 Copy 21 18 0 0 r[18]=r[21]
66 Column 3 7 19 0 r[19]= cursor 3 column 7
67 NotNull 19 69 0 0 if r[19]!=NULL goto 69
68 String8 0 19 0 0 r[19]=''
69 ClrSubtype 19 0 0 0 r[19].subtype = 0
70 Yield 1 0 0 0
71 Next 6 7 0 0
72 EndCoroutine 1 0 0 0
73 InitCoroutine 2 167 74 0 right SELECT
74 Noop 11 8 0 0
75 OpenRead 1 164394 0 56 0 root=164394 iDb=0; Project_List
76 OpenRead 12 207716 0 k(3,,,) 2 root=207716 iDb=0; PL_ProjID_InsertDate_New
77 String8 0 25 0 PR0000020614 0 r[25]='PR0000020614'
78 SeekGE 12 144 25 1 0 key=r[25]
79 IdxGT 12 144 25 1 0 key=r[25]
80 DeferredSeek 12 0 1 0 Move 1 to 12.rowid if needed
81 BeginSubrtn 0 27 0 0 r[27]=NULL
82 Null 0 28 28 0 r[28..28]=NULL; Init subquery result
83 Noop 13 3 0 0
84 Integer 1 29 0 0 r[29]=1
85 Ne 11 87 10 67 if r[10]!=r[11] goto 87
86 ZeroOrNull 10 29 11 0 r[29] = 0 OR NULL
87 MustBeInt 29 0 0 0 LIMIT counter
88 IfNot 29 104 0 0
89 OpenRead 2 164394 0 56 0 root=164394 iDb=0; Project_List
90 OpenRead 14 207716 0 k(3,,,) 0 root=207716 iDb=0; PL_ProjID_InsertDate_New
91 String8 0 30 0 PR0000020614 0 r[30]='PR0000020614'
92 IsNull 30 104 0 0 if r[30]==NULL goto 104
93 Column 12 1 31 0 r[31]= cursor 12 column 1
94 IsNull 31 104 0 0 if r[31]==NULL goto 104
95 SeekLT 14 104 30 2 0 key=r[30..31]
96 Null 0 31 0 0 r[31]=NULL
97 IdxLE 14 104 30 2 0 key=r[30..31]
98 DeferredSeek 14 0 2 0 Move 2 to 14.rowid if needed
99 Column 2 35 35 0 r[35]= cursor 2 column 35
100 Function 0 35 32 round(1) 0 r[32]=func(r[35])
101 Function 6 32 28 replace(3) 0 r[28]=func(r[32..34])
102 DecrJumpZero 29 104 0 0 if (--r[29])==0 goto 104
103 Prev 14 97 0 0
104 Return 27 82 1 0
105 Column 1 35 39 0 r[39]= cursor 1 column 35
106 Function 0 39 36 round(1) 0 r[36]=func(r[39])
107 Function 6 36 26 replace(3) 0 r[26]=func(r[36..38])
108 Eq 26 143 28 BINARY-8 80 if r[28]==r[26] goto 143
109 Column 12 0 40 0 r[40]= cursor 12 column 0
110 Column 1 25 41 0 r[41]= cursor 1 column 25
111 Column 12 1 42 0 r[42]= cursor 12 column 1
112 String8 0 43 0 Ann_CapexP 0 r[43]='Ann_CapexP'
113 BeginSubrtn 0 46 0 0 r[46]=NULL
114 Null 0 47 47 0 r[47..47]=NULL; Init subquery result
115 Noop 15 3 0 0
116 Integer 1 48 0 0 r[48]=1
117 Ne 11 119 10 67 if r[10]!=r[11] goto 119
118 ZeroOrNull 10 48 11 0 r[48] = 0 OR NULL
119 MustBeInt 48 0 0 0 LIMIT counter
120 IfNot 48 136 0 0
121 OpenRead 2 164394 0 56 0 root=164394 iDb=0; Project_List
122 OpenRead 16 207716 0 k(3,,,) 0 root=207716 iDb=0; PL_ProjID_InsertDate_New
123 Column 12 0 49 0 r[49]= cursor 12 column 0
124 IsNull 49 136 0 0 if r[49]==NULL goto 136
125 Column 12 1 50 0 r[50]= cursor 12 column 1
126 IsNull 50 136 0 0 if r[50]==NULL goto 136
127 SeekLT 16 136 49 2 0 key=r[49..50]
128 Null 0 50 0 0 r[50]=NULL
129 IdxLE 16 136 49 2 0 key=r[49..50]
130 DeferredSeek 16 0 2 0 Move 2 to 16.rowid if needed
131 Column 2 35 26 0 r[26]= cursor 2 column 35
132 Function 0 26 51 round(1) 0 r[51]=func(r[26])
133 Function 6 51 47 replace(3) 0 r[47]=func(r[51..53])
134 DecrJumpZero 48 136 0 0 if (--r[48])==0 goto 136
135 Prev 16 129 0 0
136 Return 46 114 1 0
137 Copy 47 44 0 0 r[44]=r[47]
138 Column 1 35 57 0 r[57]= cursor 1 column 35
139 Function 0 57 54 round(1) 0 r[54]=func(r[57])
140 Function 6 54 45 replace(3) 0 r[45]=func(r[54..56])
141 ClrSubtype 45 0 0 0 r[45].subtype = 0
142 Yield 2 0 0 0
143 Next 12 79 0 0
144 EndCoroutine 2 0 0 0
145 Noop 0 0 0 0 Output routine for A
146 ResultRow 14 6 0 0 output=r[14..19]
147 Return 3 0 0 0
148 Noop 0 0 0 0 Output routine for B
149 ResultRow 40 6 0 0 output=r[40..45]
150 Return 4 0 0 0
151 Noop 0 0 0 0 eof-A subroutine
152 Gosub 4 149 0 0
153 Yield 2 172 0 0
154 Goto 0 152 0 0
155 Noop 0 0 0 0 eof-B subroutine
156 Gosub 3 146 0 0
157 Yield 1 172 0 0
158 Goto 0 156 0 0
159 Noop 0 0 0 0 A-lt-B subroutine
160 Gosub 3 146 0 0
161 Yield 1 152 0 0
162 Goto 0 169 0 0
163 Noop 0 0 0 0 A-gt-B subroutine
164 Gosub 4 149 0 0
165 Yield 2 156 0 0
166 Goto 0 169 0 0
167 Yield 1 153 0 0
168 Yield 2 156 0 0
169 Permutation 0 0 0 [2] 0
170 Compare 14 40 1 k(2,B,) 1 r[14] <-> r[40]
171 Jump 160 160 164 0
172 Halt 0 0 0 0
173 Transaction 0 0 1157 0 1 usesStmtJournal=0
174 Integer 1 10 0 0 r[10]=1
175 Integer 0 11 0 0 r[11]=0
176 String8 0 33 0 .0 0 r[33]='.0'
177 String8 0 34 0 0 r[34]=''
178 String8 0 37 0 .0 0 r[37]='.0'
179 String8 0 38 0 0 r[38]=''
180 String8 0 52 0 .0 0 r[52]='.0'
181 String8 0 53 0 0 r[53]=''
182 String8 0 55 0 .0 0 r[55]='.0'
183 String8 0 56 0 0 r[56]=''
184 Goto 0 1 0 0
This is the explain query plan with UNION ALL on 3.40.0
QUERY PLAN
`--MERGE (UNION ALL)
|--LEFT
| |--SEARCH o USING INDEX PL_ProjID_InsertDate_New (ProjID=?)
| |--CORRELATED SCALAR SUBQUERY 1
| | `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
| `--CORRELATED SCALAR SUBQUERY 1
| `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
`--RIGHT
|--SEARCH o USING INDEX PL_ProjID_InsertDate_New (ProjID=?)
|--CORRELATED SCALAR SUBQUERY 3
| `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
`--CORRELATED SCALAR SUBQUERY 3
`--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
Perhaps these will say something compared with the previous ones.
(20) By anonymous on 2022-11-24 04:24:17 in reply to 19 [link] [source]
That is interesting (by the way, explain query plan seems more understandable at the moment)! I took all three of your explain query plan outputs to compare, and MERGE (UNION ALL) seems to work like a cross join in disabling some of the flattening (see 7.1 of https://www.sqlite.org/optoverview.html ). I am wondering why the OFFSET 0 or the WHERE clause on the outer select didn't seem to affect it (or perhaps it does and I missed that 'memo'?).
I might tackle generating random entries (via series, and random) to make a test db, as I would like to see what is happening here. Could you provide a schema sql so I don't have to create one from scratch (I may discard extra columns to reduce complexity)? I understand the distributions won't be the same, however it might show me what's going on.
The summary of the three explain query plans:
3.40 union all (fast 3.40 query)
QUERY PLAN
`--MERGE (UNION ALL)
|--LEFT
| |--SEARCH o USING INDEX PL_ProjID_InsertDate_New (ProjID=?)
| |--CORRELATED SCALAR SUBQUERY 1
| | `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
| `--CORRELATED SCALAR SUBQUERY 1
| `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
`--RIGHT
|--SEARCH o USING INDEX PL_ProjID_InsertDate_New (ProjID=?)
|--CORRELATED SCALAR SUBQUERY 3
| `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
`--CORRELATED SCALAR SUBQUERY 3
`--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
3.39.4 (fast query)
QUERY PLAN
|--CO-ROUTINE (subquery-4)
| `--COMPOUND QUERY
| |--LEFT-MOST SUBQUERY
| | |--SEARCH o USING INDEX PL_ProjID_BL_Start (ProjID=?)
| | |--CORRELATED SCALAR SUBQUERY 1
| | | `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
| | `--CORRELATED SCALAR SUBQUERY 1
| | `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
| `--UNION USING TEMP B-TREE
| |--SEARCH o USING INDEX PL_ProjID_BL_Start (ProjID=?)
| |--CORRELATED SCALAR SUBQUERY 3
| | `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
| `--CORRELATED SCALAR SUBQUERY 3
| `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
|--SCAN (subquery-4)
`--USE TEMP B-TREE FOR ORDER BY
3.40 slow query
QUERY PLAN
|--CO-ROUTINE (subquery-4)
| `--COMPOUND QUERY
| |--LEFT-MOST SUBQUERY
| | |--SCAN o
| | `--CORRELATED SCALAR SUBQUERY 1
| | `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
| `--UNION USING TEMP B-TREE
| |--SCAN o
| `--CORRELATED SCALAR SUBQUERY 3
| `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
|--SCAN (subquery-4)
`--USE TEMP B-TREE FOR ORDER BY
My general thinking is that it might be worth getting to the bottom of this rather than stopping because it now 'works', as that seems a fragile position. So if you want to help investigate further, a schema sql create script would save me some time. Thanks.
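A comparison of plans like the one in this post can also be scripted, which makes it easier to repeat against several SQLite builds. The following is only a minimal sketch, assuming Python's standard sqlite3 module and a cut-down, hypothetical version of the Project_List schema; the helper name query_plan and the simplified two-arm query are made up for illustration and are not the OP's real query. It reports whatever plan the SQLite library linked into Python chooses, so the output differs from version to version.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the EXPLAIN QUERY PLAN detail strings for sql, in output order."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # Each row has four columns; the human-readable text is the last one.
    return [row[3] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical reduced schema: only the columns this sketch touches.
    CREATE TABLE Project_List(ProjID, Finish_Date, Updated_By, InsertDate);
    CREATE INDEX PL_ProjID_InsertDate_New ON Project_List(ProjID, InsertDate);
""")

# Simplified stand-in for the thread's query; {op} is UNION or UNION ALL.
TEMPLATE = """
SELECT ProjID, InsertDate, var, newv FROM (
    SELECT ProjID, InsertDate, 'Finish_Date' AS var,
           coalesce(Finish_Date,'') AS newv
      FROM Project_List
    {op}
    SELECT ProjID, InsertDate, 'Updated_By' AS var,
           coalesce(Updated_By,'') AS newv
      FROM Project_List
)
WHERE ProjID = 'PR0000020614'
ORDER BY InsertDate
"""

plans = {op: query_plan(conn, TEMPLATE.format(op=op))
         for op in ("UNION", "UNION ALL")}

print("SQLite library version:", sqlite3.sqlite_version)
for op, plan in plans.items():
    print(op)
    for detail in plan:
        print("   ", detail)
```

Looking for SCAN versus SEARCH on the outer table in each printed plan is the quickest way to see whether the ProjID filter reached the inner SELECTs.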
(30) By jose isaias cabrera (jicman) on 2022-11-25 17:03:31 in reply to 20 [link] [source]
Happy belated Thanksgiving to all. I thank God for your wonderful tool, Dr. Hipp, and I am only doing good things with it. :-)
... Could you provide a schema sql so I don't have to create one from scratch (I may discard extra columns to reduce complexity)? ...
The script below will show the EXPLAIN QUERY PLAN differences/problems. It does not show the slowness of the query, because the real table that the query is run against has over 165K records. However, if you replicate this data with a script by just adding the same data with different ProjIDs (PR0000000001 to PR0000030000), it should create enough data to show some of the slowness of the query response.
CREATE TABLE Project_List
(
ProjID, Finish_Date, BL_Finish, Updated_By, InsertDate,
PRIMARY KEY (ProjID, Finish_Date, BL_Finish)
);
CREATE INDEX PL_ProjID_InsertDate_New ON "Project_List" (ProjID, InsertDate);
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-09-30','','Kathy','2021-08-10_07-26-53');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-09-30','','Pamela','2021-08-17_06-28-38');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-09-30','','Barbara','2021-08-17_11-29-18');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-09-30','','Barbara','2021-08-17_13-46-26');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-09-30','','Pamela','2021-08-20_04-44-04');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-09-30','','Pamela','2021-08-25_07-31-24');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-09-30','','Barbara','2021-09-24_03-55-27');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-09-30','','Miguel','2021-10-15_03-59-51');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-09-30','','Miguel','2022-01-04_05-35-28');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-09-30','','Miguel','2022-02-08_13-04-48');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-09-30','','Externo','2022-02-16_04-59-37');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-09-30','','Jose','2022-02-22_08-20-39');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-09-30','','Pamela','2022-03-14_11-37-33');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-09-30','','Esther','2022-04-08_04-27-49');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-09-30','','Barbara','2022-04-25_05-17-31');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-09-30','','Pamela','2022-04-25_12-01-17');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-09-30','','Pamela','2022-05-13_04-05-26');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-08-31','','Externo','2022-05-20_04-12-13');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-08-31','','Miguel','2022-05-26_04-53-14');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-08-31','','Esther','2022-06-13_12-14-45');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-08-31','','Jose','2022-06-27_11-56-58');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-08-31','','Jose','2022-06-29_04-33-34');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-08-31','2023-08-31','Pamela','2022-06-29_10-09-49');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-08-31','2023-08-31','Barbara','2022-07-05_08-56-08');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-08-31','2023-08-31','Daniel','2022-07-06_07-09-57');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-08-31','2023-08-31','Daniel','2022-07-25_05-32-23');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-08-31','2023-08-31','Jose','2022-08-05_04-50-11');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-08-31','2023-08-31','Barbara','2022-08-10_04-15-47');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-08-31','2023-08-31','Barbara','2022-08-22_10-59-48');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-08-31','2023-08-31','Pamela','2022-08-23_11-48-46');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-08-31','2023-08-31','Externo','2022-08-31_05-47-18');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-08-31','2023-08-31','Jose','2022-09-06_05-56-24');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-08-31','2023-08-31','Esther','2022-09-13_09-41-39');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-08-31','2023-08-31','Barbara','2022-09-16_07-25-06');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-08-31','2023-08-31','Pamela','2022-10-03_12-19-50');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-08-31','2023-08-31','Barbara','2022-10-17_07-02-07');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-08-31','2023-08-31','Barbara','2022-10-31_05-09-06');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-08-31','2023-08-31','Jose','2022-11-03_05-57-56');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-09-30','2023-08-31','Donna','2022-11-10_08-57-13');
INSERT INTO Project_List(ProjID,Finish_Date,BL_Finish,Updated_By,InsertDate) VALUES('PR0000020614','2023-09-30','2023-09-30','Barbara','2022-11-16_11-48-08');
And use this query,
SELECT ProjID,
Updated_By,
InsertDate,
var,
oldv,
newv
FROM
(
SELECT ProjID,
Updated_By,
InsertDate,
'BL_Finish' as var,
(
SELECT
coalesce(BL_Finish,'') FROM Project_List
WHERE ProjID = o.ProjID
AND InsertDate < o.InsertDate
ORDER BY InsertDate DESC
LIMIT 1
) AS oldv,
coalesce(BL_Finish,'') as newv
FROM Project_List as o
UNION
SELECT ProjID,
Updated_By,
InsertDate,
'Finish_Date' as var,
(
SELECT
coalesce(Finish_Date,'') FROM Project_List
WHERE ProjID = o.ProjID
AND InsertDate < o.InsertDate
ORDER BY InsertDate DESC
LIMIT 1
) AS oldv,
coalesce(Finish_Date,'') as newv
FROM Project_List as o
)
WHERE oldv <> newv
AND ProjID = 'PR0000020614'
ORDER BY InsertDate ASC;
Just change the UNION to UNION ALL and the query is fast. As Keith also suggested, moving the WHEREs into each SELECT works fast too. I am using Windows 10, so I downloaded the SQLite tools from the download site, unzipped them, opened a few command prompts, and CD'd to the parent directory of each tool; I run the sqlite3 executable from there to test each version. Just an FYI. Thanks all for your support.
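Before adopting either workaround it seems worth confirming that it returns the same rows as the original, since UNION removes duplicate rows and UNION ALL does not. The sketch below is one way to check that, assuming Python's standard sqlite3 module, a handful of rows taken from the script above plus one invented row for a second project, and a table created without the PRIMARY KEY (an assumption made to keep the sample small). The build helper and the ARM template are hypothetical names, and only the ProjID test is moved into each SELECT here; the oldv <> newv test stays on the outer query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- No PRIMARY KEY: an assumption for this sketch, not the OP's real schema.
    CREATE TABLE Project_List(ProjID, Finish_Date, BL_Finish, Updated_By, InsertDate);
    CREATE INDEX PL_ProjID_InsertDate_New ON Project_List(ProjID, InsertDate);
    INSERT INTO Project_List VALUES
      ('PR0000020614','2023-09-30','','Kathy','2021-08-10_07-26-53'),
      ('PR0000020614','2023-08-31','','Externo','2022-05-20_04-12-13'),
      ('PR0000020614','2023-08-31','2023-08-31','Pamela','2022-06-29_10-09-49'),
      ('PR0000020614','2023-09-30','2023-08-31','Donna','2022-11-10_08-57-13'),
      -- Invented row for a different project, to give the filter something to exclude.
      ('PR0000000001','2024-01-01','','Jose','2022-01-01_00-00-00');
""")

# One arm of the compound query; {col} is the tracked column.
ARM = """
    SELECT ProjID, Updated_By, InsertDate, '{col}' AS var,
           (SELECT coalesce({col},'') FROM Project_List
             WHERE ProjID = o.ProjID AND InsertDate < o.InsertDate
             ORDER BY InsertDate DESC LIMIT 1) AS oldv,
           coalesce({col},'') AS newv
      FROM Project_List AS o
      {arm_where}
"""


def build(op, push_down):
    """Assemble the query with the given set operator.

    With push_down=True the ProjID filter is written inside each arm
    instead of on the outer SELECT.
    """
    proj = "ProjID = 'PR0000020614'"
    arm_where = ("WHERE " + proj) if push_down else ""
    outer = "WHERE oldv <> newv" + ("" if push_down else " AND " + proj)
    arms = [ARM.format(col=c, arm_where=arm_where)
            for c in ("BL_Finish", "Finish_Date")]
    return ("SELECT ProjID, Updated_By, InsertDate, var, oldv, newv FROM ("
            + (" " + op + " ").join(arms) + ") " + outer
            + " ORDER BY InsertDate ASC")


original = conn.execute(build("UNION", push_down=False)).fetchall()
union_all = conn.execute(build("UNION ALL", push_down=False)).fetchall()
pushed = conn.execute(build("UNION", push_down=True)).fetchall()

print("same rows with UNION ALL:", original == union_all)
print("same rows with pushed-down WHERE:", original == pushed)
for row in original:
    print(row)
```

On this sample all three forms agree, but that is a property of the data: if two rows in one arm were identical in every selected column, UNION would collapse them and UNION ALL would not.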
(33) By Keith Medcalf (kmedcalf) on 2022-11-25 17:54:09 in reply to 30 [link] [source]
You are missing something from the primary key because running the script as shown generates many primary key constraint violations. Making the whole caboodle the primary key seems to allow the data to be inserted.
(34) By jose isaias cabrera (jicman) on 2022-11-25 17:58:41 in reply to 33 [link] [source]
Yeah, that is correct. The problem is that there are a bunch of other fields that I took off. And the constraints are ok in the original, but with just these 3 fields, they are popping out. I should have taken those out. Sorry.
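For anyone wanting to rebuild a larger test database along the lines suggested in post 30, the following is a rough sketch, assuming Python's standard sqlite3 module. It sidesteps the key violations by simply leaving the PRIMARY KEY out (an assumption; the real table has more columns and a working key), reuses four of the rows from the script above as a seed, and repeats them under generated ProjIDs. The function name build_test_db and the project count are hypothetical.

```python
import sqlite3

# Four (Finish_Date, BL_Finish, Updated_By, InsertDate) rows from the posted script.
SEED_ROWS = [
    ("2023-09-30", "", "Kathy", "2021-08-10_07-26-53"),
    ("2023-08-31", "", "Externo", "2022-05-20_04-12-13"),
    ("2023-08-31", "2023-08-31", "Pamela", "2022-06-29_10-09-49"),
    ("2023-09-30", "2023-09-30", "Barbara", "2022-11-16_11-48-08"),
]


def build_test_db(num_projects):
    """Create an in-memory database with SEED_ROWS repeated per project."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- PRIMARY KEY deliberately omitted (assumption for this sketch).
        CREATE TABLE Project_List(ProjID, Finish_Date, BL_Finish, Updated_By, InsertDate);
        CREATE INDEX PL_ProjID_InsertDate_New ON Project_List(ProjID, InsertDate);
    """)
    # ProjIDs follow the thread's format: 'PR' plus a zero-padded 10-digit number.
    rows = ((f"PR{n:010d}", *seed)
            for n in range(1, num_projects + 1)
            for seed in SEED_ROWS)
    with conn:  # commit once after all rows are inserted
        conn.executemany(
            "INSERT INTO Project_List(ProjID, Finish_Date, BL_Finish, Updated_By, InsertDate)"
            " VALUES (?,?,?,?,?)",
            rows,
        )
    return conn


conn = build_test_db(1000)
total = conn.execute("SELECT count(*) FROM Project_List").fetchone()[0]
print("rows inserted:", total)
```

Raising num_projects toward the 30000 mentioned in post 30 should give a table large enough for a full scan to be noticeably slower than an index search, though the distribution will not match the real data.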
(43) By anonymous on 2022-11-25 22:13:18 in reply to 30 [link] [source]
The posts locked in moderation were my (lame) guess at your schema. Once I was able to get the query plans to match your reported behavior I tried to reduce them. I was able to duplicate your resulting query plans on the versions you posted.
In the meantime Dr Hipp and Keith (who are always miles ahead) were able to pinpoint an existing test case (the INTERSECT example). I refactored the reduced query into something more closely matching that, but keeping the same query plan.
At this point the reduced query is moot (now that we have your test rigging).
Thus, not sure how useful any of my previous posts are at this point, but admins can feel free to delete any to all of those posts for clarity (or edit).
regards
(44) By jose isaias cabrera (jicman) on 2022-11-26 01:57:18 in reply to 43 [link] [source]
Dr. Hipp fixed the problem that I was having. Thanks.
josé
(46) By anonymous on 2022-11-26 03:45:09 in reply to 44 [link] [source]
Still seeing anomalies when second part of union is a scan constant row (see my reply to Dr Hipp for an example below).
The example I provided to Dr Hipp was a further reduction of your query (beyond my initial crack at it), isolating it to one side of the union. This appears to mean that if you had a union that added a row with only constants, it reverts to the slow query you experienced (in other words, it works like 3.40).
I could be wrong (and often am), but I think the story isn't quite finished on this (would love to be wrong).
(21) By anonymous on 2022-11-24 05:52:54 in reply to 19 [link] [source]
I worked up a small demo (you can use it on the fiddle). I can replicate the query plans you saw, both with UNION ALL and with UNION. The INDEXED BY clause also gets good results. However, the OFFSET 0 does not. This is the script (the creation script is commented out so I can change the query on the same dataset). Also, no analyze has been run; it operates as you described out of the box:
---- script
create table Project_List ( ProjID int, Updated_By text, InsertDate TEXT, Finish_Date TEXT, Ann_CapexP real);
create index PL_ProjID_InsertDate_New on Project_List (ProjID, InsertDate);
insert into Project_List (ProjID, Updated_By, Ann_CapexP, InsertDate, Finish_Date)
SELECT value as ProjID,
hex(randomblob(10)) as Updated_By,
abs(random()) % (10000000 - 1) + 1 as Ann_CapexP,
cast (round(julianday('now')) +3 as integer) as InsertDate,
cast (round(julianday('now')) +5 as integer) as Finish_Date
FROM generate_series(1,100,1);
select count(*) from Project_List;
The query with poor performance:
explain query plan
SELECT ProjID,
Updated_By,
InsertDate,
var,
oldv,
newv
FROM
(
SELECT ProjID,
Updated_By,
InsertDate,
'Finish_Date' as var,
(
SELECT
coalesce(Finish_Date,'') FROM Project_List
WHERE ProjID = o.ProjID
AND InsertDate < o.InsertDate
ORDER BY InsertDate DESC
LIMIT 1
) AS oldv,
coalesce(Finish_Date,'') as newv
FROM Project_List as o -- INDEXED BY PL_ProjID_InsertDate_New
UNION -- ALL
SELECT ProjID,
Updated_By,
InsertDate,
'Ann_CapexP' as var,
(
SELECT
replace(round(Ann_CapexP),'.0','') FROM Project_List
WHERE ProjID = o.ProjID
AND InsertDate < o.InsertDate
ORDER BY InsertDate DESC
LIMIT 1
) AS oldv,
replace(round(Ann_CapexP),'.0','') as newv
FROM Project_List as o -- INDEXED BY PL_ProjID_InsertDate_New
)
WHERE oldv <> newv
AND ProjID = 'PR0000020614'
ORDER BY InsertDate ASC;
---- output of 3.40 online fiddle with original sql
--------output -----
QUERY PLAN
|--CO-ROUTINE (subquery-4)
| `--COMPOUND QUERY
| |--LEFT-MOST SUBQUERY
| | |--SCAN o
| | `--CORRELATED SCALAR SUBQUERY 1
| | `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
| `--UNION USING TEMP B-TREE
| |--SCAN o
| `--CORRELATED SCALAR SUBQUERY 3
| `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
|--SCAN (subquery-4)
`--USE TEMP B-TREE FOR ORDER BY
----- using union all instead of union output (uncomment -- ALL)
QUERY PLAN
`--MERGE (UNION ALL)
|--LEFT
| |--SEARCH o USING INDEX PL_ProjID_InsertDate_New (ProjID=?)
| |--CORRELATED SCALAR SUBQUERY 1
| | `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
| `--CORRELATED SCALAR SUBQUERY 1
| `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
`--RIGHT
|--SEARCH o USING INDEX PL_ProjID_InsertDate_New (ProjID=?)
|--CORRELATED SCALAR SUBQUERY 3
| `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
`--CORRELATED SCALAR SUBQUERY 3
`--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
------ using INDEXED BY on two nested sub-queries (uncomment INDEXED BY)
QUERY PLAN
|--CO-ROUTINE (subquery-4)
| `--COMPOUND QUERY
| |--LEFT-MOST SUBQUERY
| | |--SCAN o USING INDEX PL_ProjID_InsertDate_New
| | `--CORRELATED SCALAR SUBQUERY 1
| | `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
| `--UNION USING TEMP B-TREE
| |--SCAN o USING INDEX PL_ProjID_InsertDate_New
| `--CORRELATED SCALAR SUBQUERY 3
| `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
|--SCAN (subquery-4)
`--USE TEMP B-TREE FOR ORDER BY
So there are a few ways to get the query to use the index; perhaps the most heavy-handed is INDEXED BY, because it makes it clear what you are trying to do. There are also query plan performance differences. My rough guess from random runs is that the UNION ALL is faster than the INDEXED BY, but I only used a small number of records. Also, INDEXED BY seems to more closely match the 3.39.4 query plan (exactly?).
I can distill the above down to a smaller test case if Dr Hipp thinks it's worthwhile to take a look at it.
(22) By anonymous on 2022-11-24 06:39:24 in reply to 19 [link] [source]
Please ignore the section in my post below about INDEXED BY performance as I noticed the query plan in 3.40 only changed to SCAN USING INDEX rather than SEARCH USING INDEX.
I clearly have more work to do on that query so I'll have another look tomorrow evening.
Once(if?) I have that sorted out I will try to take a crack at a minimised schema and query that shows the same behaviour. And then try some performance comparisons (it is too early for that IMO).
What I appear to be able to assert is that the 3.40 query plan of union all version is different than the 3.39.4 original problematic query plan by OP (The MERGE (UNION ALL) rather than co-routine). I will double check that when I pick this up again.
If someone wants to sort it out before I get to it, please feel welcome! :) I used the online fiddle and scripts I posted.
(23) By Richard Hipp (drh) on 2022-11-24 12:35:36 in reply to 18 [link] [source]
See https://sqlite.org/src/artifact/8a709a8e19?ln=5012-5017 and https://sqlite.org/src/info/346a3b12b861ce7b.
The following is a simplification of the script that the dbsqlfuzz fuzzer found that demonstrated the problem:
CREATE TABLE t1(a,b,c COLLATE NOCASE);
INSERT INTO t1 VALUES(1,'a','a');
INSERT INTO t1 VALUES(9.9000000000000003552,'b','B');
INSERT INTO t1 VALUES(NULL,'C','c');
INSERT INTO t1 VALUES('hello','d','D');
INSERT INTO t1 VALUES(X'616263','e','e');
.mode qbox
.echo on
SELECT a,b,CASE c WHEN 943 THEN 967 WHEN 897 THEN 533 ELSE b END
FROM t1
INTERSECT
SELECT a,b,c
FROM t1
WHERE -3.7e+921*11<>b
ORDER BY a,b,c;
SELECT *
FROM (
SELECT a,b,CASE c WHEN 943 THEN 967 WHEN 897 THEN 533 ELSE b END
FROM t1
INTERSECT
SELECT a,b,c
FROM t1
WHERE -3.7e+921*11<>b
ORDER BY a,b,c
)
WHERE "a" ISNULL
AND "b"='C'
AND "CASE c WHEN 943 THEN 967 WHEN 897 THEN 533 ELSE b END"='C';
The second query is a wrapper around the first query that attempts to pick off a single row of the first query. However, it returns no rows. Try it using SQLite version 3.39 or earlier and you will get the wrong answer. Do the same on 3.40 and the correct answer comes out. The fix we put in place for this problem was the 346a3b12b861ce7b patch.
(24.1) By Keith Medcalf (kmedcalf) on 2022-11-24 16:14:34 edited from 24.0 in reply to 23 [link] [source]
Very interesting, except both versions 3.39.0 and 3.38.0 return correct responses.
SQLite version 3.38.0 2022-02-22 18:58:40
Enter ".help" for usage hints.
sqlite> .version
SQLite 3.38.0 2022-02-22 18:58:40 40fa792d359f84c3b9e9d6623743e1a59826274e221df1bde8f47086968a1bab
zlib version 1.2.11
gcc-5.2.0
sqlite> CREATE TABLE t1(a,b,c COLLATE NOCASE);
sqlite> INSERT INTO t1 VALUES(1,'a','a');
sqlite> INSERT INTO t1 VALUES(9.9000000000000003552,'b','B');
sqlite> INSERT INTO t1 VALUES(NULL,'C','c');
sqlite> INSERT INTO t1 VALUES('hello','d','D');
sqlite> INSERT INTO t1 VALUES(X'616263','e','e');
sqlite>
sqlite> .mode qbox
sqlite> .eqp on
sqlite> SELECT a,b,CASE c WHEN 943 THEN 967 WHEN 897 THEN 533 ELSE b END
...> FROM t1
...> INTERSECT
...> SELECT a,b,c
...> FROM t1
...> WHERE -3.7e+921*11<>b
...> ORDER BY a,b,c;
QUERY PLAN
`--MERGE (INTERSECT)
|--LEFT
| |--SCAN t1
| `--USE TEMP B-TREE FOR ORDER BY
`--RIGHT
|--SCAN t1
`--USE TEMP B-TREE FOR ORDER BY
┌───────────┬─────┬───────────────────────────────────────────────────────┐
│ a         │ b   │ CASE c WHEN 943 THEN 967 WHEN 897 THEN 533 ELSE b END │
├───────────┼─────┼───────────────────────────────────────────────────────┤
│ NULL      │ 'C' │ 'C'                                                   │
│ 1         │ 'a' │ 'a'                                                   │
│ 9.9       │ 'b' │ 'b'                                                   │
│ 'hello'   │ 'd' │ 'd'                                                   │
│ x'616263' │ 'e' │ 'e'                                                   │
└───────────┴─────┴───────────────────────────────────────────────────────┘
sqlite>
sqlite> SELECT *
...> FROM (
...> SELECT a,b,CASE c WHEN 943 THEN 967 WHEN 897 THEN 533 ELSE b END
...> FROM t1
...> INTERSECT
...> SELECT a,b,c
...> FROM t1
...> WHERE -3.7e+921*11<>b
...> ORDER BY a,b,c
...> )
...> WHERE "a" ISNULL
...> AND "b"='C'
...> AND "CASE c WHEN 943 THEN 967 WHEN 897 THEN 533 ELSE b END"='C'
...> ;
QUERY PLAN
|--CO-ROUTINE SUBQUERY 2
| `--MERGE (INTERSECT)
| |--LEFT
| | |--SCAN t1
| | `--USE TEMP B-TREE FOR RIGHT PART OF ORDER BY
| `--RIGHT
| `--SCAN t1
`--SCAN SUBQUERY 2
┌──────┬─────┬───────────────────────────────────────────────────────┐
│ a    │ b   │ CASE c WHEN 943 THEN 967 WHEN 897 THEN 533 ELSE b END │
├──────┼─────┼───────────────────────────────────────────────────────┤
│ NULL │ 'C' │ 'C'                                                   │
└──────┴─────┴───────────────────────────────────────────────────────┘
sqlite>
Unless, of course, the executables for these versions as located on your website are not actually what they purport to be.
(25.1) By Keith Medcalf (kmedcalf) on 2022-11-24 18:06:44 edited from 25.0 in reply to 23 [link] [source]
Your simplified example is bogus (ie, does not demonstrate any issue). However, the actual testcase does in fact demonstrate an issue in that the specific subquery has a difficulty when a de-collationified expression is pushed down into the subquery.
(26) By Richard Hipp (drh) on 2022-11-24 18:21:57 in reply to 25.1 [link] [source]
The problem was introduced by a different bug-fix that happened earlier in the 3.40 development cycle. So it never appeared in a release. The previous bug-fix was at check-in ed14863dd72e35fa. So the example above only failed for the 69 check-ins over 12 days in mid-October.
I'll investigate and see how long the previous bug was valid for....
(27) By Richard Hipp (drh) on 2022-11-24 18:42:48 in reply to 26 [link] [source]
An alternative demonstration of the problem is this script:
CREATE TABLE t1(a,b,c COLLATE NOCASE);
INSERT INTO t1 VALUES(1,'a','a');
INSERT INTO t1 VALUES(9.9000000000000003552,'b','B');
INSERT INTO t1 VALUES(NULL,'C','c');
INSERT INTO t1 VALUES('hello','d','D');
INSERT INTO t1 VALUES(X'616263','e','e');
.echo on
.mode qbox
SELECT a,b,c FROM t1
INTERSECT
SELECT a,b, b FROM t1 WHERE 'eT"3qRkL+oJMJjQ9z0'>=b
ORDER BY a,b,c;
SELECT * FROM
(SELECT a,b,c FROM t1
INTERSECT
SELECT a,b, b FROM t1 WHERE 'eT"3qRkL+oJMJjQ9z0'>=b
ORDER BY a,b,c)
WHERE "a" ISNULL AND "b"='C' AND "c"='c';
The second query should return one row. But it returns zero rows ever since the push-down optimization was added for version 3.8.11 in 2015. Recent enhancements to the dbsqlfuzz fuzzer found this problem and brought it to your attention.
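To see which behaviour a particular SQLite build shows, the script above can be run outside the CLI as well. This is a minimal sketch, assuming Python's standard sqlite3 module; the variable names are made up, and the dot-commands are dropped because they belong to the sqlite3 shell rather than to SQL. It reports the row counts without assuming which answer the linked library will give.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1(a,b,c COLLATE NOCASE);
    INSERT INTO t1 VALUES(1,'a','a');
    INSERT INTO t1 VALUES(9.9000000000000003552,'b','B');
    INSERT INTO t1 VALUES(NULL,'C','c');
    INSERT INTO t1 VALUES('hello','d','D');
    INSERT INTO t1 VALUES(X'616263','e','e');
""")

# The compound query from the post, used both alone and as a subquery.
INNER = """
    SELECT a,b,c FROM t1
    INTERSECT
    SELECT a,b, b FROM t1 WHERE 'eT"3qRkL+oJMJjQ9z0'>=b
    ORDER BY a,b,c
"""

WRAPPED = (
    "SELECT * FROM (" + INNER + ") "
    """WHERE "a" ISNULL AND "b"='C' AND "c"='c'"""
)

inner_rows = conn.execute(INNER).fetchall()
wrapped_rows = conn.execute(WRAPPED).fetchall()

# True if the unfiltered compound query contains the row the wrapper looks for.
has_target_row = any(r[0] is None and r[1] == "C" for r in inner_rows)

print("SQLite library version:", sqlite3.sqlite_version)
print("inner query contains the NULL/'C' row:", has_target_row)
print("rows returned by the wrapper:", len(wrapped_rows))
```

If the inner query contains the target row but the wrapper returns nothing, the build under test is pushing the outer WHERE terms into the compound subquery in the way described above.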
(29) By anonymous on 2022-11-25 06:56:40 in reply to 27 [link] [source]
To show the query plan difference I added:
create index i on t1 (a,b,c);
With the index, from what I checked, the query plan shows the same behaviour as the UNION example across the various sqlite versions, and it matches closely with the reduced query I posted earlier this evening.
Thus, it appears to me that this is indeed the issue you identified, with a push-down optimisation being incorrectly applied (as per your post above).
Summary: previously, the push-down query optimisation that was in place was incorrect. 3.40 fixes correctness, but the queries that depended on that optimisation are going to be (much) slower until rewritten?
(28) By anonymous on 2022-11-25 06:17:27 in reply to 25.1 [link] [source]
(Apologies ahead of time if this is noise.)
I reduced the original Project_List query to the following:
drop table if exists t ;
create table t (a text);
create index i on t (a);
explain query plan
SELECT a FROM
(
SELECT a FROM t as o
UNION
SELECT 1 as a
)
WHERE a = 1;
then tested on three versions as follows:
3.34.1:
CO-ROUTINE 2
COMPOUND QUERY
LEFT-MOST SUBQUERY
SEARCH TABLE t AS o USING COVERING INDEX i (a=?)
UNION USING TEMP B-TREE
SCAN CONSTANT ROW
3.39.1:
CO-ROUTINE (subquery-2)
COMPOUND QUERY
LEFT-MOST SUBQUERY
SEARCH o USING COVERING INDEX i (a=?)
UNION USING TEMP B-TREE
SCAN CONSTANT ROW
SCAN (subquery-2)
3.40.0 (online fiddle and a custom 3.40.0 build with QPSG enabled)
QUERY PLAN
|--CO-ROUTINE (subquery-2)
| `--COMPOUND QUERY
| |--LEFT-MOST SUBQUERY
| | `--SCAN o
| `--UNION USING TEMP B-TREE
| `--SCAN CONSTANT ROW
`--SCAN (subquery-2)
It's clearly NOT pushing the where into the subquery. What isn't clear is whether it should be. As Dr Hipp has suggested that it's related to the INTERSECT example, I will look at refactoring the query to INTERSECT and comparing both the output and the query plans across the above sqlite versions.
(admins: feel free to delete my partially retracted 'in limbo' post yesterday; the script and query were overly complicated, leading to mistakes and dead-end investigations).
(15) By jose isaias cabrera (jicman) on 2022-11-23 19:09:04 in reply to 12 [link] [source]
This change above grabs 125 records in 0.1 seconds using SQLiteVer: 3.40.0 2022-11-16 12:10:08. However, if I use the original one, and replace UNION with UNION ALL, the result is 125 records in 0.09 seconds using the same SQLiteVer: 3.40.0 2022-11-16 12:10:08. So, I will keep the UNION ALL change. Thanks.
(31) By Richard Hipp (drh) on 2022-11-25 17:07:24 in reply to 1 [link] [source]
Can you retry your original query using the latest trunk check-in of SQLite and let me know whether or not the performance issue has been resolved?
(32.1) By jose isaias cabrera (jicman) on 2022-11-25 18:02:02 edited from 32.0 in reply to 31 [link] [source]
Where is sqlite3.c on the trunk? I usually download the snapshots, and in those, the sqlite3.c is right in the top directory. I usually use this command,
i686-w64-mingw32-gcc -shared -static-libgcc sqlite3.c -o sqlite3.dll
But, I can't find the sqlite3.c source.
$ pwd
/home/e608313/b/sqlite/SQLite-adbca344
$ ls -l
total 2.6M
-rw-r--r-- 1 e608313 Domain Users 263 Nov 25 12:05 LICENSE.md
-rw-r--r-- 1 e608313 Domain Users 49K Nov 25 12:36 Makefile
-rw-r--r-- 1 e608313 Domain Users 49K Nov 25 12:05 Makefile.in
-rw-r--r-- 1 e608313 Domain Users 3.4K Nov 25 12:05 Makefile.linux-gcc
-rw-r--r-- 1 e608313 Domain Users 81K Nov 25 12:05 Makefile.msc
-rw-r--r-- 1 e608313 Domain Users 16K Nov 25 12:05 README.md
-rw-r--r-- 1 e608313 Domain Users 7 Nov 25 12:05 VERSION
-rw-r--r-- 1 e608313 Domain Users 276K Nov 25 12:05 aclocal.m4
drwxr-xr-x 1 e608313 Domain Users 0 Nov 25 12:05 art/
drwxr-xr-x 1 e608313 Domain Users 0 Nov 25 12:05 autoconf/
-rw-r--r-- 1 e608313 Domain Users 48K Nov 25 12:05 config.guess
-rw-r--r-- 1 e608313 Domain Users 79K Nov 25 12:36 config.log
-rwxr-xr-x 1 e608313 Domain Users 55K Nov 25 12:36 config.status*
-rw-r--r-- 1 e608313 Domain Users 31K Nov 25 12:05 config.sub
-rwxr-xr-x 1 e608313 Domain Users 402K Nov 25 12:05 configure*
-rw-r--r-- 1 e608313 Domain Users 24K Nov 25 12:05 configure.ac
drwxr-xr-x 1 e608313 Domain Users 0 Nov 25 12:05 contrib/
drwxr-xr-x 1 e608313 Domain Users 0 Nov 25 12:05 doc/
drwxr-xr-x 1 e608313 Domain Users 0 Nov 25 12:05 ext/
-rwxr-xr-x 1 e608313 Domain Users 5.5K Nov 25 12:05 install-sh*
-rwxr-xr-x 1 e608313 Domain Users 255K Nov 25 12:36 libtool*
-rw-r--r-- 1 e608313 Domain Users 240K Nov 25 12:05 ltmain.sh
-rw-r--r-- 1 e608313 Domain Users 1.6K Nov 25 12:05 magic.txt
-rw-r--r-- 1 e608313 Domain Users 37K Nov 25 12:05 main.mk
-rw-r--r-- 1 e608313 Domain Users 160K Nov 25 12:05 manifest
-rw-r--r-- 1 e608313 Domain Users 65 Nov 25 12:05 manifest.uuid
-rw-r--r-- 1 e608313 Domain Users 937 Nov 25 12:05 mkso.sh
drwxr-xr-x 1 e608313 Domain Users 0 Nov 25 12:05 mptest/
-rw-r--r-- 1 e608313 Domain Users 1.8K Nov 25 12:05 spec.template
-rw-r--r-- 1 e608313 Domain Users 258 Nov 25 12:05 sqlite.pc.in
-rw-r--r-- 1 e608313 Domain Users 8.8K Nov 25 12:05 sqlite3.1
-rw-r--r-- 1 e608313 Domain Users 263 Nov 25 12:36 sqlite3.pc
-rw-r--r-- 1 e608313 Domain Users 267 Nov 25 12:05 sqlite3.pc.in
-rw-r--r-- 1 e608313 Domain Users 4.0K Nov 25 12:29 sqlite_cfg.h
-rw-r--r-- 1 e608313 Domain Users 3.7K Nov 25 12:05 sqlite_cfg.h.in
drwxr-xr-x 1 e608313 Domain Users 0 Nov 25 12:05 src/
drwxr-xr-x 1 e608313 Domain Users 0 Nov 25 12:05 test/
drwxr-xr-x 1 e608313 Domain Users 0 Nov 25 12:05 tool/
drwxr-xr-x 1 e608313 Domain Users 0 Nov 25 12:05 vsixtest/
(35) By jose isaias cabrera (jicman) on 2022-11-25 18:24:55 in reply to 32.1 [link] [source]
Never mind... I had to run,
./configure
make
The problem is no longer there. I grabbed 125 records in 0.03 seconds. SQLiteVer: 3.41.0 2022-11-25 17:05:55. I am going to build the sqlite3 tool to send EXPLAIN QUERY PLAN.
(37) By jose isaias cabrera (jicman) on 2022-11-25 18:40:37 in reply to 35 [link] [source]
New EXPLAIN QUERY PLAN on the same query as previously stated using SQLite version 3.41.0 2022-11-25 17:05:55 is:
QUERY PLAN
|--CO-ROUTINE (subquery-4)
| `--COMPOUND QUERY
| |--LEFT-MOST SUBQUERY
| | |--SEARCH o USING INDEX PL_ProjID_BL_Start (ProjID=?)
| | |--CORRELATED SCALAR SUBQUERY 1
| | | `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
| | `--CORRELATED SCALAR SUBQUERY 1
| | `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
| `--UNION USING TEMP B-TREE
| |--SEARCH o USING INDEX PL_ProjID_BL_Start (ProjID=?)
| |--CORRELATED SCALAR SUBQUERY 3
| | `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
| `--CORRELATED SCALAR SUBQUERY 3
| `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?)
|--SCAN (subquery-4)
`--USE TEMP B-TREE FOR ORDER BY
(36) By Keith Medcalf (kmedcalf) on 2022-11-25 18:25:41 in reply to 31 [link] [source]
I get the following:
Without latest checkin:
>sqlite < testit.sql
QUERY PLAN
|--CO-ROUTINE (subquery-4)
| `--COMPOUND QUERY
| |--LEFT-MOST SUBQUERY
| | |--SCAN o (~1048576 rows)
| | `--CORRELATED SCALAR SUBQUERY 1
| | `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?) (~2 rows)
| `--UNION USING TEMP B-TREE
| |--SCAN o (~1048576 rows)
| `--CORRELATED SCALAR SUBQUERY 3
| `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?) (~2 rows)
|--SCAN (subquery-4) (~524288 rows)
`--USE TEMP B-TREE FOR ORDER BY
┌────────────────┬────────────┬───────────────────────┬───────────────┬──────────────┬──────────────┐
│ ProjID │ Updated_By │ InsertDate │ var │ oldv │ newv │
├────────────────┼────────────┼───────────────────────┼───────────────┼──────────────┼──────────────┤
│ 'PR0000020614' │ 'Externo' │ '2022-05-20_04-12-13' │ 'Finish_Date' │ '2023-09-30' │ '2023-08-31' │
│ 'PR0000020614' │ 'Pamela' │ '2022-06-29_10-09-49' │ 'BL_Finish' │ '' │ '2023-08-31' │
│ 'PR0000020614' │ 'Donna' │ '2022-11-10_08-57-13' │ 'Finish_Date' │ '2023-08-31' │ '2023-09-30' │
│ 'PR0000020614' │ 'Barbara' │ '2022-11-16_11-48-08' │ 'BL_Finish' │ '2023-08-31' │ '2023-09-30' │
└────────────────┴────────────┴───────────────────────┴───────────────┴──────────────┴──────────────┘
VM-steps: 129600104
Run Time: real 6.681 user 6.343750 sys 0.328125
QUERY PLAN
`--MERGE (UNION ALL)
|--LEFT
| |--SEARCH o USING INDEX PL_ProjID_InsertDate_New (ProjID=?) (~9 rows)
| |--CORRELATED SCALAR SUBQUERY 1
| | `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?) (~2 rows)
| `--CORRELATED SCALAR SUBQUERY 1
| `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?) (~2 rows)
`--RIGHT
|--SEARCH o USING INDEX PL_ProjID_InsertDate_New (ProjID=?) (~9 rows)
|--CORRELATED SCALAR SUBQUERY 3
| `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?) (~2 rows)
`--CORRELATED SCALAR SUBQUERY 3
`--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?) (~2 rows)
┌────────────────┬────────────┬───────────────────────┬───────────────┬──────────────┬──────────────┐
│ ProjID │ Updated_By │ InsertDate │ var │ oldv │ newv │
├────────────────┼────────────┼───────────────────────┼───────────────┼──────────────┼──────────────┤
│ 'PR0000020614' │ 'Externo' │ '2022-05-20_04-12-13' │ 'Finish_Date' │ '2023-09-30' │ '2023-08-31' │
│ 'PR0000020614' │ 'Pamela' │ '2022-06-29_10-09-49' │ 'BL_Finish' │ '' │ '2023-08-31' │
│ 'PR0000020614' │ 'Donna' │ '2022-11-10_08-57-13' │ 'Finish_Date' │ '2023-08-31' │ '2023-09-30' │
│ 'PR0000020614' │ 'Barbara' │ '2022-11-16_11-48-08' │ 'BL_Finish' │ '2023-08-31' │ '2023-09-30' │
└────────────────┴────────────┴───────────────────────┴───────────────┴──────────────┴──────────────┘
VM-steps: 3013
Run Time: real 0.011 user 0.000000 sys 0.000000
With the latest checkin:
>sqlite3 < testit.sql
QUERY PLAN
|--CO-ROUTINE (subquery-4)
| `--COMPOUND QUERY
| |--LEFT-MOST SUBQUERY
| | |--SEARCH o USING COVERING INDEX sqlite_autoindex_Project_List_1 (ProjID=?) (~9 rows)
| | |--CORRELATED SCALAR SUBQUERY 1
| | | `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?) (~2 rows)
| | `--CORRELATED SCALAR SUBQUERY 1
| | `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?) (~2 rows)
| `--UNION USING TEMP B-TREE
| |--SEARCH o USING COVERING INDEX sqlite_autoindex_Project_List_1 (ProjID=?) (~9 rows)
| |--CORRELATED SCALAR SUBQUERY 3
| | `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?) (~2 rows)
| `--CORRELATED SCALAR SUBQUERY 3
| `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?) (~2 rows)
|--SCAN (subquery-4) (~4 rows)
`--USE TEMP B-TREE FOR ORDER BY
┌────────────────┬────────────┬───────────────────────┬───────────────┬──────────────┬──────────────┐
│ ProjID │ Updated_By │ InsertDate │ var │ oldv │ newv │
├────────────────┼────────────┼───────────────────────┼───────────────┼──────────────┼──────────────┤
│ 'PR0000020614' │ 'Externo' │ '2022-05-20_04-12-13' │ 'Finish_Date' │ '2023-09-30' │ '2023-08-31' │
│ 'PR0000020614' │ 'Pamela' │ '2022-06-29_10-09-49' │ 'BL_Finish' │ '' │ '2023-08-31' │
│ 'PR0000020614' │ 'Donna' │ '2022-11-10_08-57-13' │ 'Finish_Date' │ '2023-08-31' │ '2023-09-30' │
│ 'PR0000020614' │ 'Barbara' │ '2022-11-16_11-48-08' │ 'BL_Finish' │ '2023-08-31' │ '2023-09-30' │
└────────────────┴────────────┴───────────────────────┴───────────────┴──────────────┴──────────────┘
VM-steps: 3044
Run Time: real 0.008 user 0.000000 sys 0.000000
QUERY PLAN
`--MERGE (UNION ALL)
|--LEFT
| |--SEARCH o USING INDEX PL_ProjID_InsertDate_New (ProjID=?) (~9 rows)
| |--CORRELATED SCALAR SUBQUERY 1
| | `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?) (~2 rows)
| `--CORRELATED SCALAR SUBQUERY 1
| `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?) (~2 rows)
`--RIGHT
|--SEARCH o USING INDEX PL_ProjID_InsertDate_New (ProjID=?) (~9 rows)
|--CORRELATED SCALAR SUBQUERY 3
| `--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?) (~2 rows)
`--CORRELATED SCALAR SUBQUERY 3
`--SEARCH Project_List USING INDEX PL_ProjID_InsertDate_New (ProjID=? AND InsertDate<?) (~2 rows)
┌────────────────┬────────────┬───────────────────────┬───────────────┬──────────────┬──────────────┐
│ ProjID │ Updated_By │ InsertDate │ var │ oldv │ newv │
├────────────────┼────────────┼───────────────────────┼───────────────┼──────────────┼──────────────┤
│ 'PR0000020614' │ 'Externo' │ '2022-05-20_04-12-13' │ 'Finish_Date' │ '2023-09-30' │ '2023-08-31' │
│ 'PR0000020614' │ 'Pamela' │ '2022-06-29_10-09-49' │ 'BL_Finish' │ '' │ '2023-08-31' │
│ 'PR0000020614' │ 'Donna' │ '2022-11-10_08-57-13' │ 'Finish_Date' │ '2023-08-31' │ '2023-09-30' │
│ 'PR0000020614' │ 'Barbara' │ '2022-11-16_11-48-08' │ 'BL_Finish' │ '2023-08-31' │ '2023-09-30' │
└────────────────┴────────────┴───────────────────────┴───────────────┴──────────────┴──────────────┘
VM-steps: 3013
Run Time: real 0.007 user 0.000000 sys 0.000000
Using this schema and the data posted above duplicated 30000 times.
CREATE TABLE Project_List
(
ProjID, Finish_Date, BL_Finish, Updated_By, InsertDate,
PRIMARY KEY (ProjID, Finish_Date, BL_Finish, Updated_By, InsertDate)
);
CREATE INDEX PL_ProjID_InsertDate_New ON "Project_List" (ProjID, InsertDate);
(38) By jose isaias cabrera (jicman) on 2022-11-25 18:50:39 in reply to 36 [link] [source]
Thanks, Keith. How do you get VM-Steps to show with queries? I have .timer on, but VM-Step is not on mine.
I would have expected that UNION ALL would be slower, or would, at least, have more VM-Steps. Hmmmm. So, UNION ALL is the winner.
(39.1) By Larry Brasfield (larrybr) on 2022-11-25 19:06:24 edited from 39.0 in reply to 38 [link] [source]
Plain UNION has to do extra work to eliminate duplicates. With UNION ALL, that work is avoided. The amount of processing and main memory access is much reduced.
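A small sketch of the difference, using Python's standard sqlite3 module and a made-up table named changes (not the schema from this thread). It also shows why the two operators are interchangeable only when the combined rows are already distinct.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE changes (proj TEXT, var TEXT, newv TEXT);
    INSERT INTO changes VALUES
        ('P1', 'Finish_Date', '2023-08-31'),
        ('P1', 'Finish_Date', '2023-08-31'),  -- exact duplicate row
        ('P1', 'BL_Finish',   '2023-09-30');
""")

def rows(op):
    """Run the same two-arm compound query with the given set operator."""
    sql = f"""
        SELECT proj, var, newv FROM changes WHERE var = 'Finish_Date'
        {op}
        SELECT proj, var, newv FROM changes WHERE var = 'BL_Finish'
        ORDER BY var, newv
    """
    return conn.execute(sql).fetchall()

union_rows = rows("UNION")          # duplicates removed
union_all_rows = rows("UNION ALL")  # every row from both arms kept
print(len(union_rows), len(union_all_rows))
```

With this data UNION returns 2 rows and UNION ALL returns 3, so switching operators is a safe speed-up only if duplicate rows cannot occur or do not matter.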
(41) By jose isaias cabrera (jicman) on 2022-11-25 19:53:00 in reply to 39.1 [link] [source]
There you go. I learned 17 new things today, including this one. Thanks Mr. Brasfield.
(40) By Richard Hipp (drh) on 2022-11-25 19:33:11 in reply to 38 [link] [source]
How do you get VM-Steps to show
Use: ".stats vmstep"
(42) By jose isaias cabrera (jicman) on 2022-11-25 19:54:57 in reply to 40 [link] [source]
Ok, 18 new things I have learned today. Adding it to my .sqlite3rc file. Thanks, Dr. Hipp.
(45) By anonymous on 2022-11-26 03:23:28 in reply to 31 [link] [source]
With 3.41.0 built from tarball today, the change resolves the OP test query (and the reduction I posted yesterday, if query typo of UNION ALL is corrected to UNION).
However, with 3.41.0 (from tarball) we are still seeing the issue in the test query posted yesterday (now out of moderation today):
script:
drop table if exists t;
create table t (a text);
create index i on t (a);
insert into t select value as a from generate_series(1,100,1);
explain query plan SELECT a FROM (SELECT a FROM t as o UNION select 1 as o) WHERE a = 1;
results: 1
slow query plan:
QUERY PLAN
|--CO-ROUTINE (subquery-2)
| `--COMPOUND QUERY
| |--LEFT-MOST SUBQUERY
| | `--SCAN o
| `--UNION USING TEMP B-TREE
| `--SCAN CONSTANT ROW
`--SCAN (subquery-2)
a slight variation:
explain query plan SELECT a FROM (SELECT a FROM t as o UNION select 1 as a) WHERE a = 1;
results: 1 1
with identical explain query plan as previous query:
QUERY PLAN
|--CO-ROUTINE (subquery-2)
| `--COMPOUND QUERY
| |--LEFT-MOST SUBQUERY
| | `--SCAN o
| `--UNION USING TEMP B-TREE
| `--SCAN CONSTANT ROW
`--SCAN (subquery-2)
Was the fix/change intended to handle this case? It would seem that the scan constant row could allow the same push down as the fix/change implements?
The above two queries were created by reducing the query from the OP, and then isolated to the first part of the union (making the second part of the union the scan constant row) for simplicity.
(47) By anonymous on 2022-11-26 04:56:13 in reply to 31 [link] [source]
update: I see the column affinity requirement is new (check-in: https://sqlite.org/src/info/1ad41840c5e0fa70). There is a good chance that is what's happening in the example with scan constant row.
explain query plan SELECT a FROM (SELECT a FROM t as o UNION select cast('1' as text) as o) WHERE a = 1;
results: 1
and has a query plan of:
|--CO-ROUTINE (subquery-2)
| `--COMPOUND QUERY
| |--LEFT-MOST SUBQUERY
| | `--SEARCH o USING COVERING INDEX i (a=?)
| `--UNION USING TEMP B-TREE
| `--SCAN CONSTANT ROW
`--SCAN (subquery-2)
SELECT a FROM (SELECT a FROM t as o UNION select cast('1' as text) as a) WHERE a = 1;
has identical results. So this seems to work according to current check-in documentation. I missed that there is a new requirement on the column affinity.
The only other detail that is a mystery is:
SELECT a FROM (SELECT a FROM t as o UNION select '1' as o) WHERE a = 1;
This seems to fail the new column affinity check. Apparently '1' isn't considered the same as cast('1' as text). This seems to be true even when the column affinity matches the data type inserted:
drop table if exists t;
create table t (a text);
create index i on t (a);
insert into t select cast (value as text) as a from generate_series(1,100,1);
the query:
SELECT a FROM (SELECT a FROM t as o UNION select '1' as o) WHERE a = 1;
still causes a scan, even with both the column affinity and the inserted quoted literal being text.
(48.2) By Keith Medcalf (kmedcalf) on 2022-11-26 06:14:17 edited from 48.1 in reply to 47 [link] [source]
Apparently '1' isn't considered the same as cast('1' as text).
That would depend on your definition of "the same".
'1' is an expression. It has value '1'. It is text. It has no affinity.
cast('1' as text) is an expression. It is text. Its value is '1'. It has text affinity.
You may determine what it means to have an affinity or not have an affinity and what this implies by reading the fine documentation on the concepts of datatypes and affinities here: https://sqlite.org/datatype3.html
1 (bare, by its lonesome) is an expression. It is an integer. It has no affinity.
cast(1 as integer) is an expression. It is an integer. It has integer affinity.
'1' == 1 is False
cast('1' as text) == 1 is True
'1' == cast(1 as integer) is True
cast('1' as text) == cast(1 as integer) is True
That is to say, with clarity, that if one or the other side of a comparison has an affinity, then the side that has no affinity is converted to the same affinity as the other side. If both sides have affinity, then if one of them is numeric (integer, real, numeric) then it wins and the other side is converted to numeric affinity and a numeric comparison is performed. If neither side has affinity and they are not the same type, they are neither IS nor ==.
Similar rules apply to the collation which is used only when the value is of type text (although any datatype may have a collation). That is:
1 (bare, by its lonesome) is an expression with the integer value 1 and has no affinity (and the default collation).
1 collate nocase is an expression with the integer value 1 and has no affinity; however, it has the collation sequence nocase in the event that it gets text affinity applied.
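The four comparisons listed above can be checked directly. This is a minimal sketch with Python's standard sqlite3 module; the results follow the comparison rules on the datatype3 page, where 0 means False and 1 means True.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

comparisons = [
    "'1' = 1",                                 # neither side has affinity
    "cast('1' as text) = 1",                   # TEXT affinity applied to 1
    "'1' = cast(1 as integer)",                # numeric affinity applied to '1'
    "cast('1' as text) = cast(1 as integer)",  # numeric affinity wins
]

results = {}
for expr in comparisons:
    results[expr] = conn.execute("SELECT " + expr).fetchone()[0]
    print(f"{expr:42} -> {results[expr]}")
```

Only the first comparison yields 0, because a bare text literal and a bare integer literal are compared by storage class with no conversion.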
(49) By anonymous on 2022-11-26 07:25:58 in reply to 48.2 [link] [source]
Sorry, yes, agree with you on the affinity. I could have certainly written it up tighter in my description. I was not suggesting that the items should have different affinity, but that the affinity of the items (or, semantically more accurate, the lack of affinity) was now affecting the query where, prior to 3.41/3.40, it did not (in the same way). By 'considered the same as...' I was referring to the handling by the query planner. The affinity (and lack of it) now affects that query, where previously it handled 1, '1' or cast(1 as whatever) and the union worked. That it now requires cast() is new behavior that will make queries much slower until everyone wraps the constants in cast()s.
It was a surprise that there is now a restriction on optimization in this check-in (https://sqlite.org/src/info/1ad41840c5e0fa70) regarding column affinity. In 3.39.4, the query uses the optimization without the cast(), while 3.40 does not due to rule 8, and trunk 3.41 does not do the push down optimization because the query does not meet the 'new' affinity rule 9 (in the check-in cited).
If I understand you correctly, the 'new' requirements for that push down optimization require the 'new' affinity rule due to the collation rules (or perhaps to attempt not to get involved in this 'edge' case).
Given table t (a text), index on t (a);
SELECT a FROM (SELECT a FROM t as o UNION select cast('1' as text) as a) WHERE a = 1;
works while
SELECT a FROM (SELECT a FROM t as o UNION select '1' as a) WHERE a = 1;
does not.
In 3.39.4 and prior, both worked (well, both used the push down optimization), so I am simply pointing out there are likely going to be some broken queries out there. And the break will be a slowdown like the OP's: no error, it just seems to hang. Or folks can write casts around all unions with constants, matching each column's affinity (if they know where to look for the slowdown). I imagine there are going to be some frustrated folks out there on this.
Seems to me that in the above query, the scan constant row could be cast to column affinity, but I can guess even that would have issues. However, as it is, a union constant without a cast won't ever meet rule 9 (as you point out, it has no affinity). That seems like a less than desirable result of rule 9. We seem to have one foot on Postel's law, and another is more STRICT :)
Anyway, thanks for the explanation and reference (though I've read/referred to that page so many times over the years I think I have it memorised by now).
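One workaround that does not depend on the planner is to push the filter into each arm of the UNION by hand. The sketch below uses Python's standard sqlite3 module with a cut-down, hypothetical version of Project_List (three columns and an index named PL_ProjID, both invented for illustration) and checks that the two forms return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Project_List (ProjID TEXT, Finish_Date TEXT, BL_Finish TEXT);
    CREATE INDEX PL_ProjID ON Project_List (ProjID);
    INSERT INTO Project_List VALUES
        ('PR1', '2023-08-31', '2023-08-31'),
        ('PR1', '2023-09-30', '2023-08-31'),
        ('PR2', '2024-01-15', '2024-02-01');
""")

# Filter applied outside the compound subquery (relies on planner push-down).
outer_filter = """
    SELECT ProjID, var, val FROM (
        SELECT ProjID, 'Finish_Date' AS var, Finish_Date AS val FROM Project_List
        UNION
        SELECT ProjID, 'BL_Finish' AS var, BL_Finish AS val FROM Project_List
    )
    WHERE ProjID = ?
    ORDER BY var, val
"""

# Same filter repeated inside each arm (manual push-down).
pushed_down = """
    SELECT ProjID, 'Finish_Date' AS var, Finish_Date AS val
      FROM Project_List WHERE ProjID = ?
    UNION
    SELECT ProjID, 'BL_Finish' AS var, BL_Finish AS val
      FROM Project_List WHERE ProjID = ?
    ORDER BY var, val
"""

rows_outer = conn.execute(outer_filter, ("PR1",)).fetchall()
rows_pushed = conn.execute(pushed_down, ("PR1", "PR1")).fetchall()
print(rows_outer == rows_pushed)

plan = [r[-1] for r in conn.execute("EXPLAIN QUERY PLAN " + pushed_down, ("PR1", "PR1"))]
print("\n".join(plan))
```

The manual form keeps UNION semantics, so unlike a switch to UNION ALL it cannot change the result set.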
(50) By anonymous on 2022-11-26 08:52:06 in reply to 48.2 [link] [source]
(2nd response) Keith, re: "That is to say, with clarity, that if one or the other side of a comparison has an affinity, then the side that has no affinity is converted to the same affinity as the other side."
That certainly has been the behaviour for a very, very long time (since sqlite epoch 0). And it's why the query (and many like it) worked in previous versions (before 3.40/3.41) and took advantage of the flexible typing of sqlite. Requiring the select on the union to match the column affinity to use the push down optimization makes flexible typing a lot less useful. A lot less. In the OP query, without push down you had to write either a query with potentially different results (union all vs union is not the same set), or do the push down of the where manually. And the slowdown came to them as a surprise.
Rule 9 also raises the question of how to match affinity on a STRICT table column of affinity/datatype "ANY". I may be missing something, but can one write a union cast('hi' as ANY) as a? (typeof() says it's integer). And even if we could figure out how to write that to match the column affinity, is that something we want to force instead of 'hi' as a?
Consider the following (done with 3.41 created from trunk this afternoon):
create table t (a ANY) strict;
create index i on t(a);
explain query plan select a from (select a from t as o union select cast('1' as ANY) as o) where a = 1;
Query plan? You guessed it, the left-most subquery is scan o (not search using index i). In other words, those queries are going to be slow but the sql writer will say 'I cast it to the column type!'.
In any event, having outlined what I see as the issue, it seems to me a good place to put rule 9 in effect is on strict tables (with handling for ANY), and follow Postel (the pre 3.40 behavior) on the dynamic/flexible typed tables. Perhaps take another look at casting the constants (maybe wrap them in casts) on non-strict tables.
I hope I am wrong, but I don't think we quite have the post 3.39.4 push down optimisation rules where they need to be to avoid a lot of pain for folks. I think I've outlined about all I can, so I will leave things at that, as I have confidence in folks like yourself, Dr Hipp and everyone working on sqlite. regards
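The typeof() observation about casting to ANY can be reproduced without a STRICT table. In a CAST, the type name ANY matches none of the TEXT, INTEGER, REAL or BLOB name patterns, so the cast uses NUMERIC affinity. A minimal sketch with Python's standard sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

observed = {}
for expr in ("'hi'", "cast('hi' as ANY)", "cast('1' as ANY)", "cast('1' as text)"):
    # Fetch both the value and the storage class SQLite reports for it.
    value, kind = conn.execute(f"SELECT {expr}, typeof({expr})").fetchone()
    observed[expr] = (value, kind)
    print(f"{expr:20} value={value!r:6} typeof={kind}")
```

So cast('hi' as ANY) does not produce a text value that "matches" an ANY column; it produces the integer 0, which is why the cast is not a usable way to satisfy an affinity-matching rule.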
(51) By anonymous on 2023-02-23 21:37:31 in reply to 50 [link] [source]
The check in 1ad41840 removes rule 9 (allows the where push down optimisation) and restores usage of an index.
With SQLite made from source tarball SQLite-a4aacdd3:
create table t (a ANY) strict;
create index i on t(a);
for the two queries:
explain query plan select a from (select a from t as o union select cast('1' as ANY) as o) where a = 1;
and
explain query plan select a from (select a from t union select 1) where a = 1;
the query plans are identical, and now use the available index:
QUERY PLAN
|--CO-ROUTINE (subquery-2)
| `--COMPOUND QUERY
| |--LEFT-MOST SUBQUERY
| | `--SEARCH o USING COVERING INDEX i (a=?)
| `--UNION USING TEMP B-TREE
| `--SCAN CONSTANT ROW
`--SCAN (subquery-2)
This change should prevent/address unanticipated slowdowns post 3.39.4. A big thanks to Dr Hipp and everyone for making it work AND making it faster.