Reuse HAVING clause computation in the existing table
(1) By a_aniq on 2022-01-22 14:25:27 [link] [source]
I want to have the option to append another column of the calculation done by the HAVING clause to the existing table using an optional parameter. Currently the query has to run calculations twice to create the grouped table.
(2) By MBL (UserMBL) on 2022-01-22 16:31:53 in reply to 1 [link] [source]
can you please supply an example? best with schema and output of explain
and explain query plan
. For me it is still unclear what you see as being wrong.
(3) By a_aniq on 2022-01-24 02:00:53 in reply to 2 [link] [source]
Currently you have to type this query:
SELECT SUM(item_count) AS totalitems FROM itemlist GROUP BY user_id HAVING SUM(item_count) > 50
Here SUM(item_count)
is being computed twice
Instead if we could have something like:
SELECT totalitems FROM itemlist GROUP BY user_id HAVING SUM(item_count) > 50 APPEND AS totalitems
Here the SUM is computed only once and appended to the end of the table after filtering is done. Here the computation is simple, but consider the unnecessary additional overhead with the first command when the computation gets complex.
(4.1) By SeverKetor on 2022-01-24 03:06:29 edited from 4.0 in reply to 3 [link] [source]
Am I missing something, or wouldn't
SELECT SUM(item_count) AS totalitems FROM itemlist GROUP BY user_id HAVING totalitems > 50
do exactly what you want already? I don't have a handy test DB around, but it worked just fine with a query using COUNT instead of SUM.
Ah, I should have held off on posting. I did a little more testing, and even if my suggestion didn't work (which it does), it's a non-issue anyway. I made a dummy function that prints its argument before returning it, and SQLite was already only computing the SUM once (even though the function wasn't marked deterministic?).
>>> list(db.execute('SELECT a, SUM(testfunction(b)) AS c FROM test GROUP BY a HAVING SUM(testfunction(b)) > 1'))
2
3
1
5
6
[(1, 5), (3, 11)]
It only recalculated the SUM when I added a second test function for the HAVING clause. I'd also like to know if that reuse of the SUM values from non-deterministic functions was intentional
(6) By Keith Medcalf (kmedcalf) on 2022-01-24 03:04:26 in reply to 4.0 [link] [source]
This is exactly the same except that it requires more typing ...
(7) By Keith Medcalf (kmedcalf) on 2022-01-24 03:12:01 in reply to 4.0 [source]
You could accomplish this as follows, but notice that it is more inefficient than simply expressing what you want directly.
sqlite> select item_sum from (select sum(item_count) as item_sum from itemlist group by user_id) where item_sum > 50;
QUERY PLAN
|--CO-ROUTINE SUBQUERY 1
| `--SCAN itemlist USING COVERING INDEX idx (~1048576 rows)
`--SCAN SUBQUERY 1 (~88 rows)
addr opcode p1 p2 p3 p4 p5 comment
---- ------------- ---- ---- ---- ------------- -- -------------
0 Init 0 53 0 0 Start at 53
1 InitCoroutine 1 40 2 0 SUBQUERY 1
2 Noop 2 2 0 0
3 Integer 0 5 0 0 r[5]=0; clear abort flag
4 Null 0 8 8 0 r[8..8]=NULL
5 Gosub 7 36 0 0
6 OpenRead 3 3 0 k(3,,,) 0 root=3 iDb=0; idx
7 ColumnsUsed 3 0 0 3 0
8 Explain 8 0 0 SCAN itemlist USING COVERING INDEX idx (~1048576 rows) 0
9 Noop 0 0 0 0 Begin WHERE-loop0: itemlist
10 Rewind 3 24 10 0 0
11 Noop 0 0 0 0 Begin WHERE-core
12 Column 3 0 9 0 r[9]=itemlist.user_id
13 Compare 8 9 1 k(1,B) 0 r[8] <-> r[9]
14 Jump 15 19 15 0
15 Move 9 8 1 0 r[8]=r[9]
16 Gosub 6 29 0 0 output one row
17 IfPos 5 39 0 0 if r[5]>0 then r[5]-=0, goto 39; check abort flag
18 Gosub 7 36 0 0 reset accumulator
19 Column 3 1 10 0 r[10]=itemlist.item_count
20 AggStep 0 10 2 sum(1) 1 accum=r[2] step(r[10])
21 Integer 1 4 0 0 r[4]=1; indicate data in accumulator
22 Noop 0 0 0 0 End WHERE-core
23 Next 3 11 0 1
24 Noop 0 0 0 0 End WHERE-loop0: itemlist
25 Gosub 6 29 0 0 output final row
26 Goto 0 39 0 0
27 Integer 1 5 0 0 r[5]=1; set abort flag
28 Return 6 0 0 0
29 IfPos 4 31 0 0 if r[4]>0 then r[4]-=0, goto 31; Groupby result generator entry point
30 Return 6 0 0 0
31 AggFinal 2 1 0 sum(1) 0 accum=r[2] N=1
32 Le 11 30 2 BINARY-8 80 if r[2]<=r[11] goto 30
33 Copy 2 12 0 0 r[12]=r[2]
34 Yield 1 0 0 0
35 Return 6 0 0 0 end groupby result generator
36 Null 0 2 3 0 r[2..3]=NULL
37 Integer 0 4 0 0 r[4]=0; indicate accumulator empty
38 Return 7 0 0 0
39 EndCoroutine 1 0 0 0
40 Explain 40 0 0 SCAN SUBQUERY 1 (~88 rows) 0
41 Noop 0 0 0 0 Begin WHERE-loop0: subquery_1
42 InitCoroutine 1 0 2 0
43 Yield 1 51 0 0 next row of subquery_1
44 Copy 12 13 0 0 r[13]=r[12]; subquery_1.item_sum
45 Le 11 50 13 BINARY-8 80 if r[13]<=r[11] goto 50
46 Noop 0 0 0 0 Begin WHERE-core
47 Copy 12 14 0 0 r[14]=r[12]; subquery_1.item_sum
48 ResultRow 14 1 0 0 output=r[14]
49 Noop 0 0 0 0 End WHERE-core
50 Goto 0 43 0 0
51 Noop 0 0 0 0 End WHERE-loop0: subquery_1
52 Halt 0 0 0 0
53 Transaction 0 0 2 0 1 usesStmtJournal=0
54 Integer 50 11 0 0 r[11]=50
55 Goto 0 1 0 0
sqlite>
(5) By Keith Medcalf (kmedcalf) on 2022-01-24 03:03:27 in reply to 3 [link] [source]
This is untrue. Yopu have to type the aggregate function twice, but it is only calculated once:
SQLite version 3.38.0 2022-01-22 22:55:14
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> create table itemlist (user_id, item_count);
sqlite> .eqp full
sqlite> SELECT SUM(item_count) AS totalitems FROM itemlist GROUP BY user_id HAVING SUM(item_count) > 50;
QUERY PLAN
|--SCAN itemlist (~1048576 rows)
`--USE TEMP B-TREE FOR GROUP BY
addr opcode p1 p2 p3 p4 p5 comment
---- ------------- ---- ---- ---- ------------- -- -------------
0 Init 0 47 0 0 Start at 47
1 SorterOpen 1 2 0 k(1,B) 0
2 Integer 0 4 0 0 r[4]=0; clear abort flag
3 Null 0 7 7 0 r[7..7]=NULL
4 Gosub 6 43 0 0
5 OpenRead 0 2 0 2 0 root=2 iDb=0; itemlist
6 ColumnsUsed 0 0 0 3 0
7 Explain 7 0 0 SCAN itemlist (~1048576 rows) 0
8 Noop 0 0 0 0 Begin WHERE-loop0: itemlist
9 Rewind 0 17 0 0
10 Noop 0 0 0 0 Begin WHERE-core
11 Column 0 0 9 0 r[9]=itemlist.user_id
12 Column 0 1 10 0 r[10]=itemlist.item_count
13 MakeRecord 9 2 11 0 r[11]=mkrec(r[9..10])
14 SorterInsert 1 11 0 0 key=r[11]
15 Noop 0 0 0 0 End WHERE-core
16 Next 0 10 0 1
17 Noop 0 0 0 0 End WHERE-loop0: itemlist
18 OpenPseudo 2 11 2 0 2 columns in r[11]
19 SorterSort 1 46 0 0 GROUP BY sort
20 SorterData 1 11 2 0 r[11]=data
21 Column 2 0 8 0 r[8]=
22 Compare 7 8 1 k(1,B) 0 r[7] <-> r[8]
23 Jump 24 28 24 0
24 Move 8 7 1 0 r[7]=r[8]
25 Gosub 5 36 0 0 output one row
26 IfPos 4 46 0 0 if r[4]>0 then r[4]-=0, goto 46; check abort flag
27 Gosub 6 43 0 0 reset accumulator
28 Column 2 1 12 0 r[12]=itemlist.item_count
29 AggStep 0 12 1 sum(1) 1 accum=r[1] step(r[12])
30 Integer 1 3 0 0 r[3]=1; indicate data in accumulator
31 SorterNext 1 20 0 0
32 Gosub 5 36 0 0 output final row
33 Goto 0 46 0 0
34 Integer 1 4 0 0 r[4]=1; set abort flag
35 Return 5 0 0 0
36 IfPos 3 38 0 0 if r[3]>0 then r[3]-=0, goto 38; Groupby result generator entry point
37 Return 5 0 0 0
38 AggFinal 1 1 0 sum(1) 0 accum=r[1] N=1
39 Le 13 37 1 80 if r[1]<=r[13] goto 37
40 Copy 1 14 0 0 r[14]=r[1]
41 ResultRow 14 1 0 0 output=r[14]
42 Return 5 0 0 0 end groupby result generator
43 Null 0 1 2 0 r[1..2]=NULL
44 Integer 0 3 0 0 r[3]=0; indicate accumulator empty
45 Return 6 0 0 0
46 Halt 0 0 0 0
47 Transaction 0 0 1 0 1 usesStmtJournal=0
48 Integer 50 13 0 0 r[13]=50
49 Goto 0 1 0 0
sqlite>
If you define an appropriate overloaded index then this becomes clearer:
sqlite> create index idx on itemlist(user_id, item_count);
sqlite> SELECT SUM(item_count) AS totalitems FROM itemlist GROUP BY user_id HAVING SUM(item_count) > 50;
QUERY PLAN
`--SCAN itemlist USING COVERING INDEX idx (~1048576 rows)
addr opcode p1 p2 p3 p4 p5 comment
---- ------------- ---- ---- ---- ------------- -- -------------
0 Init 0 39 0 0 Start at 39
1 Noop 1 2 0 0
2 Integer 0 4 0 0 r[4]=0; clear abort flag
3 Null 0 7 7 0 r[7..7]=NULL
4 Gosub 6 35 0 0
5 OpenRead 2 3 0 k(3,,,) 0 root=3 iDb=0; idx
6 ColumnsUsed 2 0 0 3 0
7 Explain 7 0 0 SCAN itemlist USING COVERING INDEX idx (~1048576 rows) 0
8 Noop 0 0 0 0 Begin WHERE-loop0: itemlist
9 Rewind 2 23 9 0 0
10 Noop 0 0 0 0 Begin WHERE-core
11 Column 2 0 8 0 r[8]=itemlist.user_id
12 Compare 7 8 1 k(1,B) 0 r[7] <-> r[8]
13 Jump 14 18 14 0
14 Move 8 7 1 0 r[7]=r[8]
15 Gosub 5 28 0 0 output one row
16 IfPos 4 38 0 0 if r[4]>0 then r[4]-=0, goto 38; check abort flag
17 Gosub 6 35 0 0 reset accumulator
18 Column 2 1 9 0 r[9]=itemlist.item_count
19 AggStep 0 9 1 sum(1) 1 accum=r[1] step(r[9])
20 Integer 1 3 0 0 r[3]=1; indicate data in accumulator
21 Noop 0 0 0 0 End WHERE-core
22 Next 2 10 0 1
23 Noop 0 0 0 0 End WHERE-loop0: itemlist
24 Gosub 5 28 0 0 output final row
25 Goto 0 38 0 0
26 Integer 1 4 0 0 r[4]=1; set abort flag
27 Return 5 0 0 0
28 IfPos 3 30 0 0 if r[3]>0 then r[3]-=0, goto 30; Groupby result generator entry point
29 Return 5 0 0 0
30 AggFinal 1 1 0 sum(1) 0 accum=r[1] N=1
31 Le 10 29 1 80 if r[1]<=r[10] goto 29
32 Copy 1 11 0 0 r[11]=r[1]
33 ResultRow 11 1 0 0 output=r[11]
34 Return 5 0 0 0 end groupby result generator
35 Null 0 1 2 0 r[1..2]=NULL
36 Integer 0 3 0 0 r[3]=0; indicate accumulator empty
37 Return 6 0 0 0
38 Halt 0 0 0 0
39 Transaction 0 0 2 0 1 usesStmtJournal=0
40 Integer 50 10 0 0 r[10]=50
41 Goto 0 1 0 0
sqlite>