Reuse HAVING clause computation in the existing table

(1) By a_aniq on 2022-01-22 14:25:27 [link] [source]

I want to have the option to append another column of the calculation done by the HAVING clause to the existing table using an optional parameter. Currently the query has to run calculations twice to create the grouped table.

(2) By MBL (UserMBL) on 2022-01-22 16:31:53 in reply to 1 [link] [source]

can you please supply an example? best with schema and output of explain and explain query plan. For me it is still unclear what you see as being wrong.

(3) By a_aniq on 2022-01-24 02:00:53 in reply to 2 [link] [source]

Currently you have to type this query:

SELECT SUM(item_count) AS totalitems FROM itemlist GROUP BY user_id HAVING SUM(item_count) > 50

Here SUM(item_count) is being computed twice

Instead if we could have something like:

SELECT totalitems FROM itemlist GROUP BY user_id HAVING SUM(item_count) > 50 APPEND AS totalitems

Here the SUM is computed only once and appended to the end of the table after filtering is done. Here the computation is simple, but consider the unnecessary additional overhead with the first command when the computation gets complex.

(4.1) By SeverKetor on 2022-01-24 03:06:29 edited from 4.0 in reply to 3 [link] [source]

Am I missing something, or wouldn't

SELECT SUM(item_count) AS totalitems FROM itemlist GROUP BY user_id HAVING totalitems > 50

do exactly what you want already? I don't have a handy test DB around, but it worked just fine with a query using COUNT instead of SUM.

Ah, I should have held off on posting. I did a little more testing, and even if my suggestion didn't work (which it does), it's a non-issue anyway. I made a dummy function that prints its argument before returning it, and SQLite was already only computing the SUM once (even though the function wasn't marked deterministic?).

>>> list(db.execute('SELECT a, SUM(testfunction(b)) AS c FROM test GROUP BY a HAVING SUM(testfunction(b)) > 1'))
2
3
1
5
6
[(1, 5), (3, 11)]

It only recalculated the SUM when I added a second test function for the HAVING clause. I'd also like to know if that reuse of the SUM values from non-deterministic functions was intentional

(6) By Keith Medcalf (kmedcalf) on 2022-01-24 03:04:26 in reply to 4.0 [link] [source]

This is exactly the same except that it requires more typing ...

(7) By Keith Medcalf (kmedcalf) on 2022-01-24 03:12:01 in reply to 4.0 [source]

You could accomplish this as follows, but notice that it is more inefficient than simply expressing what you want directly.

sqlite> select item_sum from (select sum(item_count) as item_sum from itemlist group by user_id) where item_sum > 50;
QUERY PLAN
|--CO-ROUTINE SUBQUERY 1
|  `--SCAN itemlist USING COVERING INDEX idx (~1048576 rows)
`--SCAN SUBQUERY 1 (~88 rows)
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     53    0                    0   Start at 53
1     InitCoroutine  1     40    2                    0   SUBQUERY 1
2     Noop           2     2     0                    0
3     Integer        0     5     0                    0   r[5]=0; clear abort flag
4     Null           0     8     8                    0   r[8..8]=NULL
5     Gosub          7     36    0                    0
6     OpenRead       3     3     0     k(3,,,)        0   root=3 iDb=0; idx
7     ColumnsUsed    3     0     0     3              0
8     Explain        8     0     0     SCAN itemlist USING COVERING INDEX idx (~1048576 rows)  0
9     Noop           0     0     0                    0   Begin WHERE-loop0: itemlist
10    Rewind         3     24    10    0              0
11      Noop           0     0     0                    0   Begin WHERE-core
12      Column         3     0     9                    0   r[9]=itemlist.user_id
13      Compare        8     9     1     k(1,B)         0   r[8] <-> r[9]
14      Jump           15    19    15                   0
15      Move           9     8     1                    0   r[8]=r[9]
16      Gosub          6     29    0                    0   output one row
17      IfPos          5     39    0                    0   if r[5]>0 then r[5]-=0, goto 39; check abort flag
18      Gosub          7     36    0                    0   reset accumulator
19      Column         3     1     10                   0   r[10]=itemlist.item_count
20      AggStep        0     10    2     sum(1)         1   accum=r[2] step(r[10])
21      Integer        1     4     0                    0   r[4]=1; indicate data in accumulator
22      Noop           0     0     0                    0   End WHERE-core
23    Next           3     11    0                    1
24    Noop           0     0     0                    0   End WHERE-loop0: itemlist
25    Gosub          6     29    0                    0   output final row
26    Goto           0     39    0                    0
27    Integer        1     5     0                    0   r[5]=1; set abort flag
28    Return         6     0     0                    0
29    IfPos          4     31    0                    0   if r[4]>0 then r[4]-=0, goto 31; Groupby result generator entry point
30    Return         6     0     0                    0
31    AggFinal       2     1     0     sum(1)         0   accum=r[2] N=1
32    Le             11    30    2     BINARY-8       80  if r[2]<=r[11] goto 30
33    Copy           2     12    0                    0   r[12]=r[2]
34    Yield          1     0     0                    0
35    Return         6     0     0                    0   end groupby result generator
36    Null           0     2     3                    0   r[2..3]=NULL
37    Integer        0     4     0                    0   r[4]=0; indicate accumulator empty
38    Return         7     0     0                    0
39    EndCoroutine   1     0     0                    0
40    Explain        40    0     0     SCAN SUBQUERY 1 (~88 rows)  0
41    Noop           0     0     0                    0   Begin WHERE-loop0: subquery_1
42    InitCoroutine  1     0     2                    0
43      Yield          1     51    0                    0   next row of subquery_1
44      Copy           12    13    0                    0   r[13]=r[12]; subquery_1.item_sum
45      Le             11    50    13    BINARY-8       80  if r[13]<=r[11] goto 50
46      Noop           0     0     0                    0   Begin WHERE-core
47      Copy           12    14    0                    0   r[14]=r[12]; subquery_1.item_sum
48      ResultRow      14    1     0                    0   output=r[14]
49      Noop           0     0     0                    0   End WHERE-core
50    Goto           0     43    0                    0
51    Noop           0     0     0                    0   End WHERE-loop0: subquery_1
52    Halt           0     0     0                    0
53    Transaction    0     0     2     0              1   usesStmtJournal=0
54    Integer        50    11    0                    0   r[11]=50
55    Goto           0     1     0                    0
sqlite>

(5) By Keith Medcalf (kmedcalf) on 2022-01-24 03:03:27 in reply to 3 [link] [source]

This is untrue. Yopu have to type the aggregate function twice, but it is only calculated once:

SQLite version 3.38.0 2022-01-22 22:55:14
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> create table itemlist (user_id, item_count);
sqlite> .eqp full
sqlite> SELECT SUM(item_count) AS totalitems FROM itemlist GROUP BY user_id HAVING SUM(item_count) > 50;
QUERY PLAN
|--SCAN itemlist (~1048576 rows)
`--USE TEMP B-TREE FOR GROUP BY
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     47    0                    0   Start at 47
1     SorterOpen     1     2     0     k(1,B)         0
2     Integer        0     4     0                    0   r[4]=0; clear abort flag
3     Null           0     7     7                    0   r[7..7]=NULL
4     Gosub          6     43    0                    0
5     OpenRead       0     2     0     2              0   root=2 iDb=0; itemlist
6     ColumnsUsed    0     0     0     3              0
7     Explain        7     0     0     SCAN itemlist (~1048576 rows)  0
8     Noop           0     0     0                    0   Begin WHERE-loop0: itemlist
9     Rewind         0     17    0                    0
10      Noop           0     0     0                    0   Begin WHERE-core
11      Column         0     0     9                    0   r[9]=itemlist.user_id
12      Column         0     1     10                   0   r[10]=itemlist.item_count
13      MakeRecord     9     2     11                   0   r[11]=mkrec(r[9..10])
14      SorterInsert   1     11    0                    0   key=r[11]
15      Noop           0     0     0                    0   End WHERE-core
16    Next           0     10    0                    1
17    Noop           0     0     0                    0   End WHERE-loop0: itemlist
18    OpenPseudo     2     11    2                    0   2 columns in r[11]
19    SorterSort     1     46    0                    0   GROUP BY sort
20      SorterData     1     11    2                    0   r[11]=data
21      Column         2     0     8                    0   r[8]=
22      Compare        7     8     1     k(1,B)         0   r[7] <-> r[8]
23      Jump           24    28    24                   0
24      Move           8     7     1                    0   r[7]=r[8]
25      Gosub          5     36    0                    0   output one row
26      IfPos          4     46    0                    0   if r[4]>0 then r[4]-=0, goto 46; check abort flag
27      Gosub          6     43    0                    0   reset accumulator
28      Column         2     1     12                   0   r[12]=itemlist.item_count
29      AggStep        0     12    1     sum(1)         1   accum=r[1] step(r[12])
30      Integer        1     3     0                    0   r[3]=1; indicate data in accumulator
31    SorterNext     1     20    0                    0
32    Gosub          5     36    0                    0   output final row
33    Goto           0     46    0                    0
34    Integer        1     4     0                    0   r[4]=1; set abort flag
35    Return         5     0     0                    0
36    IfPos          3     38    0                    0   if r[3]>0 then r[3]-=0, goto 38; Groupby result generator entry point
37    Return         5     0     0                    0
38    AggFinal       1     1     0     sum(1)         0   accum=r[1] N=1
39    Le             13    37    1                    80  if r[1]<=r[13] goto 37
40    Copy           1     14    0                    0   r[14]=r[1]
41    ResultRow      14    1     0                    0   output=r[14]
42    Return         5     0     0                    0   end groupby result generator
43    Null           0     1     2                    0   r[1..2]=NULL
44    Integer        0     3     0                    0   r[3]=0; indicate accumulator empty
45    Return         6     0     0                    0
46    Halt           0     0     0                    0
47    Transaction    0     0     1     0              1   usesStmtJournal=0
48    Integer        50    13    0                    0   r[13]=50
49    Goto           0     1     0                    0
sqlite>

If you define an appropriate overloaded index then this becomes clearer:

sqlite> create index idx on itemlist(user_id, item_count);
sqlite> SELECT SUM(item_count) AS totalitems FROM itemlist GROUP BY user_id HAVING SUM(item_count) > 50;
QUERY PLAN
`--SCAN itemlist USING COVERING INDEX idx (~1048576 rows)
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     39    0                    0   Start at 39
1     Noop           1     2     0                    0
2     Integer        0     4     0                    0   r[4]=0; clear abort flag
3     Null           0     7     7                    0   r[7..7]=NULL
4     Gosub          6     35    0                    0
5     OpenRead       2     3     0     k(3,,,)        0   root=3 iDb=0; idx
6     ColumnsUsed    2     0     0     3              0
7     Explain        7     0     0     SCAN itemlist USING COVERING INDEX idx (~1048576 rows)  0
8     Noop           0     0     0                    0   Begin WHERE-loop0: itemlist
9     Rewind         2     23    9     0              0
10      Noop           0     0     0                    0   Begin WHERE-core
11      Column         2     0     8                    0   r[8]=itemlist.user_id
12      Compare        7     8     1     k(1,B)         0   r[7] <-> r[8]
13      Jump           14    18    14                   0
14      Move           8     7     1                    0   r[7]=r[8]
15      Gosub          5     28    0                    0   output one row
16      IfPos          4     38    0                    0   if r[4]>0 then r[4]-=0, goto 38; check abort flag
17      Gosub          6     35    0                    0   reset accumulator
18      Column         2     1     9                    0   r[9]=itemlist.item_count
19      AggStep        0     9     1     sum(1)         1   accum=r[1] step(r[9])
20      Integer        1     3     0                    0   r[3]=1; indicate data in accumulator
21      Noop           0     0     0                    0   End WHERE-core
22    Next           2     10    0                    1
23    Noop           0     0     0                    0   End WHERE-loop0: itemlist
24    Gosub          5     28    0                    0   output final row
25    Goto           0     38    0                    0
26    Integer        1     4     0                    0   r[4]=1; set abort flag
27    Return         5     0     0                    0
28    IfPos          3     30    0                    0   if r[3]>0 then r[3]-=0, goto 30; Groupby result generator entry point
29    Return         5     0     0                    0
30    AggFinal       1     1     0     sum(1)         0   accum=r[1] N=1
31    Le             10    29    1                    80  if r[1]<=r[10] goto 29
32    Copy           1     11    0                    0   r[11]=r[1]
33    ResultRow      11    1     0                    0   output=r[11]
34    Return         5     0     0                    0   end groupby result generator
35    Null           0     1     2                    0   r[1..2]=NULL
36    Integer        0     3     0                    0   r[3]=0; indicate accumulator empty
37    Return         6     0     0                    0
38    Halt           0     0     0                    0
39    Transaction    0     0     2     0              1   usesStmtJournal=0
40    Integer        50    10    0                    0   r[10]=50
41    Goto           0     1     0                    0
sqlite>