Help rewriting a query to match the speed of a 2nd query

(1) By jose isaias cabrera (jicman) on 2021-03-24 13:39:53 [link] [source]

Greetings!

This query is very slow:

SELECT a.ProjID, total(b.AnnualDossier) AS Dossier FROM Project_List AS a 
LEFT JOIN ABT_Budget AS b ON a.ProjID = b.ProjID
AND b.InsertDate = 
(
    SELECT MAX(InsertDate) FROM ABT_Budget WHERE b.ProjID = ProjID
    OR
    b.InsertDate IS Null
)
WHERE a.ProjID IN 
(
           'PR0000014888',
           'PR0000015204',
           'PR0000016639',
           'PR0000016678',
           'PR0000016724'
)
AND a.InsertDate = 
(
    SELECT MAX(InsertDate) FROM Project_List WHERE ProjID = a.ProjID
)
GROUP BY a.ProjID;
PR0000014888|0.0
PR0000015204|0.0
PR0000016639|0.0
PR0000016678|24939.539891
PR0000016724|0.0
Run Time: real 18.511 user 5.859375 sys 12.593750
sqlite>

while this one is super fast:

SELECT a.ProjID, total(b.AnnualDossier) AS Dossier FROM Project_List AS a 
LEFT JOIN ABT_Budget AS b ON a.ProjID = b.ProjID
AND b.InsertDate = 
(
    SELECT MAX(InsertDate) FROM ABT_Budget  
    WHERE b.ProjID IN 
    (
               'PR0000014888',
               'PR0000015204',
               'PR0000016639',
               'PR0000016678',
               'PR0000016724'
    )
    OR
    b.InsertDate IS Null
)
WHERE a.ProjID IN 
(
           'PR0000014888',
           'PR0000015204',
           'PR0000016639',
           'PR0000016678',
           'PR0000016724'
)
AND a.InsertDate = 
(
    SELECT MAX(InsertDate) FROM Project_List WHERE ProjID = a.ProjID
)
GROUP BY a.ProjID;
PR0000014888|0.0
PR0000015204|0.0
PR0000016639|0.0
PR0000016678|0.0
PR0000016724|0.0
Run Time: real 0.032 user 0.015625 sys 0.000000

Programmatically, I can do this easily, but is there a way to rewrite the second without the the duplication of IN groups? Thanks.

josé

(2) By David Raymond (dvdraymond) on 2021-03-24 14:09:01 in reply to 1 [source]

but is there a way to rewrite the second without the the duplication of IN groups?

If you mean just without having to type/bind the values twice, then how about

with ProjIDs as (values ('PR0000014888'),
                        ('PR0000015204'),
                        ('PR0000016639'),
                        ('PR0000016678'),
                        ('PR0000016724'))
select ...
where b.ProjID in ProjIDs
...
where a.ProjID in ProjIDs
...

(3) By jose isaias cabrera (jicman) on 2021-03-24 14:28:46 in reply to 2 [link] [source]

If you mean just without having to type/bind the values twice, then how about

Thanks. The WITH is so elegant.

josé

(4) By Keith Medcalf (kmedcalf) on 2021-03-24 17:33:56 in reply to 3 [link] [source]

Except it is a stupid suggestion. Your two queries produce different results because they are, well, different. Which one is correct?

Note that a SCALAR SUBQUERY only returns the FIRST row retrieved so trying to have it return multiple rows is asinine stupid and will not have the effect intended. (Well, actually it will, the result intended being Garbage In Garbage Out, and that will be the result).

You are probably missing an index.

(5) By Keith Medcalf (kmedcalf) on 2021-03-24 17:37:45 in reply to 4 [link] [source]

You probably want to get rid of the "OR b.InsertDate IS Null" since it makes zero sense being in the correlate.