Short-circuit evaluation must not be omit depending on source
(1) By vorimi on 2024-05-06 14:47:30 [link] [source]
Hi devs,
I found out that the query compiler generates different instructions if the values are comming from a different source (subquery).
repro table:
CREATE TABLE user (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
name VARCHAR(255) DEFAULT NULL COLLATE NOCASE
);
query /wo subquery source:
select
id,
name
from
user
where
(
name like '%john%'
and regexp_like(
name, '%john%'
)
)
query /w subquery source:
select
id,
name
from
user
where
(
select
(
__atk4_reuse_left__ like __atk4_reuse_right__
and regexp_like(
__atk4_reuse_left__, __atk4_reuse_right__
)
)
from
(
select
name __atk4_reuse_left__,
'%john%' __atk4_reuse_right__
) __atk4_reuse_tmp__
)
The 2nd query is effectively the same and having exactly the same structure.
The regexp_like
is custom & slow UDF.
The 1st query is fast (~15 ms @100k rows). It match the data firstly using LIKE and if the match was successful, it further matches the data using UDF. This is what I expect.
The 2nd query is slow (~150 ms @100k rows). The explain outputs below explain the reason. The 2nd query, even if the "where is exactly the same", is missing short-circuit evaluation in the generated instructions.
I would be grateful if this can be fixed and the IfNot
instructions generated for the 2nd query the same way as for the 1st query.
explain 1st query
addr,opcode,p1,p2,p3,p4,p5,comment
0,Init,0,14,0,,0,
1,OpenRead,0,2,0,2,0,
2,Rewind,0,13,0,,0,
3,Column,0,1,3,NULL,0,
4,Function,1,2,1,like(2),0,
5,IfNot,1,12,1,,0,
6,Column,0,1,4,NULL,0,
7,Function,2,4,1,regexp_like(-1),0,
8,IfNot,1,12,1,,0,
9,Rowid,0,6,0,,0,
10,Column,0,1,7,NULL,0,
11,ResultRow,6,2,0,,0,
12,Next,0,3,0,,1,
13,Halt,0,0,0,,0,
14,Transaction,0,0,1,0,1,
15,String8,0,2,0,%john%,0,
16,String8,0,5,0,%john%,0,
17,Goto,0,1,0,,0,
explain 2nd query:
addr,opcode,p1,p2,p3,p4,p5,comment
0,Init,0,29,0,,0,
1,OpenRead,0,2,0,2,0,
2,Rewind,0,28,0,,0,
3,BeginSubrtn,0,2,0,,0,
4,Null,0,3,3,,0,
5,InitCoroutine,4,10,6,,0,
6,Column,0,1,5,NULL,0,
7,String8,0,6,0,%john%,0,
8,Yield,4,0,0,,0,
9,EndCoroutine,4,0,0,,0,
10,Integer,1,7,0,,0,
11,InitCoroutine,4,0,6,,0,
12,Yield,4,22,0,,0,
13,Copy,6,9,0,,2,
14,Copy,5,10,0,,2,
15,Function,0,9,8,like(2),0,
16,Copy,5,9,0,,2,
17,Copy,6,10,0,,2,
18,Function,0,9,11,regexp_like(-1),0,
19,And,11,8,3,,0,
20,DecrJumpZero,7,22,0,,0,
21,Goto,0,12,0,,0,
22,Return,2,4,1,,0,
23,IfNot,3,27,1,,0,
24,Rowid,0,12,0,,0,
25,Column,0,1,13,NULL,0,
26,ResultRow,12,2,0,,0,
27,Next,0,3,0,,1,
28,Halt,0,0,0,,0,
29,Transaction,0,0,1,0,1,
30,Goto,0,1,0,,0,
(2) By Larry Brasfield (larrybr) on 2024-05-06 14:55:40 in reply to 1 [source]
In general, you should not rely upon short-circuit evaluation for the binary logic operators. To get reliable short-circuit evaluation, the CASE expression should be used.