Incorrect deletion caused by subquery in where clause
(1) By Zuming Jiang (zuming_jiang) on 2023-03-15 14:42:59 [source]
Set up the database
CREATE TABLE t0 (vkey INTEGER , pkey INTEGER , c1 INTEGER);
INSERT INTO t0 VALUES(2,1,-20);
INSERT INTO t0 VALUES(2,2,NULL);
INSERT INTO t0 VALUES(2,3,0);
INSERT INTO t0 VALUES(8,4,95);
Test Case 1
delete from t0
where
(t0.vkey <= t0.c1) and
(t0.vkey <> (select vkey from t0 order by vkey limit 1 offset 2));
--- delete 1 row: {8|4|95}
It deletes 1 row {8|4|95}
.
Then I negate the predicate (WHERE clause). The test case becomes:
Test Case 2
delete from t0
where not (
(t0.vkey <= t0.c1) and
(t0.vkey <> (select vkey from t0 order by vkey limit 1 offset 2))
);
--- delete 3 row: {2|1|-20}, {2|3|0}, {8|4|95}
Test Case 2 should not delete the rows deleted by Test Case 1.
However, Test Case 2 deletes the row {8|4|95}
, which is also deleted by Test Case 1.
(2) By Keith Medcalf (kmedcalf) on 2023-03-15 16:29:49 in reply to 1 [link] [source]
Fascinating because a select works properly.
CREATE TABLE t0 (vkey INTEGER , pkey INTEGER , c1 INTEGER);
INSERT INTO t0 VALUES(2,1,-20);
INSERT INTO t0 VALUES(2,2,NULL);
INSERT INTO t0 VALUES(2,3,0);
INSERT INTO t0 VALUES(8,4,95);
select *
from t0
where (
(t0.vkey <= t0.c1) and
(t0.vkey <> (select vkey from t0 order by vkey limit 1 offset 2))
);
select *
from t0
where not (
(t0.vkey <= t0.c1) and
(t0.vkey <> (select vkey from t0 order by vkey limit 1 offset 2))
);
sqlite> select *
...> from t0
...> where (
(x1...> (t0.vkey <= t0.c1) and
(x1...> (t0.vkey <> (select vkey from t0 order by vkey limit 1 offset 2))
(x1...> );
QUERY PLAN
|--SCAN t0 (~917504 rows)
`--SCALAR SUBQUERY 1
|--SCAN t0 (~1048576 rows)
`--USE TEMP B-TREE FOR ORDER BY
┌──────┬──────┬────┐
│ vkey │ pkey │ c1 │
├──────┼──────┼────┤
│ 8 │ 4 │ 95 │
└──────┴──────┴────┘
VM-steps: 91
Run Time: real 0.012 user 0.000000 sys 0.000000
sqlite>
sqlite> select *
...> from t0
...> where not (
(x1...> (t0.vkey <= t0.c1) and
(x1...> (t0.vkey <> (select vkey from t0 order by vkey limit 1 offset 2))
(x1...> );
QUERY PLAN
|--SCAN t0 (~983040 rows)
`--SCALAR SUBQUERY 1
|--SCAN t0 (~1048576 rows)
`--USE TEMP B-TREE FOR ORDER BY
┌──────┬──────┬──────┐
│ vkey │ pkey │ c1 │
├──────┼──────┼──────┤
│ 2 │ 1 │ -20 │
│ 2 │ 2 │ NULL │
│ 2 │ 3 │ 0 │
└──────┴──────┴──────┘
VM-steps: 108
Run Time: real 0.016 user 0.000000 sys 0.000000
sqlite>
(3) By Richard Hipp (drh) on 2023-03-15 16:53:57 in reply to 2 [link] [source]
The problem appears to be that the subquery in the WHERE clause is delayed until after one of the rows is deleted (due to the short-curcuit logic in the AND operator) and hence gets the wrong answer. We are working the problem.
(4) By Richard Hipp (drh) on 2023-03-15 18:31:14 in reply to 2 [link] [source]
Should now be patched on trunk.
(5) By Zuming Jiang (zuming_jiang) on 2023-03-15 19:47:51 in reply to 4 [link] [source]
Thanks for fixing it!
(6) By anonymous on 2023-03-16 02:05:07 in reply to 3 [link] [source]
Why isn't my AND operator's short circuit logic working ?
select (unixepoch() & 0) and (0 ->> '$ERROR');
And I got :
Runtime error: JSON path error near 'ERROR'
(7.1) By Larry Brasfield (larrybr) on 2023-03-16 02:17:55 edited from 7.0 in reply to 6 [link] [source]
Please refrain from thread-hijacking. It is widely considered rude, including here.
(8.2) By Keith Medcalf (kmedcalf) on 2023-03-16 04:12:49 edited from 8.1 in reply to 6 [link] [source]
Despite the impolite thread hijacking, the short-circuit evaluation of expressions of the form a AND b or c OR d cannot be short-circuited during projection expression evaluation.
In the expression a AND b the result (True/False/NULL) of the overall expression depends on the result of the evaluation of both "a" and "b".
However, when used in the WHERE clause, the expression a AND b can only result in True or False, where null means False. Therefore only if the LHS is True (in the case of AND, or NULL/False in the case of OR) does the RHS need to be evaluated.
The same short-circuit can be applied in CHECK constraints with the difference that NULL means True for the purposes of short-cicuiting evaluation of the RHS of the condition.
Note that it is not entirely true that short-circuiting could not be done in the projection evaluation. If the LHS of an AND is False, the result will be False, and if the LHS of an OR is True then the result will be True, and there is no need to evaluate the RHS expression. However, no short-circuiting is done in the projection expressions -- if you want that you have to code it explicitly.
(9) By Larry Brasfield (larrybr) on 2023-03-16 04:45:10 in reply to 8.2 [link] [source]
Despite hesitance to contribute to thread hijacking, I add this which may get the new topic to move to where it is on-topic.
The "short-circuiting" possibilities were discussed several months ago, with one result being a reliable way to attain that behavior. If there is more to be said on the topic, that would be a good place to expand (and expound) upon it.