Unexpected output from the SELECT
(1) By Yu Liang (LY1598773890) on 2021-10-02 21:27:15 [source]
Hi guys!
We encounter the following query, which gives us an unexpected output.
CREATE TABLE v0 ( c1, c2, c3, c4, c5 );
INSERT INTO v0 VALUES ( 1, NULL, 0, 100, 200 );
INSERT INTO v0 VALUES ( 0, 0, 165, NULL, 300 );
/* v1 is a direct copy of v0 */
CREATE TABLE v1 ( c6 UNIQUE, c7, c8, c9, c10 );
INSERT INTO v1 SELECT * FROM v0 AS a28;
SELECT a67.c7 > a67.c9 FROM v1 AS a67, v0 as a66 WHERE a66.c3 = a67.c10 AND a66.c3 IS a67.c6;
/* The subquery, returns NULL */
SELECT * FROM v0 AS a66 WHERE EXISTS ( SELECT a67.c7 > a67.c9 FROM v1 AS a67 WHERE a66.c3 = a67.c10 AND a66.c3 IS a67.c6 );
/* Returns '1| |0|100|200' */
/* Unexpected. Since the WHERE clause returns FALSE, we expect empty output. */
Interestingly, either removing the constraints UNIQUE from column c6, or removing the condition 'AND a66.c3 IS a67.c6' in the subquery, can fix the unexpected result. (The second SELECT returns NULL as expected then)
Bisecting shows that this unexpected output is introduced by Fossil commit: e038ce8955e785af
Detailed bisecting log:
1 BAD 2021-10-02 18:22:24 5906a0152deded61
2 BAD 2021-01-01 01:44:06 e5d7209e118a8453
3 BAD 2019-12-31 23:17:35 eca7ec9cda4606c4
4 BAD 2018-12-31 21:43:55 b57c545a384ab5d6
5 BAD 2017-12-30 18:32:27 01d4e866fb7b01ae
7 BAD 2016-04-11 14:49:39 60ad80e3af4bae9f
8 BAD 2015-09-09 19:27:10 89bfdbfe943adce8
10 BAD 2015-07-03 14:34:25 2b7567229e3ba7e2
11 BAD 2015-06-12 13:04:51 2e8ad2ead9d146e3
12 BAD 2015-05-26 00:28:08 c415bb7bbfd531b1
13 BAD 2015-05-22 12:37:37 a6eb2a39357c35dc
14 BAD 2015-05-16 18:31:44 ee4b74250ad7a406
16 BAD 2015-05-15 20:14:00 1a4628c66c632d2a
17 BAD 2015-05-15 19:59:23 e038ce8955e785af
18 GOOD 2015-05-15 04:13:15 56ef98a04765c34c CURRENT
15 GOOD 2015-05-14 15:39:18 be438d049dd9d7aa
9 GOOD 2015-05-12 12:24:50 07c7d3925cbcf44c
6 GOOD 2014-12-31 20:35:11 ec264bdee5ab8047
Any insights to this query is welcome. Thank you.
(2.1) By Simon Slavin (slavin) on 2021-10-03 13:43:44 edited from 2.0 in reply to 1 [link] [source]
Deleted(3) By Richard Hipp (drh) on 2021-10-03 10:37:07 in reply to 1 [link] [source]
Simplified SQL:
CREATE TABLE t1(a INT); INSERT INTO t1 VALUES(0),(3); CREATE TABLE t2(b INT UNIQUE, c INT); INSERT INTO t2 VALUES(1,4),(0,5); SELECT * FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE a=c AND a IS b);
The one-row answer returned is incorrect. The correct answer is no rows. This can be seen by disabling the "transitive constraint" query optimization:
CREATE TABLE t1(a INT); INSERT INTO t1 VALUES(0),(3); CREATE TABLE t2(b INT UNIQUE, c INT); INSERT INTO t2 VALUES(1,4),(0,5); .echo on SELECT * FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE a=c AND a IS b); .testctrl optimizations 0x80 SELECT * FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE a=c AND a IS b);
The first of the two identical queries returns one row, but the second (with the 0x80 optimization disabled) returns zero rows.
The equivalent PostgreSQL query is:
CREATE TABLE t1(a INT); INSERT INTO t1 VALUES(0),(3); CREATE TABLE t2(b INT UNIQUE, c INT); INSERT INTO t2 VALUES(1,4),(0,5); SELECT * FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE a=c AND a IS NOT DISTINCT FROM b);
PostgreSQL gives a zero-row result, further confirming that something is wrong.
(4) By Richard Hipp (drh) on 2021-10-04 11:25:22 in reply to 3 [link] [source]
There are two major paths through the query planner in SQLite.
The whereShortCut()
path is a very quick planner that is appropriate
for simple queries (only a single table in the FROM clause), and there
is a slower path that does a much more detailed analysis of the query and
is used for joins and queries against virtual tables and other situations
where a more detailed analysis is appropriate.
The problem described by this thread is essentially the same as the problem
previously described by Wang Ke in
forum thread eb8613976a, except that in this
case the problem was in whereShortCut() rather than in the full-analysis
path.
Both problems should be fixed as of check-in 8b24c177061c3836.
The changes at check-in e038ce8955e785af which the bisect above identified as the problem source actually have nothing to do with the problem at all. That change merely helped to expose the problem.
(5) By Yu Liang (LY1598773890) on 2021-10-04 17:01:59 in reply to 4 [link] [source]
Thank you for the explanation and the reference to the Wang Ke's post.
So this is a bug related to the equivalence transfer optimization. This query is different from Wang Ke's one in the ways that it compares the two tables in its subquery, where the parent query only has one single table in the FROM clause so it uses the whereShortCut() path. The root problem is the same.