Confusion of some subquery results
(1.2) By Yu Liang (LY1598773890) on 2021-04-29 04:35:03 edited from 1.1 [source]
For query:
CREATE TABLE v0 ( v2 CHAR(30), v1 CHAR(30) );
INSERT INTO v0 ( v1 ) VALUES ( 10 ),( 0 );
CREATE INDEX v19 ON v0 ( v2, v1 );
INSERT INTO v0 VALUES ( 10, 10 );
INSERT INTO v0 VALUES ( 0, 10 );
SELECT * FROM v0;
SELECT '----------------------------------------';
SELECT * FROM v0 WHERE ( ( v1 IN ( SELECT v2 FROM v0 ORDER BY v2 ) ));
SELECT '----------------------------------------';
SELECT * FROM v0 WHERE ( v2 = 0 );
SELECT '----------------------------------------';
SELECT * FROM v0 WHERE ( ( v1 IN ( SELECT v2 FROM v0 ORDER BY v2 ) ) AND v2 = 0);
We got the result:
|10
|0
10|10
0|10
------------
|10
|0
10|10
0|10
------------
0|10
------------
(EMPTY RESULT)
Since the second and the third SELECT statements return some results, we expect some results being outputted from the last SELECT statement too. However, there is no outputs from the last SELECT statement.
A few more interesting behaviors that we observe is that:
-If we delete the CREATE INDEX statement, the inconsistency is gone and the last SELECT successfully output 0|10 just like the third SELECT statement.
-If we change the data type of v1, v2 to be INT, the inconsistency is also gone and the last SELECT successfully output 0|10 just like the third SELECT statement.
We are not sure whether this is a bug or this is just an affinity behavior that is known and expected. We are really appreciated for any explanation provided. Thank you.
(2) By Richard Hipp (drh) on 2021-04-29 11:56:28 in reply to 1.2 [link] [source]
This is a bug in the affinity behavior, more succinctly demonstrated by the following test script:
CREATE TABLE t1(a TEXT, b TEXT); INSERT INTO t1 VALUES(null,10),(0,10),(10,10); CREATE INDEX t1ab ON t1(a,b); .print -------- expect one row -------- SELECT * FROM t1 WHERE b in (SELECT a FROM t1) AND a=0;
All of the data is text (because of the TEXT types on all columns). So if you change the final comparison from "a=0" to "a='0'" it works. And that is what most applications would do. So this is not a serious bug. But it is still a bug and will be fixed shortly.
A bisects shows that the problem originated with the new "IN-early-out" optimization on the IN operator that is described by item 3b of the version 3.25.0 change-log.
(3) By Yu Liang (LY1598773890) on 2021-04-29 14:35:42 in reply to 2 [link] [source]
Thank you for the explanation and the confirming.
(4) By Richard Hipp (drh) on 2021-04-29 15:51:39 in reply to 1.2 [link] [source]
Fixed by check-in eb40248ce606b792.