problem report: incorrect join behavior
(1) By Mark Brand (mabrand) on 2021-07-22 10:45:44 [link]
Hi, In version 3.36 and the latest snapshot, it looks like a LEFT JOIN behaves like a plain JOIN in some circumstances. I have the following minimal case, which returns 1 row instead of the expected 2: SELECT *, sqlite_version() FROM ( SELECT NULL val FROM (SELECT 1) UNION ALL SELECT 'missing' FROM (SELECT 1) ) a LEFT JOIN (SELECT 1) ON a.val IS NULL /* returns val 1 sqlite_version() 1 3.37.0 */
(2) By David Raymond (dvdraymond) on 2021-07-22 13:20:56 in reply to 1 [link]
See [this optimization](https://www.sqlite.org/optoverview.html#leftjoinreduction) > A LEFT JOIN can sometimes be converted into an ordinary JOIN if there are terms in the WHERE clause that guarantee that the two joins will give identical results. In particular, if any column in the right-hand table of the LEFT JOIN must be non-NULL in order for the WHERE clause to be true, then the LEFT JOIN is demoted to an ordinary JOIN. In your case the ON condition only references the left table, so there's no point in it being an outer join, and thus it treats it as an inner join.
(3) By Rico Mariani (rmariani) on 2021-07-22 13:36:50 in reply to 2 [link]
That seems to be wrong. The ON condition of a left join should only result in left columns being null, no rows may be removed from a left join. Even if you did `x left join y on 0` you still get all of `x`. ``` select * from (select 1) a left outer join (select 2) b on false; 1| ``` So I think the optimization is too aggressive.
(4) By Rico Mariani (rmariani) on 2021-07-22 13:39:06 in reply to 2 [link]
Sorry that should have been "right columns being NULL" not left.
(5.1) By Richard Hipp (drh) on 2021-07-22 23:10:09 edited from 5.0 in reply to 1 [link]
A [bisect] shows that the error was introduced [here] ([2021-02-26]). : src:/timeline?nc&bid=ya26b6597e3nacd63062eby5f8bf99579n99aab32da1y05dbea9b26y15795a96a8na631c38d22sde9c86c9e4y310dac342enccb8cf5256ne4d1970ef1 : src:/vdiff?from=310dac342e7b1f9b&to=e4d1970ef17b2330&nc : src:/timeline?n=20&c=e4d1970ef17b2330 Here is the cross-platform test case: > ~~~ CREATE TABLE onerow(x INT); INSERT INTO onerow(x) VALUES(0); SELECT * FROM ( SELECT null AS aaa FROM onerow UNION ALL SELECT 'missing' AS aaa FROM onerow ) AS a LEFT JOIN (SELECT 1 AS bbb) AS b ON a.aaa IS NULL; ~~~
(6) By Richard Hipp (drh) on 2021-07-22 14:14:03 in reply to 5.0 [link]
Simplified test case that omits the UNION ALL: > ~~~ CREATE TABLE onerow(x INT); INSERT INTO onerow(x) VALUES(0); SELECT * FROM (SELECT 'missing' AS aaa FROM onerow) AS a LEFT JOIN (SELECT 1 AS bbb) AS b ON a.aaa IS NULL; ~~~ The above should return one row of output, and it does in PostgreSQL, MySQL, SQL Server, and SQLite version 3.34.0 and earlier. But in 3.35.0 through the latest check-in, it returns zero rows.
(7) By Mark Brand (mabrand) on 2021-07-22 14:39:53 in reply to 6 [link]
Interesting; I didn't think to try it without the UNION. I suspected UNION ALL had something to do with triggering the problem, because plain UNION does not show the bad behavior, i.e. this returns both rows: SELECT *, sqlite_version() FROM ( SELECT NULL val FROM (SELECT 1) UNION SELECT 'missing' FROM (SELECT 1) ) LEFT JOIN (SELECT 1) ON val IS NULL;
(8) By Richard Hipp (drh) on 2021-07-22 14:55:48 in reply to 5.0 [link]
Another simplification: > ~~~ CREATE TABLE t1(a INT); INSERT INTO t1(a) VALUES(1); CREATE TABLE t2(b INT); SELECT * FROM (SELECT 3 AS c FROM t1) AS t3 LEFT JOIN t2 ON c IS NULL; ~~~ Should give one row of output, but it fact gives zero rows.
(9) By Richard Hipp (drh) on 2021-07-22 16:23:58 in reply to 1 [link]
This problem originated in our efforts to resolve a performance issue described by an [earlier forum post from Jinho Jung]. I think the problem has now been resolved by [check-in 1f6796044008e6f3], though testing is still on-going. : /forumpost/a5fc33cf9d8f4ff5 : src:/info/1f6796044008e6f3
(10) By Mark Brand (mabrand) on 2021-07-23 10:01:21 in reply to 9
Just want to confirm that this patch also fixes the error in the application query where I first encountered it. By the way, is it clear why the problem did not manifest with a plain UNION, but did show itself with UNION ALL or no union at all?