SQLite Forum

Unexpected Expression Result by FULL OUTER JOIN
Login

Unexpected Expression Result by FULL OUTER JOIN

(1) By Jinsheng Ba (bajinsheng) on 2022-06-20 07:29:29 [source]

The second SELECT should not return empty as the expression 1 IS TRUE should be evaluated true.

CREATE TABLE t0 (c0, c1 UNIQUE);
CREATE TABLE t1 (c0);
CREATE TABLE t2 (c0, c1);
CREATE TABLE t3 (c1);
INSERT INTO t0(c0, c1) VALUES (2, 1);
INSERT INTO t2 (c1) VALUES ('x');
CREATE INDEX i0 ON t0(c0) WHERE c1 ISNULL;
ANALYZE main;

SELECT * FROM t3 LEFT OUTER JOIN t1 INNER JOIN t0 ON (t1.c0 IN (t3.c1)) FULL OUTER JOIN t2 WHERE 1; -- |||||x
SELECT 1 IS TRUE FROM t3 LEFT OUTER JOIN t1 INNER JOIN t0 ON (t1.c0 IN (t3.c1)) FULL OUTER JOIN t2; -- {} Empty result

Environment:

Version: 3.39.0 (Unreleased)

Commit ID: 11162446

(2) By Chris Locke (chrisjlocke1) on 2022-06-20 08:10:07 in reply to 1 [link] [source]

Does sqlite support full outer join? omitted.html would suggest it doesn't? https://www.sqlite.org/omitted.html

(3) By ddevienne on 2022-06-20 08:12:34 in reply to 2 [link] [source]

The key part is in bold below:

Version: 3.39.0 (Unreleased)

(4) By Ryan Smith (cuz) on 2022-06-20 09:09:16 in reply to 1 [link] [source]

the expression 1 IS TRUE should be evaluated true

Strange - is this an actual rule in the documentation?

I always thought that "IS" was a more strict check than "=" so that TRUE might semantically equal 1, but they are technically not the same type of thing (Boolean vs. Integer) which IS should evaluate as a mismatch, I thought.

Meaning:

  • (NULL = FALSE) --> FALSE
  • (NULL = NULL) --> FALSE
  • (NULL IS NULL) --> TRUE
  • (1 = TRUE) --> TRUE
  • (1 IS TRUE) --> FALSE
  • (1 IS 1) --> TRUE

I can see that in older SQLites "1 IS TRUE" definitely evaluated to TRUE, so I'm not questioning the veracity of the statement, more if the convention is correct and documented somewhere.

(5) By ddevienne on 2022-06-20 09:51:38 in reply to 4 [link] [source]

they are technically not the same type of thing (Boolean vs. Integer)

From https://www.sqlite.org/datatype3.html (bold emphasis mine):

SQLite recognizes the keywords "TRUE" and "FALSE", as of version 3.23.0 (2018-04-02)
but those keywords are really just alternative spellings for the integer literals 1 and 0 respectively.

There's no Boolean type in SQLite.

(6) By Richard Hipp (drh) on 2022-06-20 10:02:54 in reply to 1 [link] [source]

Alternative script:

CREATE TABLE t1(aa INT);
CREATE TABLE t2(bb INT);
CREATE TABLE t3(cc INT, dd INT UNIQUE);
CREATE TABLE t4(ee INT, ff TEXT);
INSERT INTO t3(cc, dd) VALUES (2, 1);
INSERT INTO t4(ff) VALUES ('x');
CREATE INDEX i0 ON t3(cc) WHERE dd ISNULL;
ANALYZE main;
SELECT * FROM t1 LEFT JOIN t2 ON true JOIN t3 ON (bb IN (aa)) FULL JOIN t4 ON true;
SELECT 1 FROM t1 LEFT JOIN t2 ON true JOIN t3 ON (bb IN (aa)) FULL JOIN t4 ON true;

The only difference in the two queries is the "*" versus the "1" for the result set. And yet the first query returns one row (which is correct) and the second query returns zero rows (incorrect).

The problem appears to be that the query optimizer is transforming the second query into:

SELECT 1 FROM t1 JOIN t3 ON (bb IN (aa)) CROSS JOIN t2 FULL JOIN t4 ON true;

But this transformation is incorrect because the "ON (bb IN (aa))" clause references the t2 table which is to its right. Such a transformation used to be ok back when the only kind of outer join was a LEFT JOIN, which is why SQLite has historically allowed it. But now with introduction of FULL JOIN, it no longer works.

(7) By Richard Hipp (drh) on 2022-06-20 18:37:15 in reply to 6 [link] [source]

A better test case:

CREATE TABLE t1(a1 INT);
CREATE TABLE t2(b2 INT);
CREATE TABLE t3(c3 INT, d3 INT UNIQUE);
CREATE TABLE t4(e4 INT, f4 TEXT);
INSERT INTO t3(c3, d3) VALUES (2, 1);
INSERT INTO t4(f4) VALUES ('x');
CREATE INDEX i0 ON t3(c3) WHERE d3 ISNULL;
ANALYZE main;
SELECT * FROM t1 LEFT JOIN t2 ON true JOIN t3 ON (b2 IN (a1)) FULL JOIN t4 ON true;
SELECT 1 FROM t1 LEFT JOIN t2 ON true JOIN t3 ON (b2 IN (a1)) FULL JOIN t4 ON true;

This test case is better because:

  • It also runs on PostgreSQL (modulo the "ANALYZE" statement) so we can easily verify that PostgreSQL and SQLite get the same answer.

  • The tables are numbers in consecutive order in the query. t1, t2, t3, t4.

  • Column names are all short and unique and contain a numeric identifier that easily associates them with their corresponding table.

This is working on trunk as of check-in 238453ffab0ba1bd.

(8) By Jinsheng Ba (bajinsheng) on 2022-06-21 01:20:41 in reply to 7 [link] [source]

Thanks!