problem report: incorrect left join behavior
(1) By anonymous on 2022-05-01 22:34:56 [link] [source]
SQLITE version 3.23.1 Left Join did not include all rows from the left table. In other words, left join behaved like Inner Join! Ideally, the row of the left table should have shown Null for matches not found. I am stuck to use DbVisualizer 10.0.27 to create a database, as such cannot upgrade to a later version of sqlite. Would appreciate it if any guidance is provided on what can be done so that all rows from the left table are listed. Thanks, PK
(2) By Richard Hipp (drh) on 2022-05-01 22:44:02 in reply to 1 [link] [source]
Do you have a test case that demonstrates the problem?
(3) By Larry Brasfield (larrybr) on 2022-05-01 23:00:50 in reply to 1 [link] [source]
I built the v3.23.1 CLI and submitted this input:
create table Lefty(name text, occupation text);
create table Righty(name text, age int);
insert into Lefty values ('Ike', 'general'), ('Santa', 'deliverer');
insert into Righty values ('Ike', 131);
select l.name, l.occupation, r.age from
Lefty l left join Righty r on l.name=r.name;
This is the result:
Ike|general|131
Santa|deliverer|
As far as I can see, this is correct. Perhaps you could share a minimal example of some table definitions, some data for them, and a query which demonstrates the "incorrect left join behavior" you report.
(4) By anonymous on 2022-05-02 00:34:14 in reply to 3 [link] [source]
Hi Larry Brasfield, thanks for your response. When I used the left join on a test data, I got the correct result. For some reason, result of real data shows incorrect records count. Wonder what could be an issue. I will dig into data and see whether can find any bug. Once again, thanks for your help. Appreciate it. Table: LEFTY store area S1 A1 S2 A1 S3 A1 S4 A2 S5 A1 Table: RIGHTY area sku price A1 P1 10 A1 P3 11 A2 P5 13 A2 P7 15 SQL: select l.store, l.area, r.sku, r.area, r.price from Lefty l left join Righty r on l.area=r.out_area; SQL RESULT store area sku area price S1 A1 P1 A1 10 S1 A1 P3 A1 11 S2 A1 P1 A1 10 S2 A1 P3 A1 11 S3 A1 P1 A1 10 S3 A1 P3 A1 11 S4 A2 P5 A2 13 S4 A2 P7 A2 15 S5 A1 P1 A1 10 S5 A1 P3 A1 11 Best, PK
(5.1) By Aask (AAsk1902) on 2022-05-02 08:16:56 edited from 5.0 in reply to 4 [link] [source]
Wonder what could be an issue.
In the column(s) that you are joining with, do the column(s) in the left table
- Have null values?
- Non unique values?
- Is your left table defined such that it is case insensitive?
(7) By Gunter Hick (gunter_hick) on 2022-05-02 08:33:13 in reply to 5.1 [source]
None of the above is relevant.
(8) By Aask (AAsk1902) on 2022-05-02 09:17:50 in reply to 7 [link] [source]
None of the above is relevant.
If the case of a column being joined on is immaterial, why aren't these two results identical?
sqlite> with
...> LEFTY (store,area) AS (Values('S1','A1'),('S2','A1'),('S3','A1'),('S4','A2'),('S5','A1')),
...> RIGHTY(area, sku, price) AS (Values('a1','P1',10),('A1','P3',11),('A2','P5',13),('A2','P7', 15))
...> select
...> l.store, l.area, r.sku, r.area, r.price
...> from Lefty l left join Righty r on l.area=r.area;
store area sku area price
----- ---- --- ---- -----
S1 A1 P3 A1 11
S2 A1 P3 A1 11
S3 A1 P3 A1 11
S4 A2 P5 A2 13
S4 A2 P7 A2 15
S5 A1 P3 A1 11
sqlite> with
...> LEFTY (store,area) AS (Values('S1','A1'),('S2','A1'),('S3','A1'),('S4','A2'),('S5','A1')),
...> RIGHTY(area, sku, price) AS (Values('A1','P1',10),('A1','P3',11),('A2','P5',13),('A2','P7', 15))
...> select
...> l.store, l.area, r.sku, r.area, r.price
...> from Lefty l left join Righty r on l.area=r.area;
store area sku area price
----- ---- --- ---- -----
S1 A1 P1 A1 10
S1 A1 P3 A1 11
S2 A1 P1 A1 10
S2 A1 P3 A1 11
S3 A1 P1 A1 10
S3 A1 P3 A1 11
S4 A2 P5 A2 13
S4 A2 P7 A2 15
S5 A1 P1 A1 10
S5 A1 P3 A1 11
sqlite>
The difference in the scripts is the first row value of the first column on the second table.
(9) By Gunter Hick (gunter_hick) on 2022-05-02 09:34:55 in reply to 8 [link] [source]
Neither of the queries has anything to do with LEFT JOIN. Try omitting LEFT, there will be no changes. They have everything to do with improper vetting of data. Proper use of foreign keys could ensure that neither table contains an invalid area entry. There is no store in 'a1', so no RHS prices for area 'a1' are displayed.
(10) By Aask (AAsk1902) on 2022-05-02 09:50:30 in reply to 9 [link] [source]
Neither of the queries has anything to do with LEFT JOIN. Try omitting LEFT, there will be no changes.
True, it has to do with
l.area=r.area;
where equality fails because of case.
(11) By MBL (UserMBL) on 2022-05-02 12:42:01 in reply to 8 [link] [source]
Try the following with and without the COLLATE NOCASE to see the difference of lower/upper case influence:
sqlite> .mode qbox
sqlite> with
...> LEFTY (store,area) AS (Values('S1','A1'),('S2','A1'),('S3','A1'),('S4','A2'),('S5','A1')),
...> RIGHTY(area, sku, price) AS (Values('a1','P1',10),('A1','P3',11),('A2','P5',13),('A2','P7', 15))
...> select
...> l.store, l.area, r.sku, r.area, r.price
...> from Lefty l
...> join Righty r on l.area=r.area COLLATE NOCASE;
┌───────┬──────┬──────┬──────┬───────┐
│ store │ area │ sku │ area │ price │
├───────┼──────┼──────┼──────┼───────┤
│ 'S1' │ 'A1' │ 'P1' │ 'a1' │ 10 │
│ 'S1' │ 'A1' │ 'P3' │ 'A1' │ 11 │
│ 'S2' │ 'A1' │ 'P1' │ 'a1' │ 10 │
│ 'S2' │ 'A1' │ 'P3' │ 'A1' │ 11 │
│ 'S3' │ 'A1' │ 'P1' │ 'a1' │ 10 │
│ 'S3' │ 'A1' │ 'P3' │ 'A1' │ 11 │
│ 'S4' │ 'A2' │ 'P5' │ 'A2' │ 13 │
│ 'S4' │ 'A2' │ 'P7' │ 'A2' │ 15 │
│ 'S5' │ 'A1' │ 'P1' │ 'a1' │ 10 │
│ 'S5' │ 'A1' │ 'P3' │ 'A1' │ 11 │
└───────┴──────┴──────┴──────┴───────┘
(12) By Aask (AAsk1902) on 2022-05-02 12:48:02 in reply to 11 [link] [source]
That is why I asked 3. Is your left table defined such that it is case insensitive? only to be met with None of the above is relevant.
(13) By Gunter Hick (gunter_hick) on 2022-05-02 13:05:52 in reply to 12 [link] [source]
The OP complained that row(s) from the LEFT HAND TABLE of a LEFT JOIN were being omitted. Upon request, he provided an example where each row of the LH table had at least one matching row on the RH table. Thus no extra rows were generated. No LH rows were omitted. Extraneous rows in the RH table (i.e. rows not matched by the JOIN condition) are irrelevant to determining if all LH rows are being used or not. The extra RH rows would be shown in an (unsupported) RIGHT JOIN; which can be emulated by swapping the table order.
(15) By Ryan Smith (cuz) on 2022-05-02 13:39:43 in reply to 12 [link] [source]
Not sure if you are being obtuse on purpose or honestly not following the clear explanations.
The OP's problem had to do with Left-Hand table rows in a LEFT-Join not showing up, which, by definition MUST all show up (which is what it means to be a LEFT-join), otherwise it is just a normal (inner) join.
You then proposed several reasons why the join condition may fail - which would have been very helpful if the OP's problem was that the join-condition is failing, but that was not the problem - LEFT-hand table rows must all show up, regardless of the JOIN-condition, so helping with the join-condition, however helpful that may be, is not relevant to the question - which was pointed out.
It's a bit like someone asking about hooking a trailer, but their car doesn't seem to have a tow-bar. You can then say many things pointing out the best towing principles and tow-bar coupling techniques, all probably helpful, but irrelevant to the problem at hand.
Also let me me be clear - Please don't take anything personal - Gunter (or anyone else) pointing out something you said is not relevant or otherwise wrong or of no consequence, is not to pick a fight or to one-up you, it's so that the person asking the question is not confused by irrelevant stuff or led astray leaving them with hours of wild-goose-chasing. This is valuable for people asking questions trying to get to the bottom of a real niggle, and we've all been "corrected" (except maybe Igor Tandetnik, he's never been wrong). :)
(6) By Gunter Hick (gunter_hick) on 2022-05-02 08:31:19 in reply to 4 [link] [source]
From https://sqlite.org/lang_select.html
"If the join-operator is a "LEFT JOIN" or "LEFT OUTER JOIN", then after the ON or USING filtering clauses have been applied, an extra row is added to the output for each row in the original left-hand input dataset that corresponds to no rows at all in the composite dataset"
In your example, each and every LHS (S) row has two RHS (A) rows that satisfy the join condition (which happens to have a typo - there is no column out_area - from posting redacted output instead of true output).
If you had a row (S6,A3) then you would obtain a result row (S6,A3,NULL,NULL,NULL) as long as there is no A3 row the RHS table.
(14) By anonymous on 2022-05-02 13:31:48 in reply to 6 [link] [source]
Thanks all for your responses. Please mark this thread 'complete'. The strange result was caused by trailing spaces. Much appreciated all the help. Best, PK