Bug: unexpected result from an empty table
(1) By Wang Ke (krking) on 2021-08-06 07:43:55 [link] [source]
For this case:
CREATE TABLE t0(c0); SELECT COUNT(t1.c1), t1.c1 FROM (SELECT 1 AS c1 FROM t0) AS t1; -- 0|1
It says in the documentation:
If the SELECT statement is an aggregate query without a GROUP BY clause, then each aggregate expression in the result-set is evaluated once across the entire dataset. Each non-aggregate expression in the result-set is evaluated once for an arbitrarily selected row of the dataset. The same arbitrarily selected row is used for each non-aggregate expression. Or, if the dataset contains zero rows, then each non-aggregate expression is evaluated against a row consisting entirely of NULL values.
Since the table t0 is empty, interim result of subquery
SELECT 1 AS c1 FROM t0 should be empty, and the expected result for the whole query should be
0| instead of
I think the cause of the problem is still related to optimization SQLITE_QueryFlattener, when we turn off this optimization, we can get the desired result.
(2) By Richard Hipp (drh) on 2021-08-06 22:29:21 in reply to 1 [source]
The "t1.c1" value is a "bare column". Among SQL database engines, SQLite is the only one that allows bare columns in aggregate queries. It has always been implied, but never explicitly stated, that the value of a bare column on an aggregate query that has no input rows is arbitrary. The fix for this problem is to update the documentation to actually say that, which I have down here.
Think about it. The value of a bare column will be one of the values that that column takes on for the duration of the aggregate. But if the aggregate has no input rows, what value does it take?
Usually the value of a bare column in an aggregate with no inputs will be NULL. But in this specific example, the query optimizer recognizes that the value of "t1.c1" will always be the constant "1", so it just hard-codes a "1" in that spot.
(3) By Wang Ke (krking) on 2021-08-06 23:00:18 in reply to 2 [link] [source]
OK, thank you for your explanation and update to the documentation.
We have realized that it is a bare column, but it was not clear what the value of the column should be when the result set is empty. We used to think that "arbitrary" refers to "any one in the result set". When the result set is empty, it can only be NULL. Your update to the documentation has well corrected our thoughts and answered our doubts.