SQLite Forum

order of INSERT affects the output
Login

order of INSERT affects the output

(1) By Wang Ke (krking) on 2021-04-10 18:46:03 [link]

Hello everyone,

Consider the following example:

```SQL
CREATE TABLE t0(c0);
INSERT INTO t0 VALUES(-1);
INSERT INTO t0 VALUES('a');

SELECT COUNT(*) FROM t0 GROUP BY NULL HAVING c0;
```
The output of SQLite 3.35.4 is "2".

But when we change the order of the INSERT statements, just like:

```SQL
CREATE TABLE t0(c0);
INSERT INTO t0 VALUES('a');
INSERT INTO t0 VALUES(-1);

SELECT COUNT(*) FROM t0 GROUP BY NULL HAVING c0;
```

There will be no output.

I wonder whether it is a bug.

Looking forward to your early reply :)

(2.1) By Richard Hipp (drh) on 2021-04-10 20:16:46 edited from 2.0 in reply to 1 [link]

Not a bug.

SQLite allows columns in an aggregate query that are not arguments to
an aggregate function nor arguments to the GROUP BY clause.  This is
not standard SQL, and is sometimes criticized.  But it does have its uses.

Here is a quote from the [relevant documentation][1]:

<blockquote>
The usual case is that all column names in an aggregate query are either arguments to aggregate functions or else appear in the GROUP BY clause. A result column which contains a column name that is not within an aggregate function and that does not appear in the GROUP BY clause (if one exists) is called a "bare" column. Example:
<pre>
    SELECT a, b, sum(c) FROM tab1 GROUP BY a;
</pre>
In the query above, the "a" column is part of the GROUP BY clause and so each row of the output contains one of the distinct values for "a". The "c" column is contained within the sum() aggregate function and so that output column is the sum of all "c" values in rows that have the same value for "a". But what is the result of the bare column "b"? The answer is that the "b" result will be the value for "b" in one of the input rows that form the aggregate. The problem is that you usually do not know which input row is used to compute "b", and so in many cases the value for "b" is undefined.
</blockquote>

In your case, the c0 column in the HAVING clause is a bare column.  It can
take on any value within its group.  Sometimes it takes on the value '-1'
which is considered TRUE, and other times it takes on the value 'a' which is
considered FALSE.

## Why Does SQLite Allow This Kind Of Confusing Behavior

When I was first writing SQLite, I attempted to follow [Postel's law][2].
This means that I wrote SQLite so that it would accept unusual inputs
(such as your example) and do the best it could with them rather than raise an
error.  You could make an argument that this was a bad choice, and that I should
have made SQLite more intolerant of goofy inputs and more likely to raise errors
when it sees something unusual.  However, there are now so many millions of applications
in the wild that use SQLite that I am reluctant to make it more strict for fear
of breaking legacy.  So this potentially confusing situation will containue
to exist.

[1]: https://www.sqlite.org/lang_select.html
[2]: https://en.wikipedia.org/wiki/Robustness_principle

(3) By tom (younique) on 2021-05-06 09:06:50 in reply to 2.1 [link]

> SQLite allows columns in an aggregate query that are not arguments to an aggregate function nor arguments to the GROUP BY clause. This is not standard SQL, and is sometimes criticized. But it does have its uses.

I don't know why there could be any criticism because imho it has absolutely no negative impact. Those who want strict SQL behavior just don't have to use it.

To me, it's one of the best (of not the best at all) variance from strict SQL that one can do. It's a great aid and not a misfeature at all. I love it! 💙

(4) By Gunter Hick (gunter_hick) on 2021-05-06 09:47:31 in reply to 1 [link]

Don't do processing that requires a specific visitation order or output order without making sure that the required order is actually guaranteed. Which usually means adding an ORDER BY clause to uniquely determine said order.

(5) By Harald Hanche-Olsen (hanche) on 2021-05-06 11:06:39 in reply to 3

> I don't know why there could be any criticism because imho it has absolutely no negative impact.

That is not at clear. In fact, Postel's law itself has been criticised because being liberal in what you accept can allow errors to go undetected, with possible data corruption or worse (security holes, in the case of the Internet) as a consequence.

I wish it were called Postel's *guideline* instead of a *law*.

And to be clear, I have nothing against the deviation from the standard discussed in this thread. But I would be surprised to learn it had never happened that someone, somewhere, got bitten by it. Hopefully not too seriously.