bug report : adding constant to GROUP BY leads to different output

(1.1) By Wang Ke (krking) on 2021-04-25 10:26:01 edited from 1.0 [link]

Hello everyone,

Consider the following example:

```
-- SQLite version 3.35.5
CREATE TABLE t0 (c0 REAL);
INSERT INTO t0(c0) VALUES (1), (NULL), (1);
CREATE UNIQUE INDEX idx ON t0(NULL DESC);

SELECT * FROM t0 GROUP BY c0;
```
Since the two "1" is equal, the expected output should contain two lines: one is null and the other is 1.0. And yes, the output is exactly what I expect.

But when we add a constant to the GROUP BY clause, just like:

```
SELECT * FROM t0 GROUP BY c0, NULL;
```

The "GROUP BY constant" means to group all the candidate records into a group by the constant provided. It's like adding a column to the candidate records, and values this constant. So, there should still be 2 lines.

But what we get from the output contains 3 lines. It looks like that SQLite takes the two 1.0 as not equal.

If we annotate the CREATE INDEX statement, the output turns to be two lines again.

I wonder whether it's a bug.

Looking forward to your early reply!

(2.1) By luuk on 2021-04-25 09:57:15 edited from 2.0 in reply to 1.0 [link]

> But when we add a constant to the GROUP BY clause, just like:
> 
> SELECT * FROM t0 GROUP BY c0, NULL;

NULL is not a constant, because this does not return TRUE (or 1):

```
SELECT null=null
```
But this (adding a constant):
```
SELECT * FROM t0 GROUP BY c0, '42';
```

gives the correct output.

(3.1) By Wang Ke (krking) on 2021-04-25 10:25:23 edited from 3.0 in reply to 2.1 [link]

Okay, then, why it becomes 2 lines again when we annotate the CREATE INDEX statement?

(4) By luuk on 2021-04-25 10:52:31 in reply to 3.1 [link]

I wonder whether it's a bug..... 😉

(5) By Wang Ke (krking) on 2021-04-25 13:56:56 in reply to 1.1 [link]

Can the developers explain this?

(6) By Keith Medcalf (kmedcalf) on 2021-04-25 17:45:41 in reply to 2.1 [link]

NULL is a constant and it is False.  

For the purposes of GROUP BY and DISTINCT, the grouping comparison is 'IS' not '==' so NULL compares equal to NULL and all NULL form a single group just like any other value.

(7) By Wang Ke (krking) on 2021-04-26 11:57:22 in reply to 6 [link]

I see. But the method of how GROUP BY handles NULL does not affect its handling of other numbers, I still think that the two 1.0s should be grouped into the same group, and that adding NULL to the GROUP BY clause should not affect the result in this example.

So could you please tell me whether it's a bug or not?

(8) By Richard Hipp (drh) on 2021-04-26 12:38:32 in reply to 5

Your test case is kind of silly because:

  *   It involves an index on a constant
  *   It uses a GROUP BY clause on a (non-integer) constant.

However, the same underlying failure can be provoked without such silliness:

> ~~~
CREATE TABLE t1(a INT, b INT);
INSERT INTO t1(a,b) VALUES(1,null),(null,null),(1,null);
CREATE UNIQUE INDEX t1b ON t1(abs(b));
SELECT * FROM t1 GROUP BY a, abs(b);
~~~

I'm testing the fix now.

(9) By Wang Ke (krking) on 2021-04-26 13:20:14 in reply to 8 [link]

I am testing this DBMS with randomly generated test cases. It looks silly because it's just one of the billions of randomly generated test cases against my test oracle. No matter how silly they look like, they obey the syntax provided by the documentation and may cause incorrect results.

(10) By Larry Brasfield (larrybr) on 2021-04-26 13:29:31 in reply to 9 [link]

I don't think Richard is disparaging your testing or post. (The fact that he took the time to find a non-silly inducement of the same problem shows that he took it seriously, and for the reason you suggest.)

Sometimes, a response in this forum is for the benefit of participants other than the one to whom the response is nominally addressed.

(11.1) By Wang Ke (krking) on 2021-04-26 13:50:44 edited from 11.0 in reply to 10 [link]

Sorry, but I didn't feel offended by his response. I am just talking about how I got this case. Sometimes, I think, "silly" cases may be more likely to find bugs because humans don't usually write like this.

I totally agree with you and that's exactly why I chose to report the potential bug and discuss my testing methods with you all in this forum.

Looking forward to Richard's patch to this bug :)

(12) By Ryan Smith (cuz) on 2021-04-26 13:54:20 in reply to 11.0 [link]

You are right of course, that's why fuzzers are so good at detecting problems, but we do still need a proper test-case for the bug report.

Explaining why it is silly, or indeed that silly is helpful, is already understood, but does not make it any less silly.  It is silly because it is unlikely to have real-world parallels, which is why Richard made a test case that may actually appear in a real-world DB, which then becomes less silly but still demonstrates the problem to inform the bug report and possible future test-case.

Just trying to explain it better because I do feel there might be some language barrier - like when you replied to Keith in argument, when he actually responded to Luuk informing him that his assumption is not correct, so in essence agreeing with you.

I'm interested though, if this tool you use is not a standard fuzzer, but I assume produces random-built SQL sequences that should be workable normally.  I assume you have a pre-built DB where you control what types of SQL sequences would make sense and allow them to be concatenated into test queries. If it is more elegant than that, I'm interested to know, if you are interested to share.

(13) By Richard Hipp (drh) on 2021-04-26 14:00:58 in reply to 11.1 [link]

> "silly" cases may be more likely to find bugs because humans don't usually write like this

Precisely.  And silly bugs should still be fixed, but they have a lower
priority because they do not impact real-world applications.

My point is that even though your original test case is "silly" (in the sense
that it would never come up in a real-world application) the same malfunction
can be provoked using a different test case that *might* appear in a real-world
application.  This increases the severity and priority of the bug.

I hasten to point out, though, that even though this is a bug that *might*
appear in a real-world application, it has apparently been in the code for
many years and nobody has ever run across it, despite SQLite being used in
literally millions of applications.  So, even though it needs to be fixed,
it is not an emergency.  Even though it would be reasonable for an application
to contain SQL queries like this, apparently no application does so.

Status:  I have a patch that fixes your original problem.  But I am continuing
to investigate the logic of the optimization that attempts to avoid unnecessary
sorts (which is the source of this problem) to see if there are yet other
subtle design errors.  I'll say more about this, because the time I spend
writing replies on the Forum is time that is *not* spent fixing and improving
the actual code.  Over and out.

(14) By Richard Hipp (drh) on 2021-04-26 14:05:41 in reply to 12 [link]

Presumably, Wang Ke's fuzzer builds upon the ground-breaking ideas in
semantic fuzzing done by [Manuel Rigger][1].

[1]: https://www.manuelrigger.at/

(15) By Richard Hipp (drh) on 2021-04-26 14:34:22 in reply to 1.1 [link]

Please try again with SQLite [check-in 7178dc3a32c3a4a3][1] or later.

[1]: src:/info/7178dc3a32c3a4a3

(16) By Wang Ke (krking) on 2021-04-26 15:49:39 in reply to 14 [link]

Yeah, we are using their fuzzer to produce test cases, and using our new test oracle to check the output, which will be released later.

(17) By Wang Ke (krking) on 2021-04-26 15:53:07 in reply to 15 [link]

This bug is indeed fixed, thanks!