SQLite Forum

A segmentation fault in SQLite latest release build
Login

A segmentation fault in SQLite latest release build

(1) By Wang Ke (krking) on 2021-11-08 08:48:32

Hello developers,

We found a test case causing a segmentation fault in [SQLite latest release build (3.36.0)](https://sqlite.org/releaselog/3_36_0.html).

The test case is as follows:

```
CREATE VIRTUAL TABLE rt0 USING rtree(c0,c1,c2 INT);
INSERT INTO rt0(c2) VALUES(0);
SELECT * FROM (SELECT rt0.c2,rt0.c1 FROM rt0 UNION ALL SELECT rt0.c1,rt0.c1 FROM rt0) LEFT JOIN (SELECT * FROM (SELECT (SELECT max(sum(rt0.c1),rank() OVER (ORDER BY ra0.c2)) FROM rt0 AS ra0) AS ca1 FROM rt0) WHERE ca1=0);
```

which causes:

> sqlite3: sqlite3.c: RtreeNode *rtreeNodeOfFirstSearchPoint(RtreeCursor *, int *): Assertion `pCur->bPoint || pCur->nPoint' failed.

when compiling with option `--enable-debug`, and segmentation fault when the option is disabled.

Here is the gdb debugging information:

```
#0  0x00005648d420ed44 in rtreeNodeOfFirstSearchPoint (
    pCur=pCur@entry=0x5648d5dc6490, pRC=pRC@entry=0x7ffe9ddf6894)
    at sqlite3.c:193706
#1  0x00005648d420efdd in rtreeColumn (cur=0x5648d5dc6490, ctx=0x7ffe9ddf6a60, 
    i=1) at sqlite3.c:193981
#2  0x00005648d41e2fbb in sqlite3VdbeExec (p=0x5648d5dbf550) at sqlite3.c:94169
#3  0x00005648d41e7410 in sqlite3Step (p=0x5648d5dbf550) at sqlite3.c:84861
#4  sqlite3_step (pStmt=0x5648d5dbf550) at sqlite3.c:19382
#5  0x00005648d414797c in exec_prepared_stmt (pArg=0x7ffe9ddf6f10, 
    pStmt=0x5648d5dbf550) at shell.c:14164
#6  0x00005648d414c1c1 in shell_exec (pArg=<optimized out>, 
    zSql=<optimized out>, pzErrMsg=0x7ffe9ddf6d28) at shell.c:14473
#7  0x00005648d414d946 in runOneSqlLine (p=0x7ffe9ddf6f10, 
    zSql=0x5648d5dc8130 "SELECT * FROM (SELECT rt0.c2,rt0.c1 FROM rt0 UNION ALL SELECT rt0.c1,rt0.c1 FROM rt0) LEFT JOIN (SELECT * FROM (SELECT (SELECT max(sum(rt0.c1),rank() OVER (ORDER BY ra0.c2)) FROM rt0 AS ra0) AS ca1 FR"..., in=0x0, 
    startline=3) at shell.c:21449
#8  0x00005648d4158cf4 in process_input (p=0x7ffe9ddf6f10) at shell.c:21549
#9  0x00005648d4133d59 in main (argc=<optimized out>, argv=0x7ffe9ddf8238)
    at shell.c:22350
```

Bisecting shows the problem may be related to [check-in c510377b0b](https://sqlite.org/src/info/c510377b0b).

Hope the problem will be handled properly :)

(2) By Richard Hipp (drh) on 2021-11-08 14:52:02 in reply to 1 [link]

Thanks for the bug report.

The crash is due to a NULL pointer dereference in the byte-code engine caused
by incorrect byte-code.  The incorrect byte-code results for a fault in the
code generator.

Each table or subquery in a complex SELECT statement is assigned a cursor
number.  The name resolution logic for aggregate functions depends
on the fact that cursor numbers for subqueries are always greater than 
cursor numbers in outer queries.  But that assumption was violated by a new
UNION ALL optimization that as [added on 2020-12-19][1].  The query in question
invokes that optimization, causing cursor numbers to be misordered, resulting
in incorrect byte-code, and ultimately the NULL pointer dereference.

[1]: https://sqlite.org/src/info/df1d6482f9e92daf

A simplified query is this:

> ~~~~
SELECT * FROM (
  SELECT 1 FROM rt0 AS q3
  UNION ALL
  SELECT 2 FROM rt0 AS q4
) LEFT JOIN (
  SELECT * 
    FROM (
           SELECT (SELECT sum(q2.c1) + rank() OVER () FROM rt0 AS q1) AS ca1
             FROM rt0 AS q2
         ) AS q5
   WHERE q5.ca1=0
);
~~~~

The [UNION ALL optimization][1] transforms this query into the following:

> ~~~~
SELECT 1, q7.* 
  FROM rt0 AS q3
  LEFT JOIN (
    SELECT * 
      FROM (
             SELECT (SELECT sum(q2.c1) + rank() OVER () FROM rt0 AS q1a) AS ca1
               FROM rt0 AS q2a
           ) AS q5a
     WHERE q5a.ca1=0
  ) AS q7
UNION ALL
SELECT 2, q8.*
  FROM rt0 AS q4
  LEFT JOIN (
    SELECT * 
      FROM (
             SELECT (SELECT sum(q2.c1) + rank() OVER () FROM rt0 AS q1b) AS ca1
               FROM rt0 AS q2b
           ) AS q5b
     WHERE q5b.ca1=0
  ) q8;
~~~~

If you enter the second optimized SQL directly, it works
(because all of the cursor numbers are well ordered).
But when the query optimizer makes that translation,
it ends up with the cursor number for q2b being less than the cursor
number for q5b.

Multiple things need to be fixed here:

  1.   The [UNION ALL optimization][1] needs to be fixed so that it yields
       a parse tree where the cursor
       numbers of all subquerys greater than than the cursor number
       of any outer query.

  2.   New assert() statements need to be added to check every parse tree
       transformation and assure that the cursor numbers are always
       well-ordered.

I'm working on these changes now.

Everything in this post is the result of a preliminary analysis
and is subject to change or correction in follow-up posts.

(3) By Richard Hipp (drh) on 2021-11-08 23:27:00 in reply to 2 [link]

Plans did change.

The end solution (seen at [check-in 74aec5dd1df95b56](src:/info/74aec5dd1df95b56))
was to refactor the logic that figures out which query in a cascade of nested
and correlated queries a particular aggregate function belongs too.  The new
logic does not depend on subquery cursor numbers being greater than outer
query cursor numbers.  That constraint is no longer on the parse tree, and
so no new assert() logic to check is was required.

Should be working now.

(4) By Wang Ke (krking) on 2021-11-09 02:04:55 in reply to 3 [link]

Great job done!

Compared to guaranteeing the order of query cursor numbers and subquery cursor numbers, this is indeed a bit more targeted and have potentially less code modification, although this is already quite a lot of work.

I 'm trying to fuzz with testcases similar to this one and no further problems occurred yet.

Thank you for your explanation and hard work.