SQLite Forum

Timeline
Login

27 forum posts by user mzm2021

2021-08-06
14:21 Reply: Fail to calculate long expression (artifact: f868cc54fa user: mzm2021)

Now remove also the division since it does not obey to the same rules across languages...

12:59 Reply: Fail to calculate long expression (artifact: dffdec6da1 user: mzm2021)

Again, different languages ⇒ different operator rules.

SQLite: select 7530/(-396) gives -19

Python: 7530/(-396) gives -20

Python division is actually called floor division: a / b is floor(a div b)

To get the same result as Python's in SQLite: SELECT floor(1.0*7530/(-396))

11:46 Reply: Fail to calculate long expression (artifact: f891211202 user: mzm2021)

Each language defines its own set of casting rules and its operators semantics.

Your use of modulo (%) is what makes the results so different.

11:43 Reply: Fail to calculate long expression (artifact: 35e51155c8 user: mzm2021)

The error message is self-explanatory. So if you have to deal with such deeply nested expressions, you'll have to increase the following constants:

SQLITE_LIMIT_EXPR_DEPTH

YYTRACKMAXSTACKDEPTH

2021-07-15
16:16 Reply: Should SQLite be able to optimize this query? (artifact: 25a5c5b395 user: mzm2021)

Side note here: Not really an answer to your question, but wouldn't using order by 2 even faster than order by count(*)?

select country_long, count(*)
from (
  select * from [global-power-plants] order by rowid
)
where country_long is not null
group by country_long
order by 2 desc
14:22 Edit reply: Feature request: add index info to WITH clause table (artifact: b269164623 user: mzm2021)

Short version

I'm with you on adding INDEXes to CTE but I hate your syntax proposal. :)

Long version

I feel your pain

Actually, I've written a lot of large queries where having such a feature would have been a time saver. It would have also prevented me from rewriting a "clean" CTE-query into a complex write-only equivalent (I was unable to understand it after a few hours even with extensive comments.)

No prior art

There is no other SQL implementation that has such a feature (or at least no implementation that I am aware of.)

So the syntax has to be "invented". My reasoning is that it should also fit with the SQL way of expressing things: I mean it has to avoid mixing a heavy DDL syntax (CREATE INDEX) in a relatively pure query syntax (the CTE.)

The proposal

My suggestion is twofold:

CREATE UNIQUE INDEX IF NOT EXISTS schema-name . index-name ON table-name ( indexed-column ) , WHERE expr
     linerad = 10px
     linewid *= 0.5
     $h = 0.21

     circle radius 10%
A0:  arrow 2*arrowht
CR:  oval "CREATE" fit
UQ:  oval "UNIQUE" fit with .w at (linewid right of CR.e,.8*$h below CR)
IX:  oval "INDEX" fit with .w at (linewid right of UQ.e,CR)
     arrow from CR.e right even with UQ; arrow to IX.w
     arrow from CR.e right linerad then down even with UQ then to UQ.w
     line from UQ.e right linerad then up even with IX then to arrowht left of IX.w
     arrow from IX.e right
     oval "IF" fit
     arrow right 2*arrowht
     oval "NOT" fit
     arrow 2*arrowht
ETS: oval "EXISTS" fit

     # IF NOT EXISTS bypass
Y1:  .5*$h below UQ.s  # vertical position of back-arrow
     arrow from IX.e right linerad then down even with Y1  then left even with UQ
     arrow from ETS.e right then down even with Y1 then left even with ETS.w
     line left even with IX.w

     # second row
     arrow left even with first circle then down $h*1.25 then right 2*arrowht
SN:  oval "schema-name" fit
     arrow 2*arrowht
DOT: oval "." bold fit
     arrow
     oval "index-name" fit
     arrow 2*arrowht
     oval "ON" fit
     arrow 2*arrowht
     oval "table-name" fit
     arrow 2*arrowht

LP:  oval "(" bold fill red fit
     arrow
IXC: box "indexed-column" fill red fit
     arrow
RP:  oval ")"  bold fit    fill red     

     # Loop over indexed-column
CMA: oval "," bold fit at $h*1.25 below IXC fill red
     arrow from IXC.e right linerad then down even with CMA then to CMA.e
     line from CMA.w left even with 2*arrowht left of IXC.w \
       then up even with IXC then right linerad

     # schema-name bypass
     arrow from (first circle,SN.n) down even with $h below SN \
       then right even with SN
     line right even with arrowht right of DOT.e then up even with DOT \
        then right linerad

     # WHERE clause and end circle
WH:  oval "WHERE" fit with .w at IXC.w+(linewid,-$h*3.5) fill red
     arrow right 2*arrowht
WEX: box "expr" fit fill red
     arrow right even with linewid*1.5 right of RP.e
     circle same fill red

     # Linkage from the last ")" to the WHERE clause or or circle
     arrow from RP.e right linerad then down even with CMA
J1:  previous.end
     line down even with last circle then to arrowht left of last circle.w
     arrow from J1 down even with $h below CMA then left even with CMA
     arrow left even with 2*arrowht left of WH.w then down even with WH \
       then to WH.w
  • Extend CTE declaration to allow table-contraints to be used inside the column names declaration (just as what one can already do in CREATE TABLE).

Your example using my suggested syntax

WITH 
LookupTable( TimeStamp, LookupContent, PRIMARY KEY( TimeStamp ) ) as MATERIALIZE (
   select substr(Content,1,24)        as TimeStamp
        , trim(substr(Content,26,40)) as LookupContent 
     from VirtualLookupTextTable
    where instr(Content,'keyword')
)
select FromTimeStamp, ToTimeStamp, M.Content, L.Content
  from VirtualMainTextTable   M
  left outer join LookupTable L
               on M.ToTimeStamp>M.FromTimeStamp and L.TimeStamp > M.FromTimeStamp and L.TimeStamp <= M.ToTimeStamp
                                                 or L.TimeStamp == M.ToTimeStamp

Another example using INDEX inside the columns definition of the CTE

WITH 
LookupTable( a, b, INDEX (a, b), INDEX (b) WHERE a < 10 ) as MATERIALIZE (...)
SELECT ...
14:20 Reply: Feature request: add index info to WITH clause table (artifact: 00fbd81e67 user: mzm2021)

Short version

I'm with you on adding INDEXes to CTE but I hate your syntax proposal. :)

Long version

I feel your pain

Actually, I've written a lot of large queries where having such a feature would have been a time saver. It would have also prevented me from rewriting a "clean" CTE-query into a complex write-only equivalent (I was unable to understand it after a few hours even with extensive comments.)

No prior art

There is no other SQL implementation that has such a feature (or at least no implementation that I am aware of.)

So the syntax has to be "invented". My reasoning is that should also fit with the SQL way of doing things: I mean it has to avoid mixing a heavy DDL syntax (CREATE INDEX) in a relatively pure query syntax (the CTE.)

The proposal

My suggestion is twofold:

CREATE UNIQUE INDEX IF NOT EXISTS schema-name . index-name ON table-name ( indexed-column ) , WHERE expr
     linerad = 10px
     linewid *= 0.5
     $h = 0.21

     circle radius 10%
A0:  arrow 2*arrowht
CR:  oval "CREATE" fit
UQ:  oval "UNIQUE" fit with .w at (linewid right of CR.e,.8*$h below CR)
IX:  oval "INDEX" fit with .w at (linewid right of UQ.e,CR)
     arrow from CR.e right even with UQ; arrow to IX.w
     arrow from CR.e right linerad then down even with UQ then to UQ.w
     line from UQ.e right linerad then up even with IX then to arrowht left of IX.w
     arrow from IX.e right
     oval "IF" fit
     arrow right 2*arrowht
     oval "NOT" fit
     arrow 2*arrowht
ETS: oval "EXISTS" fit

     # IF NOT EXISTS bypass
Y1:  .5*$h below UQ.s  # vertical position of back-arrow
     arrow from IX.e right linerad then down even with Y1  then left even with UQ
     arrow from ETS.e right then down even with Y1 then left even with ETS.w
     line left even with IX.w

     # second row
     arrow left even with first circle then down $h*1.25 then right 2*arrowht
SN:  oval "schema-name" fit
     arrow 2*arrowht
DOT: oval "." bold fit
     arrow
     oval "index-name" fit
     arrow 2*arrowht
     oval "ON" fit
     arrow 2*arrowht
     oval "table-name" fit
     arrow 2*arrowht

LP:  oval "(" bold fill red fit
     arrow
IXC: box "indexed-column" fill red fit
     arrow
RP:  oval ")"  bold fit    fill red     

     # Loop over indexed-column
CMA: oval "," bold fit at $h*1.25 below IXC fill red
     arrow from IXC.e right linerad then down even with CMA then to CMA.e
     line from CMA.w left even with 2*arrowht left of IXC.w \
       then up even with IXC then right linerad

     # schema-name bypass
     arrow from (first circle,SN.n) down even with $h below SN \
       then right even with SN
     line right even with arrowht right of DOT.e then up even with DOT \
        then right linerad

     # WHERE clause and end circle
WH:  oval "WHERE" fit with .w at IXC.w+(linewid,-$h*3.5) fill red
     arrow right 2*arrowht
WEX: box "expr" fit fill red
     arrow right even with linewid*1.5 right of RP.e
     circle same fill red

     # Linkage from the last ")" to the WHERE clause or or circle
     arrow from RP.e right linerad then down even with CMA
J1:  previous.end
     line down even with last circle then to arrowht left of last circle.w
     arrow from J1 down even with $h below CMA then left even with CMA
     arrow left even with 2*arrowht left of WH.w then down even with WH \
       then to WH.w
  • Extend CTE declaration to allow table-contraints to be used inside the column names declaration (just as what one can already do in CREATE TABLE).

Your example using my suggested syntax

WITH 
LookupTable( TimeStamp, LookupContent, PRIMARY KEY( TimeStamp ) ) as MATERIALIZE (
   select substr(Content,1,24)        as TimeStamp
        , trim(substr(Content,26,40)) as LookupContent 
     from VirtualLookupTextTable
    where instr(Content,'keyword')
)
select FromTimeStamp, ToTimeStamp, M.Content, L.Content
  from VirtualMainTextTable   M
  left outer join LookupTable L
               on M.ToTimeStamp>M.FromTimeStamp and L.TimeStamp > M.FromTimeStamp and L.TimeStamp <= M.ToTimeStamp
                                                 or L.TimeStamp == M.ToTimeStamp

Another example using INDEX inside the columns definition of the CTE

WITH 
LookupTable( a, b, INDEX (a, b), INDEX (b) WHERE a < 10 ) as MATERIALIZE (...)
SELECT ...
2021-07-14
16:06 Reply: Help with a query (artifact: 32dba01e8e user: mzm2021)

Below is my previous query updated to group rows by pollDate and to apply the aggregate function sum() on each of these groups:

SELECT pollDate, sum(import)/2.0
FROM Readings
WHERE interval BETWEEN 1 AND 48
GROUP BY pollDate
10:59 Reply: Help with a query (artifact: da9be237a5 user: mzm2021)

I see no column named value in your table.

Which column do you want to use for computing the sum?

So far, and if my guess machine is correctly tuned, if you're after half the sum of column import for all rows which interval is between 1 and 48, the translation into SQL is:

SELECT sum(import)/2.0
FROM Readings
WHERE interval BETWEEN 1 AND 48
10:10 Reply: Help with a query (artifact: e5a1dc5920 user: mzm2021)

Giving more details about the input table will help us help you.

But your current description is very vague:

I am after a query that would get the sum(column value) from a table that records intervals based on subset (interval) of a number between 1 to 48

i.e select aDate, sum(column) as v, sum(v / 2) as c where interval between 15 and 22 order by aDate;

... and that explains also why you're probably not finding any answer: vague human words are not the best way to state your problem and translate it into SQL.

2021-07-13
08:47 Reply: "=" character not mentioned in expr syntax definition (artifact: 81292b508e user: mzm2021)

= is a binary operator and its use is referenced in the 6th line of the syntax diagram.

expr binary-operator expr
box "expr" fit; arrow; oval "binary-operator" fit; arrow; box "expr" fit

All binary operators supported by SQLite are listed in Operators.

2021-07-07
09:26 Reply: How ddo I specify options with MSVC (artifact: fc7f8c7602 user: mzm2021)

As all C compilers, you can pass defines to CL.exe on the command line: /D (Preprocessor Definitions).

cl /D SQLITE_MAX_COLUMN=16384 shell.c sqlite3.c -Fesqlite3.exe
09:18 Edit reply: Repeat values n times (artifact: 8626b8efe8 user: mzm2021)

Assuming the sku/qty table is named input, here is a possible solution:

WITH RECURSIVE counter(value) AS (
   SELECT 1
     UNION ALL
   SELECT value + 1 FROM counter LIMIT (SELECT MAX(qty) FROM input)
)
SELECT * FROM input JOIN counter ON value <= qty ORDER BY sku, qty, value

The counter CTE will generate a sequence of integers value from 1 to the maximum qty. It is then joined with table input on value <= qty.

Using SQLite's specific table-valued function generate_series(), the recursive CTE could be removed as in:

SELECT * FROM input JOIN generate_series(1, (SELECT MAX(qty) FROM input)) ON value <= qty ORDER BY sku, qty, value

By the way, the ORDER BY clause is not required. It is only meant to help you inspect the result and check how each query works.

09:14 Edit reply: Repeat values n times (artifact: 841cc6faf6 user: mzm2021)

Assuming the sku/qty table is named input, here is a possible solution:

WITH RECURSIVE counter(value) AS (
   SELECT 1
     UNION ALL
   SELECT value + 1 FROM counter LIMIT (SELECT MAX(qty) FROM input)
)
SELECT * FROM input JOIN counter ON value <= qty ORDER BY sku, qty

The counter CTE will generate a sequence of integers value from 1 to the maximum qty. It is then joined with table input on value <= qty.

Using SQLite's specific table-valued function generate_series(), the recursive CTE could be removed as in:

SELECT * FROM input JOIN generate_series(1, (SELECT MAX(qty) FROM input)) ON value <= qty ORDER BY sku, qty
09:09 Edit reply: Repeat values n times (artifact: 7a7cea8d21 user: mzm2021)

Assuming the sku/qty table is named input, Here is a possible solution:

WITH RECURSIVE counter(value) AS (
   SELECT 1
     UNION ALL
   SELECT value + 1 FROM counter LIMIT (SELECT MAX(qty) FROM input)
)
SELECT * FROM input JOIN counter ON value <= qty ORDER BY sku, qty

The counter CTE will generate a sequence of integers value from 1 to the maximum qty. It is then joined with table input on value <= qty.

Using SQLite's specific table-valued function generate_series(), the recursive CTE could be removed as in:

SELECT * FROM input JOIN generate_series(1, (SELECT MAX(qty) FROM input)) ON value <= qty ORDER BY sku, qty
09:08 Reply: Repeat values n times (artifact: 34804dc26f user: mzm2021)

Assuming the sku/qty table is named input, Here is a possible solution:

WITH RECURSIVE counter(value) AS (
   SELECT 1
     UNION ALL
   SELECT value + 1 FROM counter LIMIT (SELECT MAX(qty) FROM input)
)
SELECT * FROM input JOIN counter ON value <= qty ORDER BY sku, qty

The counter CTE will generate a sequence of integers value from 1 to the maximum qty. It is then joined with input on value <= qty.

Using SQLite's specific table-valued function generate_series(), the recursive CTE could be removed as in:

SELECT * FROM input JOIN generate_series(1, (SELECT MAX(qty) FROM input)) ON value <= qty ORDER BY sku, qty
2021-07-06
15:45 Reply: Bug report: CTE can report "circular reference" for an actual column count issue (artifact: 5d4c29d2e5 user: mzm2021)

I confirm this bug is no more present in 3.36.

After inspecting the timeline, it seems that it has been corrected in check-in 99812236 without referencing this report.

Anyway, thanks dan for the fix.

14:13 Reply: row value misuse (artifact: 23f1b0f8b4 user: mzm2021)

SQLite does not accept temporary column names in a table/view alias.

That's why it is reporting a syntax error near the left parenthesis of:

… as bar("k", "v")

PostgresSQL docs describe aliases with column names in:

https://www.postgresql.org/docs/13/queries-table-expressions.html#id-1.5.6.6.5.7.3

But the SQLite syntax diagram of the FROM clause does not allow such a syntax:

https://sqlite.org/syntax/table-or-subquery.html

2021-06-10
15:26 Reply: Slow insert with prepare/step (artifact: ae0f199ca2 user: mzm2021)

Enclosing the loop in Test2() between

sqlite3_exec(db, "BEGIN TRANSACTION;", NULL, NULL, NULL);

and

sqlite3_exec(db, "COMMIT;", NULL, NULL, NULL);

should be sufficient.

If the code is going to be used in production, you'll need robust error handling for sqlite3_exec() returns.

14:07 Reply: Slow insert with prepare/step (artifact: 4d517d6e30 user: mzm2021)

Test1() has a single transaction whereas Test2() has 1000 transactions.

You can improve the speed of the Test2() by starting a transaction before starting the loop and committing it after its end.

2021-03-22
12:58 Reply: Bug report: Complex CTE generates segmentation fault depending on the order of joined tables in its body (artifact: bc267ed84c user: mzm2021)

Sure I will do.

But my question was general about how the query planner considers theses hints: The documentation does not state clearly which of estimatedCost or estimatedRows has higher precedence.

I quickly checked the source code of SQLite and it seems that the values of estimatedCost are always compared before those of estimatedRows.

12:28 Reply: Bug report: Complex CTE generates segmentation fault depending on the order of joined tables in its body (artifact: ef1b233ef3 user: mzm2021)

I use another vtable which was directly inspired by wholenumber.c. So it was fixed using the same method.

BTW, I looked at the explanation of estimatedCost in https://www.sqlite.org/vtab.html#outputs where I read:

The SQLite core initializes estimatedCost to a very large value prior to invoking xBestIndex, so if xBestIndex determines that the current combination of parameters is undesirable, it can leave the estimatedCost field unchanged to discourage its use.

So I tested commenting out estimatedCost's assignment in wholenumber.c as in:

if( (idxNum & 12)==0 ){
    /*pIdxInfo->estimatedCost = (double)1e99;*/
  }else if( (idxNum & 3)==0 ){
    pIdxInfo->estimatedCost = (double)5;
  }else{
    pIdxInfo->estimatedCost = (double)1;
  }

And it worked fine too.

I noticed also that wholenumber.c makes no use of estimatedRows.

How effective would it be to provide the query planner with estimatedRows hints?

09:45 Reply: Bug report: Complex CTE generates segmentation fault depending on the order of joined tables in its body (artifact: 1f75ed5db7 user: mzm2021)

Thank you.

I confirm that the cost change to the wholenumber virtual table fixes the reported issue.

By the way, is there a chance that other virtual table implementations would be affected by such a cost problem?

2021-03-19
09:03 Reply: Bug report: Complex CTE generates segmentation fault depending on the order of joined tables in its body (artifact: b52a020ce4 user: mzm2021)

I should probably file another bug in another thread. But since the test case is almost the same, I prefer continuing here.

As explained in my previous reply, I confirm that replacing generate_series by a wholenumber derived virtual table in the above CTE makes the query enter in an infinite loop. As the first bug, the query used to work on 3.34.1 but not anymore on any 3.35.x release including yesterday's test tarball where you have fixed the segmentation fault.

Below is a simple variation on the above CTE with an integers table derived from wholenumber.

The wholenumber extension is compiled using:

gcc -g -fPIC -shared wholenumber.c -o wholenumber.so

Then SQLite shell is invoked with the following input (Note that only the radygrid sub-table defintion has been updated to use integers instead of generate_series):

.load ./wholenumber

CREATE VIRTUAL TABLE integers USING wholenumber;

WITH
	cst(rsx, rsy) AS  (
		SELECT 100, 100
	),

	cst2(minx, maxx, stepx, miny, maxy, stepy, qualitativex, qualitativey) AS (		
		SELECT NULL, NULL, NULL, NULL, NULL, NULL, 0, 0
	),radygrid

	ds0(m, n, x, y, x2, y2, title, size, mark, label, markmode) AS (
		SELECT 1, 2, 3, 4, 5, 6, 7 , 8, 9, 10, 11
	)
	,
	
	ds(m, n, x, y, x2, y2, title, size, mark, label, markmode) AS (
		SELECT m, n, x,
			y, x2,
			y2,
			title, size, mark, label, markmode
		FROM ds0
		WINDOW w AS (PARTITION BY m, x ORDER BY n)
	),

	d(m, n, x, y, x2, y2, labelx, labely, title, size, mark, label, markmode) AS (
		SELECT m, n, x, y,	x2, y2, x, y, title, size, mark, label, markmode
		FROM ds, cst2
	),

	ylabels(y, label) AS (
		SELECT y, MIN(labely) FROM d GROUP BY y
	),

	yaxis(maxy, miny, stepy , minstepy) AS (
		WITH
			xt0(minx, maxx) AS (
				SELECT  coalesce(miny, min(min(y2), min(y))),  coalesce(maxy, max(max(y2), max(y))) + qualitativey  FROM d, cst2
			),
			xt1(mx, mn) AS (SELECT maxx, minx FROM xt0),
			xt2(mx, mn, step) AS (SELECT mx, mn, (mx-mn)  FROM xt1),
			
			xt3(mx, mn, ms) AS (
				SELECT mx, mn, first_value(rs) OVER (order by x desc) AS ms FROM (SELECT mx, mn, step, f,(mx-mn) as rng, 1.0*step/f as rs, 1.0*(mx-mn)/(step/f) AS x FROM xt2, (SELECT 1 AS f UNION ALL SELECT 2 UNION ALL SELECT 4 UNION ALL SELECT 5)) AS src WHERE x < 10 limit 1),
			xt4(minstepy) AS (
				SELECT MIN(abs(y2-y)) FROM d WHERE y2 != y
			)
		SELECT (mx/ms)*ms, (mn/ms)*ms, coalesce(stepy, ms), coalesce(minstepy, ms, stepy)  FROM xt3, cst2,xt4
	),

	distinct_mark_n_m(mark, ze, zem, title) AS (
		SELECT DISTINCT mark, n AS ze, m AS zem, title FROM ds0
	),

	facet0(m, mi, title, radial) AS (
		SELECT md, row_number() OVER () - 1, title, 'radial' IN (SELECT mark FROM distinct_mark_n_m WHERE zem = md)
		FROM (SELECT DISTINCT zem AS md, title AS title FROM distinct_mark_n_m ORDER BY 2, 1)
	),

	facet(m, mi, xorigin, yorigin, title, radial) AS (
		SELECT m, mi,
			rsx * 1.2 * IFNULL(CASE WHEN (
				0
			) > 0 THEN mi / (
				0
			) ELSE mi % (
				2
			)  END, mi),
			rsy  * 1.2 * IFNULL(CASE WHEN (
				2
			) > 0 THEN mi / (
				2
			) ELSE mi / (
				0
			)  END, 0),
			title, radial FROM facet0, cst
	),

	radygrid(m, mi, tty, wty, ttx, ttx2, xorigin, yorigin) AS (
		SELECT m, mi,  rsy / 2 / ((maxy-miny)/stepy) * (value-1) AS tty, coalesce(NULL, miny + stepy * (value-1)) AS wty, xorigin, xorigin+rsx, xorigin + rsx / 2, yorigin + rsy / 2  FROM integers, yaxis, cst, facet LEFT JOIN ylabels ON ylabels.y = (miny + (value-1) * stepy) WHERE radial AND value <= 1+1.0*(maxy-miny)/stepy
	),
	
	ypos(m, mi, pcx, pcy, radial) AS (
		SELECT m, mi, xorigin, yorigin + CASE
			WHEN 0 BETWEEN miny AND maxy THEN
				rsy - (0 - miny) * rsy / (maxy-miny)
			WHEN 0 >= maxy THEN 0
			ELSE  rsy
		END, radial FROM yaxis, cst, facet WHERE NOT radial
		UNION ALL
		SELECT m, mi, xorigin + rsx / 2, yorigin + (CASE
			WHEN 0 BETWEEN miny AND maxy THEN
				rsy - (0 - miny) * rsy / 2 / (maxy-miny)
			WHEN 0 >= maxy THEN 0
			ELSE  rsy
		END ) / 2, radial FROM yaxis, cst, facet WHERE radial
	)

SELECT * FROM radygrid , ypos;
08:20 Reply: Bug report: Complex CTE generates segmentation fault depending on the order of joined tables in its body (artifact: 9d10cd453e user: mzm2021)

Thank you for your reply. I confirm that this particular bug is gone with the new tarball.

I even tested it with the original large CTE request and it worked correctly.

Still, there is a regression in a test case involving the same CTE but with the wholenumber virtual table instead of generate_series.

In that test case, the query enters into an infinite loop as the limits on wholenumber generated values inside some parts of the CTE seem to not be respected. It is as if the WHERE clause is completely ignored in some chunks of the CTE which look like this:

SELECT value FROM integers WHERE value < 10

I am still investigating this particular issue as it looks to be related to shared cursors in materialized views (according to my quick understanding of the new SQLite code) and to query flattening which is not working as before.

I am also checking the source code of both virtual tables to see if there are any differences in cursor handling that may explain this behavior.

I'll keep you updated about my findings.

2021-03-18
10:50 Post: Bug report: Complex CTE generates segmentation fault depending on the order of joined tables in its body (artifact: a41295d743 user: mzm2021)

The complex CTE below generates a segmentation fault depending on the order of the joined tables in its body.

The problem was spotted on 3.35.2 and 3.35.1. It does not affect 3.34.1.

I used the SQLite amalgamation compiled with:

gcc shell.c sqlite3.c -lpthread -ldl -lm -g

On 3.35.2, here is the GDB session log with the backtrace:

(gdb) r < bug-cte-materialize.sql 
Starting program: /home/mm/sqlite-test/sqlite-amalgamation-3350200/a.out < bug-cte-materialize.sql
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x000000000044eeff in sqlite3BtreeCursor (p=0x0, iTable=1, wrFlag=4, 
    pKeyInfo=0x0, pCur=0x767a28) at sqlite3.c:69440
69440	  if( p->sharable ){
(gdb) bt
#0  0x000000000044eeff in sqlite3BtreeCursor (p=0x0, iTable=1, wrFlag=4, 
    pKeyInfo=0x0, pCur=0x767a28) at sqlite3.c:69440
#1  0x000000000046c3d2 in sqlite3VdbeExec (p=0x742778) at sqlite3.c:89843
#2  0x0000000000464b5b in sqlite3Step (p=0x742778) at sqlite3.c:84321
#3  0x0000000000464dad in sqlite3_step (pStmt=0x742778) at sqlite3.c:84378
#4  0x0000000000416ec3 in exec_prepared_stmt (pArg=0x7fffffffcb40, 
    pStmt=0x742778) at shell.c:13374
#5  0x0000000000417b73 in shell_exec (pArg=0x7fffffffcb40, 
    zSql=0x7264c0 "WITH\n\tcst(rsx, rsy) AS  (\n\t\tSELECT 100, 100\n\t),\n\n\tcst2(minx, maxx, stepx, miny, maxy, stepy, qualitativex, qualitativey) AS (\t\t\n\t\tSELECT NULL, NULL, NULL, NULL, NULL, NULL, 0, 0\n\t),\n\n\tds0(m, n, x, y, "..., pzErrMsg=0x7fffffffc9c8) at shell.c:13683
#6  0x000000000042647e in runOneSqlLine (p=0x7fffffffcb40, 
    zSql=0x7264c0 "WITH\n\tcst(rsx, rsy) AS  (\n\t\tSELECT 100, 100\n\t),\n\n\tcst2(minx, maxx, stepx, miny, maxy, stepy, qualitativex, qualitativey) AS (\t\t\n\t\tSELECT NULL, NULL, NULL, NULL, NULL, NULL, 0, 0\n\t),\n\n\tds0(m, n, x, y, "..., in=0x7ffff76ac640 <_IO_2_1_stdin_>, startline=1) at shell.c:20613
#7  0x0000000000426a5f in process_input (p=0x7fffffffcb40) at shell.c:20727
#8  0x0000000000428560 in main (argc=1, argv=0x7fffffffde38) at shell.c:21522

Obviously, the problem is caused by sqlite3BtreeCursor() being called with a NULL pointer as its first parameter.

The CTE query below is already a simplified version of an original (and larger one) that started misbehaving after upgrading to 3.35.

This bug and a previous one (reported a few minutes ago) were discovered when trying to debug issues with the original query.

WITH
	cst(rsx, rsy) AS  (
		SELECT 100, 100
	),

	cst2(minx, maxx, stepx, miny, maxy, stepy, qualitativex, qualitativey) AS (		
		SELECT NULL, NULL, NULL, NULL, NULL, NULL, 0, 0
	),

	ds0(m, n, x, y, x2, y2, title, size, mark, label, markmode) AS (
		SELECT 1, 2, 3, 4, 5, 6, 7 , 8, 9, 10, 11
	)
	,
	
	ds(m, n, x, y, x2, y2, title, size, mark, label, markmode) AS (
		SELECT m, n, x,
			y, x2,
			y2,
			title, size, mark, label, markmode
		FROM ds0
		WINDOW w AS (PARTITION BY m, x ORDER BY n)
	),

	d(m, n, x, y, x2, y2, labelx, labely, title, size, mark, label, markmode) AS (
		SELECT m, n, x, y,	x2, y2, x, y, title, size, mark, label, markmode
		FROM ds, cst2
	),

	ylabels(y, label) AS (
		SELECT y, MIN(labely) FROM d GROUP BY y
	),

	yaxis(maxy, miny, stepy , minstepy) AS (
		WITH
			xt0(minx, maxx) AS (
				SELECT  coalesce(miny, min(min(y2), min(y))),  coalesce(maxy, max(max(y2), max(y))) + qualitativey  FROM d, cst2
			),
			xt1(mx, mn) AS (SELECT maxx, minx FROM xt0),
			xt2(mx, mn, step) AS (SELECT mx, mn, (mx-mn)  FROM xt1),
			
			xt3(mx, mn, ms) AS (
				SELECT mx, mn, first_value(rs) OVER (order by x desc) AS ms FROM (SELECT mx, mn, step, f,(mx-mn) as rng, 1.0*step/f as rs, 1.0*(mx-mn)/(step/f) AS x FROM xt2, (SELECT 1 AS f UNION ALL SELECT 2 UNION ALL SELECT 4 UNION ALL SELECT 5)) AS src WHERE x < 10 limit 1),
			xt4(minstepy) AS (
				SELECT MIN(abs(y2-y)) FROM d WHERE y2 != y
			)
		SELECT (mx/ms)*ms, (mn/ms)*ms, coalesce(stepy, ms), coalesce(minstepy, ms, stepy)  FROM xt3, cst2,xt4
	),

	distinct_mark_n_m(mark, ze, zem, title) AS (
		SELECT DISTINCT mark, n AS ze, m AS zem, title FROM ds0
	),

	facet0(m, mi, title, radial) AS (
		SELECT md, row_number() OVER () - 1, title, 'radial' IN (SELECT mark FROM distinct_mark_n_m WHERE zem = md)
		FROM (SELECT DISTINCT zem AS md, title AS title FROM distinct_mark_n_m ORDER BY 2, 1)
	),

	facet(m, mi, xorigin, yorigin, title, radial) AS (
		SELECT m, mi,
			rsx * 1.2 * IFNULL(CASE WHEN (
				0
			) > 0 THEN mi / (
				0
			) ELSE mi % (
				2
			)  END, mi),
			rsy  * 1.2 * IFNULL(CASE WHEN (
				2
			) > 0 THEN mi / (
				2
			) ELSE mi / (
				0
			)  END, 0),
			title, radial FROM facet0, cst
	),

	radygrid(m, mi, tty, wty, ttx, ttx2, xorigin, yorigin) AS (
		SELECT m, mi,  rsy / 2 / ((maxy-miny)/stepy) * (value-1) AS tty, coalesce(NULL, miny + stepy * (value-1)) AS wty, xorigin, xorigin+rsx, xorigin + rsx / 2, yorigin + rsy / 2  FROM generate_series(1), yaxis, cst, facet LEFT JOIN ylabels ON ylabels.y = (miny + (value-1) * stepy) WHERE radial AND stop = 1+1.0*(maxy-miny)/stepy
	),
	
	ypos(m, mi, pcx, pcy, radial) AS (
		SELECT m, mi, xorigin, yorigin + CASE
			WHEN 0 BETWEEN miny AND maxy THEN
				rsy - (0 - miny) * rsy / (maxy-miny)
			WHEN 0 >= maxy THEN 0
			ELSE  rsy
		END, radial FROM yaxis, cst, facet WHERE NOT radial
		UNION ALL
		SELECT m, mi, xorigin + rsx / 2, yorigin + (CASE
			WHEN 0 BETWEEN miny AND maxy THEN
				rsy - (0 - miny) * rsy / 2 / (maxy-miny)
			WHEN 0 >= maxy THEN 0
			ELSE  rsy
		END ) / 2, radial FROM yaxis, cst, facet WHERE radial
	)

-- Segmentation fault
SELECT * FROM radygrid , ypos

-- Inverting the tables order generates no fault
-- SELECT * FROM  ypos, radygrid

10:35 Post: Bug report: CTE can report "circular reference" for an actual column count issue (artifact: aa5a0431c9 user: mzm2021)

The CTE query below should report error "table ds0 has 10 values for 11 columns". It complains about "circular reference: ds0" instead on 3.35.2 (and possibly all 3.35 releases.)

Check the comments in ds0 and in the body for variations on the query to make SQLite behave correctly.

The problem was found on the latest 3.35.2. It does not affect version 3.34.1.

I used the stock SQLite shell (compiled from the amalgamation ZIP) using:

gcc shell.c sqlite3.c -lpthread -ldl -lm -g

My apologies for the not-so-minimal query. It is already a simplified version of the original CTE that went bad after upgrading to 3.35.x.

It seems that the complexity of the CTE is part of what triggers this bug.

WITH
	cst(rsx, rsy) AS  (
		SELECT 100, 100
	),

	cst2(minx, maxx, stepx, miny, maxy, stepy, qualitativex, qualitativey) AS (		
		SELECT NULL, NULL, NULL, NULL, NULL, NULL, 0, 0
	),

	ds0(m, n, x, y, x2, y2, title, size, mark, label, markmode) AS (
		-- Reports error "circular reference: ds0" instead of complaining about the column number "table ds0 has 10 values for 11 columns"
		SELECT 1, 2, 3, 4, 5, 6, 7 , 8, 9, 10

		-- Works fine with the 11th value
		-- SELECT 1, 2, 3, 4, 5, 6, 7 , 8, 9, 10, 11
	)
	,
	
	ds(m, n, x, y, x2, y2, title, size, mark, label, markmode) AS (
		SELECT m, n, x,
			y, x2,
			y2,
			title, size, mark, label, markmode
		FROM ds0
		WINDOW w AS (PARTITION BY m, x ORDER BY n)
	),

	d(m, n, x, y, x2, y2, labelx, labely, title, size, mark, label, markmode) AS (
		SELECT m, n, x, y,	x2, y2, x, y, title, size, mark, label, markmode
		FROM ds, cst2
	),

	ylabels(y, label) AS (
		SELECT y, MIN(labely) FROM d GROUP BY y
	),

	yaxis(maxy, miny, stepy , minstepy) AS (
		WITH
			xt0(minx, maxx) AS (
				SELECT  coalesce(miny, min(min(y2), min(y))),  coalesce(maxy, max(max(y2), max(y))) + qualitativey  FROM d, cst2
			),
			xt1(mx, mn) AS (SELECT maxx, minx FROM xt0),
			xt2(mx, mn, step) AS (SELECT mx, mn, floor(mx-mn)  FROM xt1),
			
			xt3(mx, mn, ms) AS (
				SELECT mx, mn, first_value(rs) OVER (order by x desc) AS ms FROM (SELECT mx, mn, step, f,(mx-mn) as rng, 1.0*step/f as rs, 1.0*(mx-mn)/(step/f) AS x FROM xt2, (SELECT 1 AS f UNION ALL SELECT 2 UNION ALL SELECT 4 UNION ALL SELECT 5)) AS src WHERE x < 10 limit 1),
			xt4(minstepy) AS (
				SELECT MIN(abs(y2-y)) FROM d WHERE y2 != y
			)
		SELECT ceil(mx/ms)*ms, floor(mn/ms)*ms, coalesce(stepy, ms ), coalesce(minstepy, ms, stepy)  FROM xt3, cst2,xt4
	),

	distinct_mark_n_m(mark, ze, zem, title) AS (
		SELECT DISTINCT mark, n AS ze, m AS zem, title FROM ds0
	),

	facet0(m, mi, title, radial) AS (
		SELECT md, row_number() OVER () - 1, title, 'radial' IN (SELECT mark FROM distinct_mark_n_m WHERE zem = md)
		FROM (SELECT DISTINCT zem AS md, title AS title FROM distinct_mark_n_m ORDER BY 2, 1)
	),

	facet(m, mi, xorigin, yorigin, title, radial) AS (
		SELECT m, mi,
			rsx * 1.2 * IFNULL(CASE WHEN (
				0
			) > 0 THEN mi / (
				0
			) ELSE mi % (
				2
			)  END, mi),
			rsy  * 1.2 * IFNULL(CASE WHEN (
				2
			) > 0 THEN mi / (
				2
			) ELSE mi / (
				0
			)  END, 0),
			title, radial FROM facet0, cst
	),

	radygrid(m, mi, tty, wty, ttx, ttx2, xorigin, yorigin) AS (
		SELECT m, mi,  rsy / 2 / ((maxy-miny)/stepy) * (value-1) AS tty, coalesce(NULL, miny + stepy * (value-1)) AS wty, xorigin, xorigin+rsx, xorigin + rsx / 2, yorigin + rsy / 2  FROM generate_series(1), yaxis, cst, facet LEFT JOIN ylabels ON ylabels.y = (miny + (value-1) * stepy) WHERE radial AND stop = 1+1.0*(maxy-miny)/stepy
	),
	
	ypos(m, mi, pcx, pcy, radial) AS (
		SELECT m, mi, xorigin, yorigin + CASE
			WHEN 0 BETWEEN miny AND maxy THEN
				rsy - (0 - miny) * rsy / (maxy-miny)
			WHEN 0 >= maxy THEN 0
			ELSE  rsy
		END, radial FROM yaxis, cst, facet WHERE NOT radial
		UNION ALL
		SELECT m, mi, xorigin + rsx / 2, yorigin + (CASE
			WHEN 0 BETWEEN miny AND maxy THEN
				rsy - (0 - miny) * rsy / 2 / (maxy-miny)
			WHEN 0 >= maxy THEN 0
			ELSE  rsy
		END ) / 2, radial FROM yaxis, cst, facet WHERE radial
	)

-- With any reference to radygrid or ypos, error "circular reference: ds0" is reported instead of "table ds0 has 10 values for 11 columns"
SELECT * FROM ypos, radygrid

-- Correct error "table ds0 has 10 values for 11 columns" is reported when any table other than ypos or radygrid is used
-- SELECT * FROM facet0

Thanks you.