SQLite Forum

50 most recent forum posts by user cuz

2021-12-02
17:42 Reply: Linking tables (artifact: 60881ff48b user: cuz)

Now that I have the insert values and have looked closer, it's clear you have circular references.

Not sure how the class on foreign keys went in your establishment, but it's a one-way street. If table X has a foreign-key reference to table Y, then table Y cannot ALSO have a foreign-key reference to table X, else a race condition of sorts is created. In other words, if I want to insert something into table X, a corresponding key value must already exist in table Y, and I cannot insert it into table Y first because its foreign-key constraint requires the value to already exist in table X, which is circular.

Now you can work around this in a couple of ways, like disabling foreign-key checking before inserting, or inserting NULL values for the columns first and later updating them once values exist in the other table. Either of these workarounds, however, fully negates the entire point of foreign keys and will make your code horribly botched.

Pick the table which is the boss of the values, then make a reference to it from the table which references/is interested in/consumes the values, and not the other way around. If I remove all the FK statements from your data tables while leaving all of them in the consuming main table, then it all works fine.

See here:

  -- SQLite version 3.35.4  [ Release: 2021-04-02 ]  on  SQLitespeed version 2.1.3.11.
  -- ================================================================================================

PRAGMA foreign_keys = 1;

CREATE TABLE IF NOT EXISTS einfAnforderung(
	EinfAnfId INTEGER PRIMARY KEY,
	Leistungen TEXT,
	Unterstuetzung TEXT,
	Durchfuehrung TEXT,
	Doku TEXT,
	LastId INTEGER
);

CREATE TABLE IF NOT EXISTS sysAnforderung(
	sysAnfordId INTEGER PRIMARY KEY,
	ZielgruppEx TEXT,
	ZielgruppIn TEXT,
	Hardware TEXT,
	Funktionen TEXT,
	Daten TEXT,
	Ausgaben TEXT,
	Schnittstellen TEXT,
	Nutzeranford TEXT,
	Systemanford TEXT,
	Weitereanford TEXT,
	LastId INTEGER
);

CREATE TABLE IF NOT EXISTS istZustand(
	IstId INTEGER PRIMARY KEY,
	Unetz TEXT,
	Uhard TEXT,
	Ubs TEXT,
	Uanwend TEXT,
	Uschnitt TEXT,
	Uunterschied TEXT,
	Ugp TEXT,
	Uprozessd TEXT,
	LastId INTEGER
);

CREATE TABLE IF NOT EXISTS ziel(
	ZielId INTEGER PRIMARY KEY,
	Anlass TEXT,
	Beschreibung TEXT,
	Ziele TEXT,
	Start DATE,
	Ende DATE,
	Sonstiges TEXT,
	LastId INTEGER
);

CREATE TABLE IF NOT EXISTS unternehmen (
	UId INTEGER AUTO_INCREMENT PRIMARY KEY,
	Uname TEXT,
	Uadresse TEXT,
	TochterU TEXT,
	Utelefonnr TEXT,
	Ufax TEXT,
	Uweb TEXT,
	Ugroesse TEXT,
	Uprognose TEXT,
	Ugruendung TEXT,
	Ubranche TEXT,
	UprodBeschr TEXT,
	Uanlage BLOB,
	Usonstiges TEXT,
	LastId INTEGER
);

CREATE TABLE IF NOT EXISTS lastenheft (
	LastId INTEGER AUTO_INCREMENT PRIMARY KEY,
	LastName TEXT,
	UId INTEGER,
	ZielId INTEGER,
	IstId INTEGER,
	sysAnfordId INTEGER,
	EinfAnfId INTEGER,
	FOREIGN KEY(UId) REFERENCES unternehmen(UId),
	FOREIGN KEY(ZielId) REFERENCES ziel(ZielId),
	FOREIGN KEY(IstId) REFERENCES istZustand(IstId),
	FOREIGN KEY(sysAnfordId) REFERENCES sysAnforderung(sysAnfordId),
	FOREIGN KEY(EinfAnfId) REFERENCES einfAnforderung(EinfAnfId)
);

INSERT INTO unternehmen VALUES(NULL, 2, 'hello', 'j', 'j', 'k', 'j', 'k', 'm', 'j', 'kd', '2', 'j', 'j', 'j');


SELECT * FROM unternehmen;

  --       |            |        |        |Utelefon-|     |     |        |         |Ugruendu-|        |            |       |Usonstig-|      
  -- UId   |    Uname   |Uadresse|TochterU|    nr   | Ufax| Uweb|Ugroesse|Uprognose|    ng   |Ubranche| UprodBeschr|Uanlage|    es   |LastId
  -- ------|------------|--------|--------|---------|-----|-----|--------|---------|---------|--------|------------|-------|---------|------
  -- NULL  |      2     |hello   |    j   |    j    |  k  |  j  |    k   |    m    |    j    |   kd   |      2     |   j   |    j    |   j  



  -- ------------------------------------------------------------------------------------------------

  -- 2021-12-02 19:36:24.084  |  [Success]    Script Success.

Also, I don't know if your programming language cares about it, but you have dangling commas at the end of your call and not enough values to match the insert prototype. You have to check that very carefully.

Perhaps "try" the SQL outside of the code first making sure it works and then try to adapt it to code. The other way madness lies.

Let us know if there are any other questions.

16:33 Reply: Linking tables (artifact: afaea8cb7d user: cuz)

This looks all fine in principle.

Your table definition with the FK definitions needs to be the last statement called though; you cannot have that succeed when the referenced tables do not exist yet.

As to the problem of the original question: now that we know the schema, can you show an example of a set of values you would insert, what you expect to be in the different tables after that insert, and what you actually see in the tables? (Just two or three inserts will do.)

16:03 Reply: In perspective of disk usage, is the sizeof 'a' < INT ? (artifact: ca8474171b user: cuz)

It is always worth using foreign keys for columns where the values are not unique. It almost always saves space when used with an integer foreign key, but there is more to foreign keys than that.

  • A Foreign Key is a constraint; it has functionality and can facilitate auto-updating, auto-deletion and the like.
  • A Foreign Key provides a way to dereference a value, and if that value needs to change (such as wanting to use 'Video' instead of 'video'), then you need to change only 1 item in 1 table.
  • A Foreign Key provides data integrity, avoiding such things as misspelled entries or finger errors (since only listed values are accepted, or, if not constrained, you can still easily see and rectify the wrong ones).

Having a smaller space-saving footprint is just the cherry on top.
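
To make those points concrete, here is a minimal sketch (the table and column names are invented for illustration, they are not from your schema):

PRAGMA foreign_keys = 1;

CREATE TABLE media_type (
  type_id INTEGER PRIMARY KEY,
  name    TEXT NOT NULL UNIQUE      -- 'video', 'audio', ...
);

CREATE TABLE clip (
  clip_id INTEGER PRIMARY KEY,
  title   TEXT,
  type_id INTEGER NOT NULL
          REFERENCES media_type(type_id)
          ON UPDATE CASCADE         -- the FK can maintain itself
);

-- Fixing 'video' to 'Video' now touches exactly 1 item in 1 table:
UPDATE media_type SET name = 'Video' WHERE name = 'video';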

15:55 Reply: Linking tables (artifact: 1743aa7248 user: cuz)

All we can tell from your description above is: Yep, you probably have a problem.

If you want more help than that, please show the exact schema SQL for making the tables. That should already make the problem clear, but just in case it is more obscure, posting the data (or some of it) for each table in the form of SQL INSERT statements will also help.

12:48 Edit reply: Howto group most similar values (artifact: f454a5993d user: cuz)

Nope, that works well - other than adding the DISTINCT and not showing the initial ID and Value, which prevents the lines with id 4 and id 7 (which are essentially duplicates) from both being shown.

11:06 Reply: Howto group most similar values (artifact: 3ec5cc868e user: cuz)

If you need to internalize this and the CTE is in your way, here it is without the CTE and the extra fields, using a sub-query:

SELECT DISTINCT values_in_group, values_average
   FROM (
   	SELECT COUNT(*) AS values_in_group, AVG(B.Value) AS values_average
	  FROM Sampled AS A
	  JOIN Sampled AS B ON (ABS(B.Value-A.Value)/NULLIF(A.Value,0)) < 0.05
	 GROUP BY A.ID
	) AS avgs
  ORDER BY values_in_group DESC
;


  -- values_- |                    
  -- in_group |      values_average
  -- ---------|--------------------
  --     3    |    20.2333333333333
  --     1    |                25.1
  --     1    |               100.2

10:52 Edit reply: Howto group most similar values (artifact: 32190ab06f user: cuz)

I think Gunter's point was more about questioning the mathematical correctness of your request than your solution. To make his point clear, see my solution again here but with a lot more "similar" values - is this really what you expect to see?

  -- SQLite version 3.35.4  [ Release: 2021-04-02 ]  on  SQLitespeed version 2.1.3.11.
  -- ================================================================================================

CREATE TABLE Sampled(ID INTEGER PRIMARY KEY AUTOINCREMENT, Value REAL);

INSERT INTO Sampled(ID, Value) VALUES
 ( 1,  20.1)
,( 2,  20.4)
,( 3,  25.1)
,( 4, 100.2)
,( 5,  20.2)
,( 6,   0.0)
,( 7,  20.0)
,( 8,  20.1)
,( 9,  20.2)
,(10,  20.3)
,(11,  20.4)
,(12,  20.5)
,(13,  20.6)
,(14,  20.7)
,(15,  20.8)
,(16,  20.9)
,(17,  21.0)
,(18,  21.1)
,(19,  21.2)
,(20,  21.3)
,(21,  21.4)
,(22,  21.5)
,(23,  21.6)
;

WITH avgs(values_in_group, values_average, values_min, values_max) AS (
	SELECT COUNT(*), AVG(B.Value), MIN(B.Value), MAX(B.Value)
	  FROM Sampled AS A
	  JOIN Sampled AS B ON (ABS(B.Value-A.Value)/NULLIF(A.Value,0)) < 0.05
	 GROUP BY A.ID
 )
 SELECT DISTINCT values_in_group, values_average, values_min, values_max
   FROM avgs
  ORDER BY values_in_group DESC
;


  -- values_- |                    |values_- |values_- 
  -- in_group |      values_average|      min|      max
  -- ---------|--------------------|---------|---------
  --     20   |              20.715|     20.0|     21.6
  --     19   |    20.6684210526316|     20.0|     21.5
  --     19   |    20.7526315789474|     20.1|     21.6
  --     18   |    20.6222222222222|     20.0|     21.4
  --     17   |    20.5764705882353|     20.0|     21.3
  --     17   |    20.8294117647059|     20.2|     21.6
  --     16   |            20.53125|     20.0|     21.2
  --     15   |    20.4866666666667|     20.0|     21.1
  --     15   |    20.9133333333333|     20.3|     21.6
  --     14   |    20.9571428571429|     20.4|     21.6
  --     13   |                20.4|     20.0|     20.9
  --     12   |               21.05|     20.5|     21.6
  --     11   |                21.1|     20.6|     21.6
  --     1    |                25.1|     25.1|     25.1
  --     1    |               100.2|    100.2|    100.2

Some values are similar (within 5%) to values above them and below them, but not to values more than 5% away, so a lot of small groupings then form containing values from multiple sets of "close" values.

A better way to do this is to predefine the categories (say 1, 1.5, 2, 2.5 etc...) and then see how many values are "close" to the predefined categories. Say if you want that, we can show how to do that too.
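
For illustration, a rough sketch of that predefined-category idea, re-using the Sampled table from above (the Category table and its values are invented here):

CREATE TABLE Category(CatValue REAL PRIMARY KEY);
INSERT INTO Category(CatValue) VALUES (20.0),(20.5),(21.0),(21.5),(25.0),(100.0);

SELECT C.CatValue,
       COUNT(S.Value) AS values_in_group,
       AVG(S.Value)   AS values_average
  FROM Category AS C
  LEFT JOIN Sampled AS S ON ABS(S.Value - C.CatValue) / C.CatValue < 0.05
 GROUP BY C.CatValue
 ORDER BY C.CatValue
;

Each sampled value is now measured against a fixed category rather than against every other sampled value, so the groups cannot "chain" into each other the way they do above.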

10:24 Reply: Howto group most similar values (artifact: a3e92ed16b user: cuz)

At last, an interesting question :)

This is one way of doing it. I've added some more information about the values, and also added another Zero value to make sure the logic doesn't break ever:

  -- SQLite version 3.35.4  [ Release: 2021-04-02 ]  on  SQLitespeed version 2.1.3.11.
  -- ================================================================================================

CREATE TABLE Sampled(ID INTEGER PRIMARY KEY AUTOINCREMENT, Value REAL);

INSERT INTO Sampled(ID, Value) VALUES
 (1,  20.1)
,(2,  20.4)
,(3,  25.1)
,(4, 100.2)
,(5,  20.2)
,(6,   0.0)
;

WITH avgs(values_in_group, values_average, values_min, values_max) AS (
	SELECT COUNT(*), AVG(B.Value), MIN(B.Value), MAX(B.Value)
	  FROM Sampled AS A
	  JOIN Sampled AS B ON (ABS(B.Value-A.Value)/NULLIF(A.Value,0)) < 0.05
	 GROUP BY A.ID
 )
 SELECT DISTINCT values_in_group, values_average, values_min, values_max
   FROM avgs
  ORDER BY values_in_group DESC
;


  -- values_- |                    |values_- |values_-
  -- in_group |      values_average|      min|      max
  -- ---------|--------------------|---------|---------
  --     3    |    20.2333333333333|     20.1|     20.4
  --     1    |                25.1|     25.1|     25.1
  --     1    |               100.2|    100.2|    100.2

Hope that is clear, but feel free to ask if anything isn't.

2021-12-01
12:57 Reply: Will the database file be corrupted if using MEMORY journal mode ? (artifact: d9829ba3b6 user: cuz)

The point of the journals or write-ahead logs is to avoid corruption when writing data to the storage medium fails, whatever the cause of that failure may be: software that segfaults, or hangs, or gets powered down, or gets lied to by the operating system about whether the data has truly reached the drive surface. The possibilities are myriad.

Not using them removes this safeguard and so corruption becomes a very real possibility.

A very simplified demonstration follows (which only roughly mimics one of the SQLite journal modes, but it is enough to make the point):

Imagine your DB has 8 bytes per page, and records are stored as bytes starting with the length, then the type, and then the data. (In this simple DB, data can roll across page boundaries.)

If you write to it the int 55 (0x37, length = 1, type Int = 1) and 'Hello World' (length = 12 (0x0c), String = type 3), it might look like this:
[Using actual text instead of hex bytes for easier reading where appropriate]

Pg No | data
  0   | 01 01 37 0c 03  H  e  l 
  1   |  l  o  W  o  r  l  d

Now let's say I send a query to change the 55 to 1027 (0x0403), this now requires 2 bytes for the integer so we have to move our page data up to become:

Pg No | data
  0   | 02 01 04 03 0c 03  H  e
  1   |  l  l  o  W  o  r  l  d

So the first step we take to try to achieve the above is to copy the two original pages out to a journal file, then write the new two pages, then remove the journal file.
However, if during the write something goes wrong and only the first 3 new bytes make it to the storage, like this:

Pg No | data
  0   | 02 01 04 0c 03  H  e  l 
  1   |  l  o  W  o  r  l  d

This data has become corrupt, because it looks like a 2-byte integer of 0x040c followed by something that is 3 bytes long with a type of "H"... which is where the DB engine gives up and calls it corrupt.

The remedy is to note that the journal file we copied away with the two pages is not deleted (we never got that far) and so the next time the DB engine starts it may simply copy those two pages back leaving us with a "rolled-back" state as before: 

Pg No | data
  0   | 01 01 37 0c 03  H  e  l 
  1   |  l  o  W  o  r  l  d

Which leaves us without the change, but with a perfectly valid non-corrupt DB.

There are other problems too: the write can fail during the making of the journal file, but that is easily caught with a checksum. Another consideration when choosing a journal mode is that the OS can lie about the journal file having been written to disk, which is why options such as forcing a flush to disk after writes can be enabled. Sometimes that is not needed, since some hardware (server hardware mostly) either does not lie about data reaching the drive, or has battery backup for the drive itself to ensure the cache gets flushed at least, removing the need for strict waiting.

I hope that answers the question of why the journal modes are needed and why you can't live without them with any safety expectation. You can have any range of set-ups between maximum speed and maximum safety, but the maximums themselves are mutually exclusive.
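
For reference, the knobs that move you along that spectrum are pragmas along these lines (a sketch of typical combinations, not a recommendation for your case):

-- Safer end of the spectrum:
PRAGMA journal_mode = WAL;      -- or DELETE, the default rollback journal
PRAGMA synchronous = FULL;      -- wait for the OS/drive to confirm the writes

-- Faster, less safe end of the spectrum:
PRAGMA journal_mode = MEMORY;   -- journal kept in RAM only, as in the question
PRAGMA synchronous = OFF;       -- never wait; a badly timed crash can corrupt the DB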

2021-11-29
20:14 Reply: Proposed new date/time function to return a unix timestamp (artifact: e3ede39ef9 user: cuz)

Firstly - this will be a great addition - thank you, I've been having to compute and translate timestamps forever, and now see the answer is here - but not quite....

Why discard the milliseconds across the board? In most systems dealing with any sort of date-time accuracy (data logging especially), the milliseconds part is the most important.

It is trivial to ignore integer milliseconds by integer-dividing by 1000. It's less trivial to get them if there is no single unixepoch function giving them, typically requiring daisy-chaining multiple functions together.
Maybe then a second function (as was suggested earlier) such as unixepoch_ms() or such?
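
For example, this is the kind of daisy-chaining meant here - a sketch of one common workaround for getting a millisecond Unix timestamp today:

SELECT CAST((julianday('now') - 2440587.5) * 86400000.0 AS INTEGER) AS unix_ms;

-- versus the whole seconds one can already get today:
SELECT CAST(strftime('%s','now') AS INTEGER) AS unix_s;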

Another post mentioned the added luxury this will afford when creating JS timestamps. Without milliseconds, this solution is only half a solution.

2021-11-25
15:28 Reply: Feature Request split string table-valued function (artifact: dc60c431d4 user: cuz)

Good job!

Fair warning (as you probably already know) - That will be quite slow for larger datasets. It has the Advantage/Disadvantage of causing multiple output rows which a simple UDF cannot easily do.

If a "split()" function did exist (even as a UDF), we could get rid of a lot of the fluff in that CTE and speed it up significantly.

General musings:
I wonder if there is a precedent in Postgres or even MySQL or such, I don't recall ever using/finding such a function in any SQL, which probably suggests it's not universally needed, but might be useful still. Will need to be Unicode-aware I suppose.

10:51 Edit reply: Feature Request split string table-valued function (artifact: 840c98a8e8 user: cuz)

This demonstrates a way to split and flatten separated values in a column. It uses one column of the input as a key to identify which row the separated values belong to and lists it accordingly.
Not sure if it is exactly what you needed, but say if not, it should be easy to adapt the format.

  -- SQLite version 3.35.4  [ Release: 2021-04-02 ]  on  SQLitespeed version 2.1.3.11.
  -- ================================================================================================

-- Make temporary Space-Separated-Value table

DROP TABLE IF EXISTS tmpssv;
CREATE TABLE tmpssv (     
  ID INTEGER PRIMARY KEY,
  keyCol TEXT,
  ssvCol TEXT
);


-- And fill it with some space-separated content

INSERT INTO tmpssv (ID, keyCol, ssvCol) VALUES
    (1, 'foo', '4 66 51 3009 2 678')
  , (2, 'bar', 'Sputnik Discovery')
  , (3, 'baz', '101 I-95 104')
  , (4, 'foz', 'Amsterdam  Beijing  London  Moscow  New York  Paris  Tokyo')
  , (5, 'one', 'John')
;


-- The CTE Query that takes a column (string) and iteratively splits off the first value (up to the first space) until no more text is left in the string.

WITH ssvrec(i, l, c, r) AS (
      SELECT keyCol, 1,
             ssvCol||' ', '' -- Forcing a space at the end of ssvCol removes complicated checking later
        FROM tmpssv
       WHERE 1
    UNION ALL
      SELECT i,
             instr( c, ' ' ) AS vLength,
             substr( c, instr( c, ' ' ) + 1) AS vRemainder,
             trim( substr( c, 1, instr( c, ' ' ) - 1) ) AS vSSV
        FROM ssvrec
       WHERE vLength > 0
    )
  SELECT ID, keyCol, r AS ssvValues
    FROM tmpssv, ssvrec
  WHERE keyCol = i AND r <> ''
  ORDER BY ID, keyCol
;

-- I've used compact notation to avoid long wordy SQL, but the short column names basically represent:
--   i = Key of unpacked row
--   l = Length of text remaining to be unpacked
--   c = Column text remaining to be unpacked
--   r = Resulting current unpacked (split-off) ssv value


-- The Resulting split/unpacked values:

  --  ID|keyCol|ssvValues  
  -- ---|------|-----------
  --  1 |  foo |2          
  --  1 |  foo |3009       
  --  1 |  foo |4          
  --  1 |  foo |51         
  --  1 |  foo |66         
  --  1 |  foo |678        
  --  2 |  bar |Discovery  
  --  2 |  bar |Sputnik    
  --  3 |  baz |101        
  --  3 |  baz |104        
  --  3 |  baz |I-95       
  --  4 |  foz |Amsterdam  
  --  4 |  foz |Beijing    
  --  4 |  foz |London     
  --  4 |  foz |Moscow     
  --  4 |  foz |New        
  --  4 |  foz |Paris      
  --  4 |  foz |Tokyo      
  --  4 |  foz |York       
  --  5 |  one |John       


-- Cleanup

DROP TABLE IF EXISTS tmpssv;

2021-11-24
18:52 Reply: CBOR Support & challenging DB design problem for keeping history with big schema changes (artifact: 15917792c7 user: cuz)

Sidestepping the CBOR request completely since I don't care one way or the other, but to comment on the secondary question:

if I could plug a function of my language's implementation into SQLite, this will be just fine.

This is pretty easy to do in SQLite, see here: sqlite.org/c3ref/create_function.html

The main question is how to build indexes on such...

Creating an Index on the result of a function has been possible in SQLite for quite some time now. So if you are successful at creating the functions you want to use (as above), then adding indexes based on them is trivial within the rules.
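
A sketch of what that looks like (the table, column, and my_norm() names here are hypothetical; my_norm() stands for an application-defined function registered via the API above and flagged as deterministic):

CREATE INDEX idx_t_norm ON t( my_norm(payload) );

-- a built-in deterministic function works the same way:
CREATE INDEX idx_t_lower ON t( lower(name) );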

If the intended functions are non-deterministic or violate any of those other rules, your only remedy will be, as Dominique suggested, making a virtual table module where you have control over indexing. This can be quite involved but is still very possible, and example modules abound.

18:17 Reply: How to free max heap size on execution of sqlite query using sqlite3_exec? (artifact: 50a81fd714 user: cuz)

This thread has probably run its course, but in the interest of trying to be helpful:

Just I want to know why heap size increase on query execution?

Well, so do we. We have no idea. What we do know is:

  • 1. It is not SQLite, all of us use it daily and it does no memory leaking.
  • 2. That leaves your application, which we do not know and cannot guess at (and mostly we are not C++ programmers).
  • 3. Typically memory leaks of this sort are caused by the calling thread (your program) preparing statements, iterating them, but then not finalizing them; or
  • 4. receiving strings (in memory allocated by SQLite) from the SQLite APIs (such as error messages) and then not freeing that memory. Any memory structure passed to you from an SQLite API which is no longer under the control of SQLite (and won't be passed back to SQLite to finalize) must be freed by your code.
  • 5. Error handling in code sometimes short-circuits the code execution and ends up by-passing the finalization of statements or other memory objects.

That's the best we can do: somewhat educated guesses. Personally I am starting to think you would probably get better feedback from a C++ forum. Try to make the smallest C++ program that shows the leaky behaviour and ask them what is wrong with it (that is what I would do).

If any body technically explain it will be better. Is there method to free it?

As Larry mentioned, this was already handled in another post (#11). We can do no better job than the documentation it points to, but if you do find something in the documentation that is not clear, feel free to quote it here, say what you understand it to mean, and we will gladly help correct any misunderstandings or add clarity.

2021-11-23
18:32 Reply: How to free max heap size on execution of sqlite query using sqlite3_exec? (artifact: 49e3177b7b user: cuz)

To answer the original question (since nobody else did) - what you are probably looking for is this:
sqlite3_db_release_memory() or sqlite3_release_memory - related to whether your version is compiled with SQLITE_ENABLE_MEMORY_MANAGEMENT.

That said, the warnings of the other posters are to be heeded - if your Heap grows and grows forever, especially when simply repeating the same query, you have a leak. This is most probably due to what Larry described, which should be fixed before attempting any other memory optimizations.

2021-11-14
10:18 Reply: A newbie's question about database structure design (artifact: 2b377d4a77 user: cuz)

You are exactly correct: one table, with some extra columns for distinguishing dates and stocks.

With appropriate indexing, this one table will probably be faster than any other combination of multiple tables you could consider.

Also, SQLite is size-limited based on your choice of page size, and as for the row-count limit, well, let's just say that at INT64 it can store the rows you described for longer than the age of the known Universe.

The size limit will depend on your amount of data per row, which sounds like it isn't a lot. Rows are stored on pages, and in the default SQLite setup the page size is 4K, which typically corresponds to most OSes' IO page sizes, which is efficient. There is a limit on the number of pages that can be in the DB, so the real technical limiting factor is how big a page is, and you can easily make the page size bigger, up to a maximum of 64K, which raises that size ceiling enormously - see below for how to change it.
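
A sketch of changing it (note that page_size only takes effect on a brand-new database, or when followed by a VACUUM that rebuilds the file):

PRAGMA page_size = 65536;
VACUUM;              -- rebuild the file using the new page size
PRAGMA page_size;    -- confirm the value now in effect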

There are other considerations. If you want to add another Index with a billion rows, that will see a large space requirement added. If you want to Vacuum a DB (reducing the free space inside it after some use) you will need free space on your hard drive of more than twice the size of the DB.

The best advice I can give is.... start with the default page size and try to import it all; see if it runs into size errors, and if so, see at what percentage of the job it failed (if at all). That should give you an indication of how much more you will need, which will help in deciding an appropriate page size.

Also - Ensure you import each section (folder) in a single transaction, to make it as fast as possible without postponing fail conditions by much, and that way it's also easy to just add on new folders as they come.
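
A sketch of that pattern (prices and its columns are made-up names, substitute your own schema):

BEGIN;
INSERT INTO prices(stock, day, value) VALUES ('AAA', '2021-01-04', 12.3);
INSERT INTO prices(stock, day, value) VALUES ('AAA', '2021-01-05', 12.7);
-- ... every row from the current folder ...
COMMIT;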

09:49 Edit reply: Usage of application_id and magic.txt (artifact: 6a768e7dca user: cuz)

How is this the responsibility of the SQLite team?

This lies exactly with the SQLite team. Not sure if it can be classed a "responsibility", since there are no promises of which no demand can be made, but getting your software registered in the magic.txt file is most certainly something needing to be OK'd and then acted upon by the SQLite team.

To the OP: The software listed in that file, which could be deemed "registered SQLite file formats", has been added over the years upon just such requests as yours, so there is no problem with asking. However, the habit (having seen this request a few times before) seems to be to accept mainly mainstream/public types of software, and only after some public prevalence.

To use Richard's own words: "Let's wait until we encounter some of your software in the wild, then reconsider." (This may not be the verbatim quote since I don't have it handy, but I believe it to be the exact sentiment).

I hope that clears things up a bit for all.

2021-11-08
20:33 Reply: Hide rows in table1 if same value found in table2 ? (artifact: 688f53bade user: cuz)

Gunter is right, this doesn't seem to be an SQLite problem.

However, DB4S is simply a DB admin tool, like a myriad of others, and a very good one at that. Its function is to display the contents of a DB, to edit it, and to let you write queries. It isn't really a display tool that lets you customize views (that I know of).

Having said that, if you create a VIEW in SQL with what you want to display, and then select that view (much like any other table), it will show what you require.

Off the top of my head, this should create the sort of view you would like:

CREATE VIEW table1_view AS
SELECT *
  FROM table1
 WHERE table1.HEX NOT IN (SELECT table2.HEX FROM table2)
;

Once you run that SQL successfully, a new object should appear where your list of tables are with that name (table1_view), selecting it will show only the lines that do not have corresponding entries in table2.

I did not test that since I don't have your tables, but it should work in principle.

PS: If ever you want to remove that new view object from your DB again, you can run this SQL:

DROP VIEW table1_view;

2021-11-02
14:52 Reply: Is the order conserved in a table from a VALUES clause? (artifact: c279c2fa03 user: cuz)

I did see your other comment and I do see what you are saying about "why" it happens to work that way - no argument there - but I am not accepting that that is what "should" happen.

My contention is simply that either it must be so guaranteed in the documentation to work that way, now and forever, OR, pragma_reverse_unordered_selects must catch this case and reverse it also.

We are very much using reverse_unordered_selects as part of pre-production testing, because I thought that if it works with that pragma enabled, then it works correctly, no exceptions (and I do not think my belief is misplaced). Now we are seeing a case where that simply may not be so: a query's order may change in future for a query that is going into production, and we have no way of knowing now if all our SQL is good and ordered and will remain so for days to come.

I mean, if it only works "sometimes", what is the point of having that pragma?

13:54 Reply: Is the order conserved in a table from a VALUES clause? (artifact: d6d5a86b7d user: cuz)

Right you are, sorry was already replying when that was sent.

I have to assume that this means that either:

  1. the values clause is definitely guaranteed to be ordered in SQLite, or
  2. that there is a bug and reverse_unordered_selects is not doing its job.

13:38 Reply: Is the order conserved in a table from a VALUES clause? (artifact: 830317c28d user: cuz)

The order of the values is most assuredly not guaranteed in SQL in general, and I doubt it is anywhere guaranteed by SQLite (unless I've missed such documentation).

To ensure a row order, you have to be sure the set contains values which possess intrinsic order and by which you specifically ask the results to be ordered. That is the only way. Any apparent order you are currently getting from a query that isn't ordering by a specific value is purely happenstance and coincidental.

Luckily SQLite provides a fool-proof way by which you can see what is "guaranteed": There is a pragma called "reverse_unordered_selects" (you can read about it here) which, when activated, will purposefully give your query results back in a bad order UNLESS you asked correctly for it to be ordered, in which case it returns it correctly. So if your query returns a good order while that pragma is in effect, you can be guaranteed that the query will work in future for the order you've intended.
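
A sketch of using it during testing (table and column names are placeholders):

PRAGMA reverse_unordered_selects = 1;

SELECT * FROM t;                 -- may now come back in reversed order
SELECT * FROM t ORDER BY id;     -- still returns exactly the order you asked for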

2021-10-20
19:13 Reply: I need a little help with UPDATE? (artifact: dcce7c2652 user: cuz)

David Raymond is right, we need a lot more info to offer anything resembling "wise" advice.

If I have to guess, your most immediate solution to just make it work, regardless of how well it functions (and assuming the "missing" s table in your query is actually called "s"), would be to do:

UPDATE t1 SET start_station_name = (
    SELECT s.start_station_name
      FROM s
     WHERE s.start_lat = t1.start_lat AND s.start_lng = t1.start_lng
    )
;

Note that this will replace ALL station names in t1 with whatever they are in "s" for the same lat-long. Please back up your files before doing this, and give us much more information for more useful suggestions.

Note Also - if lat and lng are REAL (Floating point) fields, the equality check x.lat = y.lat is not guaranteed to work due to small possible differences in float values that seem equal to the naked eye.
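
If that is a concern, a tolerance-based match can be used in the correlated sub-query instead of exact equality - a sketch, with 0.000001 as an arbitrary example tolerance:

UPDATE t1 SET start_station_name = (
    SELECT s.start_station_name
      FROM s
     WHERE ABS(s.start_lat - t1.start_lat) < 0.000001
       AND ABS(s.start_lng - t1.start_lng) < 0.000001
    )
;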

15:01 Reply: INDEX usage on inequality and IN (artifact: 76ff3e4a5c user: cuz)

That is called "Cardinality", which in math terms is the size of a set, and in RDBMS terms is the size of the set of possible values in a column.

i.e. A table with 1,000,000 rows with only 4 possible values in Column A is said to have a Cardinality of 4 for Column A. By contrast if Column B was declared Unique and indeed contained all values, its cardinality would be 1 million.

Most SQL engines use exactly this cardinality figure to guess what would be best to do: use the index or not. The problem is they cannot magically "know" what the cardinality of the data is; at some point they really have to look through each row in a table, and for each column (or at least each indexed column), calculate the cardinality. In SQLite this is done with:
ANALYZE table_name;

Based on the info stored when ANALYZE is run (which, depending on the compile choices, may store much more than just cardinality), the query planner is well equipped to make such decisions as you suggest above, and is more likely to pick a good index.

If you did not ANALYZE the table, but have your own personal ideas/beliefs about how well the content in a table conforms to the query filtering, you could hint this to the query planner using "likelihood()" - which allows you to say whether the result of a comparison is more likely, or less likely, to be TRUE.
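
A sketch of such a hint (the table and filter are made up; the 0.05 says "this comparison is TRUE for roughly 5% of rows"):

SELECT *
  FROM orders
 WHERE likelihood(status = 'open', 0.05)
;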

Simon is correct though: the ideas around this, and the decision by the query planner to use or not use a specific index, are non-trivial, and after many years of hard work it can still easily guess/assume wrong, so it really pays to provide hints or analyze tables for anything more than a simple lookup.

2021-10-13
22:58 Reply: inconsistent output: insert on conflict with returning (artifact: a490a6b058 user: cuz)

I suppose he got it from the documentation.

See the INSERT...ON CONFLICT (aka UPSERT) documentation here: sqlite.org/lang_upsert.html#examples

As to the original question...

The OP is doing an UPSERT with a RETURNING clause that works when the INSERT happens, but when it conflicts and an UPDATE happens instead, no row is returned (since no insert happened). That is probably as designed and as documented, since the full RETURNING field-set might not be available to the UPDATE mechanism - the same way that the "excluded.xxx" construct is not available when the INSERT mechanism runs, since no exclusion happened - so I'm not sure the RETURNING clause can deal with both.
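
For reference, the kind of statement under discussion looks roughly like this (a sketch with a made-up table; it needs SQLite 3.35+ for RETURNING):

CREATE TABLE kv(k TEXT PRIMARY KEY, v INTEGER);

INSERT INTO kv(k, v) VALUES ('a', 1)
  ON CONFLICT(k) DO UPDATE SET v = excluded.v + 1
  RETURNING k, v;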

I suspect it simply works for the INSERT part only as designed, but that is pure speculation on my part and as far as I can tell, no such limitation is documented, so it may simply be a bug.

12:25 Reply: binary .dump format (artifact: 5ffa89363d user: cuz)

Yes I agree. My contemplation imagined something that will be handled by a text processor and even an editor, which can still use the mentioned companion files; or, if text-editing isn't required, then length sections can work where context dependence is a bigger issue than corrupt bytes, I think. But true, in such a case it's certainly not "theoretically impossible".

The TLV suggestion can work also, but then the entire data should be TLV sections (even the schema and normal data), like it is in the file-types Andreas mentioned, avoiding context dependence and simplifying readers/writers. Much like the usual way of serializing data.

Either way, the moment you can no longer load an SQLite dump into a text editor (or import it via the standard CLI SQL reader), I think it stops being what it aimed to be at birth. In that regard, Base64 is a drop-in replacement/alternative for hex and can still work 100%.

I like the suggestion using SQLite DIFF files on ZFS (or similar), maybe writing a full backup weekly, say, and diffs in between - the "lightness" of which will wholly depend on how often existing blobs change.

I have not actually tried using diffs for backup in sqlite - I wonder if you (or someone else maybe) have done this and can say something about how well it works and mention any caveats?

09:51 Reply: binary .dump format (artifact: bd7d3d1c1b user: cuz)

A binary dump format is not only not available, it's not even theoretically possible, since any byte sequence can be present in any blob, which makes format bytes impossible, or at least very difficult. Something like MIME boundaries works, but is heavy on checking for boundary matches in content, which doesn't matter much for typical e-mail but will matter for DBs. It also means even small blobs will add large boundary identifiers, which will take away from the saving.

I suppose a best-case enhancement for large blob data would be 3-to-4 encoding (aka Base64) rather than the current hex representation in .dump files, which gives you a ~33% increase on blob size instead of a ~100%[1] increase, and is de-duplication safe.

This change, or alternate ability, will be needed on both the .dump side and the .import side, perhaps indicated by starting a Base64 binary with '0z' instead of the '0x' used for hex, or some such. It should not be much effort to implement either, but it's certainly not available at this point.

Andreas' compression suggestion is better for full-file deduplication mechanisms, but if you have intra-file data-section (byte-run) deduplication it won't work. Also, if the data is similar across blobs it will reduce size well, but if the data blobs are random or noise-like, compression would do more harm than good (the same reason why PNGs of photos are much, much larger than JPEGs, yet a PNG of a solid colour background weighs almost nothing while its JPEG is still huge). Text compresses quite well (because of the similarity of a language's lexemes and morphemes), so if the ratio of text-to-blobs in the DB is high, then compression may be quite good regardless of the above. I suppose testing is needed for your data. Either way, I think that suggestion is the most viable avenue for you to explore if size-saving is your goal.

[1] I use approximations ~nn% because hex strings include the '0x' added to a blob: if the blob was 3 bytes long, the resulting text would be 2 + 3 x 2 = 8 bytes long, which is a ~167% increase in size, and as blob length increases, that percentage approaches 100%. Similarly, Base64 will have such a marker, but it also requires 3-byte alignment so can end with one or two "=" characters, producing a saw-tooth size-gain graph that smooths out (--> 33%) for larger blobs.
i.e. - Lots of small blobs are the worst, whatever encoding you use.

2021-10-09
13:52 Reply: x86_64/amd64 sqlite3 binaries (artifact: 90fb69202d user: cuz)

You're correct in that nearly all hardware now is 64bit, but the problem here isn't the hardware, it's operating systems, and more specifically, software.

While most of the install-base for OSes is also 64bit these days, quite a few are still 32bit, and even the 64bit OSes have to be able to run software in the wild, much of which is still fully 32bit, and by that token, if it uses libraries, it has to use 32bit libraries too.

Why is there still 32bit software around?, I hear you ask. Well, there are different reasons for different 32bit software, ranging anywhere from "couldn't be bothered" to "an enormous codebase that will take years to refactor" or just "using old outdated software that is no longer supported but still does the job, so why not?". On Linux we are used to compiling stuff ourselves, but for most of the world software comes pre-compiled - certainly 90+% of Windows or Apple user-installed software - and even on Linux these days it's more common for basic users to install from pre-compiled binaries.

If you doubt me, just see the ruckus caused when Ubuntu tried to drop 32-bit support. I am betting that any major OS that wants to remain a major OS will keep supporting 32bit software for many years to come, meaning people will keep using 32bit software and need 32bit libraries (such as sqlite3).

That's why it's still available.

2021-10-02
15:07 Reply: How to read Images from sqlite3 and show via opencv, PIL or Matplot lib ? (artifact: 67bd9291c4 user: cuz)

I'm afraid SQLite's domain ends at "storing the bytes and giving them back to you".

How you use those bytes, to display images, write them to file or supply them as codes to nuclear missiles aimed at the moon, is completely up to you.

It's irrelevant to SQLite, quite off-topic here, and most importantly, we have few thoughts on that and may not be much help.

Someone here may even have some valuable input, but your chances of getting good help will be much higher on some programming forum.

2021-09-30
21:18 Reply: sqlite3_exec: Any way to validate the first argument? (artifact: 9b9468ede5 user: cuz)

To emphasize what Larry said (lest someone thinks we represent different schools of thought): Larry's contention is absolutely correct, and the test I mentioned is used in pre-production code, with the return values used in ASSERT() statements for sanity checks. I sometimes get involved with the question to the point where one forgets to also convey sane programming practices.

There is a very big difference between testing code for a pointer you obtained by valid means yourself, where you are sure where it comes from but unsure whether it is still valid during the test phase of your software - and - "trying" operations on memory pointers which you do not own or do not know where you got from, even during testing. Such a programmer should be taken outside and shot.

Also, the entire notion of ending up at a point where you do not know if a pointer you are using is still good is a sign of serious incompleteness in your program - which is OK in development and testing, but atrocious in production.

20:54 Reply: How to filter find duplicate and show row with same field using sqlite code? (artifact: 2b969bffcc user: cuz)

To add to the other excellent replies, a favourite of mine if I want to see how many and which records are duplicate, is to do:

SELECT name, COUNT(*) AS cnt, GROUP_CONCAT(person_no) AS nos
  FROM table
 GROUP BY name
 HAVING COUNT(*) > 1

20:45 Reply: sqlite3_exec: Any way to validate the first argument? (artifact: 5e446ddfdb user: cuz)

I don't think sqlite3_exec() has a helpful "check if my DB pointer is good" return mode, but almost every other function that takes the DB pointer will fail hard if it's wrong (how it fails is hard to say; it depends on whether the pointer points to valid but uninitialized memory, or invalid memory, etc.)

A test I usually do is simply prepare a statement with query: SELECT 2 * 3;

then check it returns SQLITE_OK, then step it, see the return value is good and check the single returned field is 6. Anything that isn't in perfect working order, from the SQLite DB engine to the DB being open or my own pointers, will cause that chain to fail, and it costs very few cycles.

20:15 Reply: Feature request: Stored Procedures (artifact: 1321b318dc user: cuz)

The same goes for Triggers - easy to implement in application code. And who needs things like Foreign Keys - it is not as if that could not be implemented in an application. (That was sarcasm - sorry).

Sarcasm? Really? We couldn't tell. The problem here is that your sarcasm proves the opposite of the point you are trying to make. Triggers in program code are impossible (rephrasing: exquisitely difficult). So is foreign key maintenance. By contrast, scripts inside triggers and UDFs are both very easy to add, and any other processing in your code is as easy as how good a programmer you are.

There is a lot of functionality in SQLite that not everyone needs or uses. A lot of people will never use FTS (I do), others will never need Geopoly (I don't).

The truth is that very many people use FTS, and so it is included in the standard builds (but you can exclude it), as opposed to Geopoly, which has to be included if you wish to use it, precisely because not many implementations use it. Both of these, however, are based on the virtual table mechanism (always included) and are far less complex than a scripting language for stored procedures would be.

Furthermore, such an inclusion as stored procedures couldn't be an add-on or loadable library; it would have to form an intricate part of the main engine code and cause a lot of bloat and added processing, punishing all of us for a thing which very few of us have real use for in SQLite.

I'll say this: There are a couple of requests like this almost every year, so you are not alone, but even tens of requests per year likely won't move the needle.

I'm not saying it wouldn't be "nice" to have, just that the gain-vs-effort ratio is way too low currently. Don't lose hope though: we've seen lots of other claimed-to-be-bloaty additions making it into SQLite over time, such as backup tooling, CTEs, window functions, and the very recent RETURNING clause. That said, they all either had been requested much more (possibly with monetary backing too), or took much less complexity/effort to add.

07:15 Edit reply: A little help with temp tables? (artifact: e76ca95844 user: cuz)

You are only missing some syntax; the idea is fine.

Try:

CREATE TEMP TABLE hello AS
  SELECT ProductId, UnitPrice, SUM(Quantity) AS Qty
    FROM S
   GROUP BY ProductId
   ORDER BY 3
   LIMIT 5
;

EDIT: Not sure "ORDER BY Qty" would work since it's an alias, it works in some engines. To be safe I changed it to "ORDER BY 3" (to order by the third column) or you could use "ORDER BY SUM(Quantity)" too.

This way of creating tables is quite useful, but note that performance is a slave to the elements.

If you need it to be fast for querying, perhaps using some unique key, then pre-creating the table with relevant keys and filling it is best, and it can be achieved in almost the same way with a bit more verbose SQL.

Example:

CREATE TEMP TABLE hello(
  ProductId TEXT NOT NULL COLLATE NOCASE PRIMARY KEY,
  UnitPrice REAL,
  Qty REAL
);

INSERT INTO hello
SELECT ProductId, UnitPrice, SUM(Quantity)
  FROM S
 GROUP BY ProductId
 LIMIT 5
;

EDIT: Removed the Order-By in the second example as it has no effect since the table will be ordered according to the primary key. Adding a second Index on Qty will make ordering the output by Qty a lot faster - which is a good example reason for doing it this way.

2021-09-28
23:09 Reply: Reset database (artifact: 9eb4a71222 user: cuz)

Is there an SQL i.e. a programmatic way to reset a database to the same state as it was when it was created?

So many contradictions in one sentence, but since we're nice people, let's try to answer them all after dissection:

  • Is there a Programmatic way? - Yes
  • Is there a SQL way? - No
  • when it was created? (read: before) - Yes (Unlink the file, reopen in SQLite)
  • when it was created? (read: after the FILE was created but nothing in the DB yet) - Yes, same as previous point.
  • when it was created? (read: after, with content) - No, you have to truncate each table, or drop and recreate it, then redo whatever data entry was part of the create process, and even then you will have a different schema version etc. (The only way to do this programmatically is painstakingly manual, in any database engine.)

I am pre-empting failure when I have just read access to the database

How can you change the "state" of anything in the DB, never mind the DB itself, to anything else (historic or otherwise) when you only have read-only access to it?
Further to that, what good is resetting an empty DB if you only have read-only access to it? It contains no information that would be useful to you and you certainly cannot add any.

I'm going to just assume you were intoxicated when posting and that you do have some write access to the file/directory.

If it's some standard setup DB, you could just keep a copy of the file around by another name, then once all connections to it are closed, replace it with the pre-made file and reopen. Be sure those connections are closed though, else everything will explode and everyone will die.

15:19 Reply: User-defined functions in the CLI? (artifact: 8c4ca9f788 user: cuz)

I believe that actually loads programmable extensions, not SQL.

It does however differ from what the OP wanted; actually I'm not entirely sure what the OP wants. Loading extensions is very possible (not just in the CLI, but in any SQLite engine instance where it is allowed); alternatively, compiling added code into the CLI is possible too.
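
For the extension route, a rough sketch in the CLI (the file name "./my_udfs" and the function it provides are made up for illustration):

  .load ./my_udfs
  SELECT my_custom_func('hello');

or, from SQL where load_extension() is enabled:

  SELECT load_extension('./my_udfs');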

The original request reads as if the OP wished to access the sqlite3_create_function() API somehow from an already-compiled CLI, but not via a loaded library, which is very much not possible currently, as Larry explained.

I recall someone here posting an extension which lets you add SQLite UDFs: it takes a Name and an SQL statement, and any time you use that Name inside a query thereafter, the SQL code will execute.

Something like:

  SELECT MAKEFUNC('Twice', 'SELECT ?1 * 2');

  SELECT Twice(3);
--> 6

I thought it quite clever at the time, but have had no cause to use it yet and unfortunately do not recall the name - someone else here may.

15:04 Reply: Possible bug in group_concat() ? (artifact: 69c95b0314 user: cuz)

Larry, running some tests on my side (but not with a debugger, and not being overly familiar with the SQLite internals), it's really hard to find other ways to cause this, but I did come to believe the problem originates in the window-function return rather than the group_concat() code itself.

This was deduced by poking the black box with a stick and seeing how it rolls, so I could be very wrong, but I post it on the off chance it saves some bug-hunting time.

2021-09-22
00:04 Reply: Queries across databases (artifact: eb314c4733 user: cuz)

The SQL for it is simply:

ATTACH DATABASE 'c:/sqlite/db/contacts.db' AS contacts;

and in program terms:
sqlite3_exec(pdb, "ATTACH DATABASE 'c:/sqlite/db/contacts.db' AS contacts;" ... );

Which should all already work according to your given example, but note that:

  • attaching has to be allowed by your version of SQLite (or C# wrapper for it),
  • there are limits to how many DBs can be attached,
  • you need read/write privileges to the folder of the file you are attaching,
  • and to the DB file itself. (Depending on the Journal mode and intended use.)
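
Once attached, tables in the attached DB are addressed by their schema prefix; a rough sketch (the table and column names are made up for illustration):

SELECT c.Name, o.Total
  FROM contacts.people AS c
  JOIN main.orders AS o ON o.PersonId = c.Id;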

To get rid of it again you can do:

DETACH DATABASE contacts;
2021-09-15
12:17 Reply: about "strict" mode (artifact: e960514ad5 user: cuz)

May I enquire as to how exactly the session extension is having difficulty with ROWIDs?

I'm not looking to argue; I plan to use the session extension soon in a project, and am now concerned about usage caveats. Being as specific as possible will be much appreciated.

Thanks!

2021-09-14
21:15 Reply: DELETE FROM … RETURNING (artifact: 961a86551c user: cuz)

You are correct about UPDATEs needing to pass their checks as they are applied - i.e. there is no way to defer constraint checking till the end of a transaction, so every record update needs to succeed as if it were the only update in the query, yet it also cannot rely on the fully completed state of the transaction to satisfy its immediate constraints.
That also means that choosing your update order within a single statement, if that were possible, would still not solve this problem.

I think the people who advocate for this are maybe more concerned with triggers and FK actions happening in predetermined order on UPDATE and DELETE, but for that it's best to rely only on yourself and post those transactions/queries in the exact order you want them executed. No engine quirk or different DB engine's habit can ever mess with it then.

18:47 Reply: DELETE FROM … RETURNING (artifact: 9166c79448 user: cuz)

No, neither should UPDATE ordering. And (to your previous question): No, calling it a bug is not a smart conclusion - hope that wasn't you or a loved one! :)

ORDER BY is an output modifier, not a relational method. Once you dictate in which order the engine should update or delete entries, you remove from it any optimization opportunity, or the possibility thereof. You are then attempting to interfere with the engine's process, and no longer simply with the veracity of the data or statements. You neuter the query planner - undermining the very utility of an RDBMS engine.

Further, I would always want the returned feedback to be in the order of the actual deletes, but there should be space for people needing a different order; that, however, can easily be done.

What about stipulating the order of UPDATEs or DELETEs? Well, IF that is really, really needed, you can easily achieve it by simply stating exactly which item to UPDATE or DELETE, then the next, rinse, repeat, in any order you want it to happen, whether writing an SQL script or doing it in your own code.
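
A rough sketch of that route (the table name and ids are made up):

BEGIN;
DELETE FROM t WHERE id = 7;   -- first
DELETE FROM t WHERE id = 3;   -- second
DELETE FROM t WHERE id = 12;  -- third
COMMIT;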

I can't see a situation in which you would want to UPDATE or DELETE whole blobs of records, yet also need fine-grained control over exactly which of them go first - not unless you are using a severely broken relational DB schema.

As to the other reply stating it MUST be so because of Postgres - well, its outcome might agree with mine, but it's an argument-from-authority fallacy to start with, even though it is a really good authority in this case. I feel that looking into WHY Postgres did it, and holding up that point, would be a better-motivated argument.

12:58 Reply: Slow query (artifact: 85c08744da user: cuz)

That does not look like a function that can be resolved by the Query planner easily.

I don't have your data so cannot easily test it, but this should make a difference - try it, especially if you have already added the index on Timestamp, and assuming Timestamp is an integer:

SELECT s.Timestamp, s.fst_field , ...
  FROM table s 
  JOIN table p ON p.Timestamp BETWEEN s.Timestamp - 9 AND s.Timestamp - 1

Note that this query on a billion-row table will easily produce output of 5 billion+ rows (if your example data can be taken as an average indication). That in itself will take significant time to output, let alone query. Probably adding a WHERE clause to limit the query domain to some specific time-frame or such would be better.
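
A rough sketch of that idea - the numeric bounds are made up, and the placeholder table name from the query above is quoted here since "table" is also a keyword:

SELECT s.Timestamp, s.fst_field
  FROM "table" AS s
  JOIN "table" AS p ON p.Timestamp BETWEEN s.Timestamp - 9 AND s.Timestamp - 1
 WHERE s.Timestamp >= 1631000000
   AND s.Timestamp <  1631086400;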

What is it you want to learn from repeating the close-together data? There's almost certainly a better way to do the thing you want to do, but for that we need to know what the real information is that you wish to glean from the query.

Let us know if that sped things up, and good luck.

2021-09-09
15:17 Reply: Write-ahead logging on SAN (artifact: d9c58a7d27 user: cuz)

To add to Warren's already good reply - SQLite's ability to guarantee the Database ACID requirements has very little to do with whether the file is physically on network storage or not (or a SAN for that matter), but rather depends on the process being fully in control of the locking mechanism for the DB file. With WAL this is achieved as Warren describes, and for the other journal modes it depends on the locking mechanism of the OS.

So the most correct assessment would be: as long as all read/write control of DB-connecting processes originates from the same machine+OS, or rather, the same OS process space in memory (i.e. not different VMs that merely happen to be on the same hardware), then you should not have any problem.

2021-09-08
17:06 Reply: Segmentation fault in function isLikeOrGlob() (artifact: f7f6afa30e user: cuz)

, and I didn't feel bad.

Yeah sorry, I think there is a large cultural gap between our phrasing.

I did not actually think you were feeling bad. Perhaps my reply was a wordier version of simply saying:
"That's nothing, great job mate!"

or if I was Australian:
"No worries mate, great job!"

or maybe as an English gangster:
"Relax Gov, 's all good."

or maybe an American one:
"Chill mother(bleep)er, we cool."

or perhaps just as myself:
"All good, Keep up the good work!"

:)

16:48 Edit reply: NEXT VALUE FOR in SQLite (artifact: ac3c8a1056 user: cuz)

So do you want to create an incrementing counter, or do you want to update the value and get a returned value back? Those two are not the same.

For creating a counter, Richard's example will work a treat, and it can never return a value because it happens invisibly to the statements (i.e. in the background).

For the second option, should the counter increase by being updated? or upon inserting a new row?

If updating a value and getting the new value back, that can be achieved with this type of query - imagine there is a table "t" with an integer field "i", in the record with "id" 5, that will be our update target:

UPDATE t SET i = IFNULL((SELECT MAX(i) FROM t),0) + 1 
 WHERE id = 5
RETURNING i

If it has to be an INSERT query, then a very similar one (assuming id is also an Autoincrement PK):

INSERT INTO t(i, x, y....) VALUES
(IFNULL((SELECT MAX(i) FROM t),0) + 1, x-value, y-value, ... )
RETURNING id, i

BIG NOTE: You need a very recent release of SQLite (3.35.0 or later) to be able to use the "RETURNING" clause.

ALSO NOTE: I've not tested these, the principle is sound but there may be syntax errors. The docs are your friend.
