SQLite Forum

Timeline
Login

7 forum posts by user wzrlpy

2021-10-09
08:38 Reply: sqlite3_serialize / deserialize 2GB limit on 64 bit (artifact: 44a4a07581 user: wzrlpy)

That 2GB limit is documented as "SQLite will refuse to allocate more than about 2GB of memory at one go" at https://www.sqlite.org/malloc.html

2021-09-20
09:17 Post: Suggestion to support gzipped text in .import & csv extension (artifact: 9ccb124e35 user: wzrlpy)

Dear SQLite maintainers,

As outlined in this forum post, the gz* functions in zlib can be used to read gzip-compressed text, while still being able to read uncompressed text, and that with minimal modifications in the code.

May you please consider using these gz* functions in the .import shell command, and in the csv extension?

Thanks in advance, Víctor

P.S.: I am not attaching a patch as I know you rather work your own code :-)

08:54 Reply: Suggestion to support gzipped text in vsv extension (artifact: 26fb365ec5 user: wzrlpy)

Sorry for being so succint :-) And thanks Keith for your kind clarification, and applying the patch.

I only entered a brief description as a comment in the patch:

+** The file content can be plain text, or text compressed with gzip (if this
+** code was compiled with "-DSQLITE_HAVE_ZLIB -lz").

Definitively the word "automagically" would have helped in the comment.

The gzopen() function detects the magic signature of gzip compression (1f 8b), at the beginning of the file, and later the other gz* functions in zlib know if they need to operate either on a compressed or on an uncompressed file. For our vsv.c purposes, the file content is always text (compressed or not), but in fact the gz* functions work as well on binary content.

2021-09-19
18:20 Post: Suggestion to support gzipped text in vsv extension (artifact: f32790adf7 user: wzrlpy)

Hello Keith,

Since SQLite has added zlib to its usual dependencies, I thought worth supporting gzipped text in your vsv extension.

It ended up being very simple:

--- ../KeithMedcalf/vsv.c       2021-07-07 17:44:36.000000000 +0200
+++ ./vsv.c     2021-09-19 20:13:32.281978471 +0200
@@ -198,6 +198,9 @@
 **    CREATE VIRTUAL TABLE temp.csv USING csv(filename=FILENAME);
 **    SELECT * FROM csv;
 **
+** The file content can be plain text, or text compressed with gzip (if this
+** code was compiled with "-DSQLITE_HAVE_ZLIB -lz").
+**
 ** The columns are named "c1", "c2", "c3", ... by default.  Or the
 ** application can define its own CREATE TABLE statement using the
 ** schema= parameter, like this:
@@ -226,6 +229,15 @@
 #include <stdio.h>
 #include <math.h>

+#ifdef SQLITE_HAVE_ZLIB
+#include <zlib.h>
+#define fopen  gzopen
+#define fclose gzclose
+#define fread  gzfread
+#define fseek  gzseek
+#define ftell  gztell
+#endif
+
 #ifndef SQLITE_OMIT_VIRTUALTABLE

 /*
@@ -257,7 +269,11 @@
 typedef struct VsvReader VsvReader;
 struct VsvReader
 {
+#ifdef SQLITE_HAVE_ZLIB
+    gzFile in;             /* Read the VSV text from this compressed input stream */
+#else
     FILE *in;              /* Read the VSV text from this input stream */
+#endif
     char *z;               /* Accumulated text for a field */
     int n;                 /* Number of bytes in z */
     int nAlloc;            /* Space allocated for z[] */

Would you please consider incorporating this patch (or variations of it) into your code? I of course give this patch a public domain license.

Thanks in advance, Víctor

2021-08-28
22:25 Post: Module pragma_function_list registers only after first usage (artifact: ac5768d00a user: wzrlpy)

Hi,

If I check what modules named pragma% are available in shell, I see first only pragma_module_list, but after invoking the function pragma_function_list(), then I get two modules:

SQLite version 3.36.0 2021-06-18 18:36:39
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> select * from pragma_module_list() where name like 'pragma%';
pragma_module_list
sqlite> select * from pragma_function_list() where name like 'pragma%';
sqlite> select * from pragma_module_list() where name like 'pragma%';
pragma_module_list
pragma_function_list
sqlite> .quit

Should not the two modules be listed right away in the first invocation of pragma_module_list()?

2021-07-03
16:02 Edit reply: SQLite3.exe Command shell: Iterate over table of csv files? (artifact: 5ca249cc09 user: wzrlpy)

Hi Gavin,

You can script out your repetitive SQL staments by generating them dynamically via SQL statements, and invoking them via .read. The generate_series() table function helps to create "loops".

Something like this:

.cd /MyProjects/#NHSWaitingTimes/

create table nhs_filenames as select
strftime('%Y%m', start_date) || '.csv' as file_name,
date(start_date, '+1 months', '-1 days') as census
from
(select date('2017-04-01', '+' || value || ' months') as start_date 
from generate_series(0,48));

create table nhs_sum as select
'([Gt ' || printf('%02d', value) || ' to ' || printf('%02d', value+1) || ' Weeks SUM 1]*' || printf('%02d', value+1) || ')+' as sum_line
from generate_series(0,51);

.headers off
.mode tabs
.separator "\r\n"
.output load_csv.sql

select 
'drop table if exists temp.vsv;',
'.import ' || file_name || ' temp.vsv',
'INSERT or IGNORE INTO waiting',
'    SELECT',
'    census,         -- [census]',
'    [Provider Org Name],            -- [provider]',
'    [Treatment Function Name],      -- [department]',
'    [RTT Part Description],         -- [dataset]',
'    sum(',

(select group_concat(sum_line, char(13) || char(10)) from nhs_sum),

'    ([Gt 52 Weeks SUM 1]*52) ),     -- [months_sum]',
'    sum([Total]),                   -- [upto52_coubt]',
'    sum([Patients with unknown clock start date]), -- [unknown_count]',
'    sum([Gt 52 Weeks SUM 1]),       -- [over52_count]',
'    sum([Total All]),               -- [all_count]',
'    ''=INT(IF(INDIRECT("F" & ROW())>0,INDIRECT("E" & ROW())/INDIRECT("F" & ROW()),INDIRECT("E" & ROW())/INDIRECT("I" & ROW())))'' AS mean_wait -- This is an Excel formula',
'    FROM vsv',
'    WHERE [Treatment Function Name]<>''Total'' ',
'    GROUP BY [census],',
'    [Provider Org Name],',
'    [Treatment Function Name],',
'    [RTT Part Description];'

from nhs_filenames;

.headers on
.mode csv
.read load_csv.sql
07:38 Reply: SQLite3.exe Command shell: Iterate over table of csv files? (artifact: d678eef49b user: wzrlpy)

Hi Gavin,

You can script out your repetitive SQL staments by generating them dynamically via SQL statements, and invoking them via .read. The generate_series() table function helps to create "loops".

Something like this:

.cd /MyProjects/#NHSWaitingTimes/

create table nhs_filenames as select
strftime('%Y%m', start_date) || '.csv' as file_name,
date(start_date, '+1 months', '-1 days') as census
from
(select date('2017-04-01', '+' || value || ' months') as start_date 
from generate_series(0,48));

create table nhs_sum as select
'([Gt ' || printf('%02d', value) || ' to ' || printf('%02d', value+1) || ' Weeks SUM 1]*' || printf('%02d', value+1) || ')+' as sum_line
from generate_series(0,51);

.headers off
.mode tabs
.separator ''
.output load_csv.sql

select 
'drop table if exists temp.vsv;' || char(13) || char(10),
'.import ' || file_name || ' temp.vsv' || char(13) || char(10),
'INSERT or IGNORE INTO waiting' || char(13) || char(10),
'    SELECT' || char(13) || char(10),
'    census,         -- [census]' || char(13) || char(10),
'    [Provider Org Name],            -- [provider]' || char(13) || char(10),
'    [Treatment Function Name],      -- [department]' || char(13) || char(10),
'    [RTT Part Description],         -- [dataset]' || char(13) || char(10),
'    sum(' || char(13) || char(10),

(select group_concat(sum_line, char(13) || char(10)) from nhs_sum) || char(13) || char(10),

'    ([Gt 52 Weeks SUM 1]*52) ),     -- [months_sum]' || char(13) || char(10),
'    sum([Total]),                   -- [upto52_coubt]' || char(13) || char(10),
'    sum([Patients with unknown clock start date]), -- [unknown_count]' || char(13) || char(10),
'    sum([Gt 52 Weeks SUM 1]),       -- [over52_count]' || char(13) || char(10),
'    sum([Total All]),               -- [all_count]' || char(13) || char(10),
'    ''=INT(IF(INDIRECT("F" & ROW())>0,INDIRECT("E" & ROW())/INDIRECT("F" & ROW()),INDIRECT("E" & ROW())/INDIRECT("I" & ROW())))'' AS mean_wait -- This is an Excel formula' || char(13) || char(10),
'    FROM vsv' || char(13) || char(10),
'    WHERE [Treatment Function Name]<>''Total'' ' || char(13) || char(10),
'    GROUP BY [census],' || char(13) || char(10),
'    [Provider Org Name],' || char(13) || char(10),
'    [Treatment Function Name],' || char(13) || char(10),
'    [RTT Part Description];' || char(13) || char(10)

from nhs_filenames;

.headers on
.mode csv
.read load_csv.sql