SQLite Forum

Timeline
Login

45 forum posts by user 6kEs4Majrd

2021-08-10
05:36 Reply: sqlar: is it reliable to use length(sqlar.blob)==sqlar.sz determine whether the data is compressed (artifact: 2fb4208659 user: 6kEs4Majrd)

This makes more sense now.

The README.md is not very clear about this. Anybody can update it?

https://sqlite.org/sqlar/doc/trunk/README.md

03:23 Reply: sqlar: is it reliable to use length(sqlar.blob)==sqlar.sz determine whether the data is compressed (artifact: 7031690be3 user: 6kEs4Majrd)

I am not asking how the code works.

I am asking why it makes sense when length(sqlar.blob)==sqlar.sz, there is no compression, when length(sqlar.blob)<sqlar.sz there is compression?

Is it possible that length(sqlar.blob)>sqlar.sz given the header/checksum can increase the length?

2021-08-09
11:33 Reply: sqlar: is it reliable to use length(sqlar.blob)==sqlar.sz determine whether the data is compressed (artifact: 3913b9fd2d user: 6kEs4Majrd)

I don't follow the code. Could you explain it in plain language how it determines whether it is compressed or not in the case that I mentioned?

04:02 Post: sqlar: is it reliable to use length(sqlar.blob)==sqlar.sz determine whether the data is compressed (artifact: c17712ff09 user: 6kEs4Majrd)

https://sqlite.org/sqlar/doc/trunk/README.md

It says

The file is compressed if length(sqlar.blob)<sqlar.sz and is stored as plaintext if length(sqlar.blob)==sqlar.sz.

But it also says

the zlib format contains a two byte compression-type indentification header (0x78 0x9c) and a 4-byte checksum at the end

Could it be possible that length(sqlar.blob)==sqlar.sz in the latter case as the header/checksum can make up the length reduction of the compression?

2021-07-10
23:08 Post: LOAD_EXTENSION: Error: error during initialization: (artifact: 9a268f74e5 user: 6kEs4Majrd)

https://www.sqlite.org/src/file?name=ext/misc/regexp.c&ci=trunk

I downloaded regexp.c from above, and compileded it like this on macOS.

gcc -I /usr/local/opt/sqlite/include -fPIC -dynamiclib /tmp/regexp.c -o /usr/local/opt/sqlite/lib/regexp.dylib

Then I got the following error when I try to load it in an sqlite3 session. Does anybody know how to load it properly?

$ LD_LIBRARY_PATH=/usr/local/opt/sqlite/lib sqlite3
SQLite version 3.36.0 2021-06-18 18:36:39
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> SELECT LOAD_EXTENSION('regexp');
Error: error during initialization:
sqlite>
2021-07-09
16:16 Post: allowing reading by another process while inserting many entries? (artifact: c469006faf user: 6kEs4Majrd)

https://stackoverflow.com/questions/1609637/is-it-possible-to-insert-multiple-rows-at-a-time-in-an-sqlite-database

I see the following example on how to efficiently insert many entries.

BEGIN TRANSACTION;
INSERT INTO 'tablename' table VALUES ('data1', 'data2');
INSERT INTO 'tablename' table VALUES ('data3', 'data4');
...
COMMIT;

I recall that when many entries are being inserted the db is blocked from reading in other processes. But my memory may be wrong. Could anybody confirm if this is case?

If so, I don't want other reading processes to be blocked for a long time. Is it better to split many INSERT INTO statements into multiple transactions instead of just using one transition for all the INSERT INTO statements. Thanks.

2021-06-15
21:55 Reply: Extract files from sqlar with a directory prepended (artifact: ace24d41dc user: 6kEs4Majrd)

If my understanding of https://sqlite.org/cli.html#sql_parameters is correct, I think it is to specify the parameters in a sqlite3 script or session. But I need the parameters in the command line just like the -C option of sqlite3 -A.

So there is not a way to do so?

Is there a reason you can't simply generate the fully-expanded sqlar query directly?

I need such parameters to be easy to be changed in the command line. Changing command line parameters is easier than changing a sqlite3 script which I have to deal with special parameters that need to be escaped.

2021-06-14
20:33 Reply: Extract files from sqlar with a directory prepended (artifact: 2606d60a0b user: 6kEs4Majrd)
sqlite> SELECT
   ...> ('/home/aho/tmp/' || name),
   ...> writefile(('/home/aho/tmp/' || name), sqlar_uncompress(data, sz), mode, mtime)

I don't want to use the path literally like '/home/aho/tmp/' in a sqlite session. Is there a way to specify the parameter $dir via the command line as in sqlite3 test.sqlar -A --extract --directory ~/tmp /etc/logrotate.conf? So that I can call the command returned by -n without replacing the variables like $dir to their actually value?

16:04 Reply: Extract files from sqlar with a directory prepended (artifact: c73b91b3c2 user: 6kEs4Majrd)

It gives you the actual paths extracted, to be printed with --verbose.

What do you mean? I don't see how to print the first ($dir || name).

$ sqlite3 --verbose test.sqlar -Axn
sqlite3: Error: unknown option: -verbose
Use -help for a list of options.

How to run the SQL commands in an sqlite3 session without using -Ax. I tried the following. But no files are written.

$ sqlite3 test.sqlar
SQLite version 3.35.5 2021-04-19 18:32:05
Enter ".help" for usage hints.
sqlite> SELECT  ($dir || name), writefile(($dir || name), sqlar_uncompress(data, sz), mode, mtime) FROM sqlar WHERE (1) AND (data IS NULL OR $dirOnly = 0);
sqlite>
2021-06-13
16:01 Reply: Extract files from sqlar with a directory prepended (artifact: 93cb0840b7 user: 6kEs4Majrd)

Does ($dir || name) right after 'SELECT' do anything?

So GLOB '*..[/\]*' does match abc../? In other words, this glob can miss some legal paths?

Why is (1) needed?

If I want to ignore abspaths, I should add AND NOT GLOB '/*' at the end?

13:01 Reply: Extract files from sqlar with a directory prepended (artifact: 7be4875762 user: 6kEs4Majrd)

Here is what I see. I don't quite understand where dirOnly and dirare defined.

https://sqlite.org/cli.html

writefile() only has two arguments according to the above page. Why does it have 4 arguments below ($dir || name), sqlar_uncompress(data, sz), mode, mtime?

Why does the same select command appear twice below?

What does name NOT GLOB '*..[/\]*' mean? So things like abc../ is ignored? This does not make sense to me.

$ sqlite3 -Axfn my.db
-- open database 'my.db'
SELECT  ($dir || name), writefile(($dir || name), sqlar_uncompress(data, sz), mode, mtime) FROM sqlar WHERE (1) AND (data IS NULL OR $dirOnly = 0) AND name NOT GLOB '*..[/\]*'
SELECT  ($dir || name), writefile(($dir || name), sqlar_uncompress(data, sz), mode, mtime) FROM sqlar WHERE (1) AND (data IS NULL OR $dirOnly = 0) AND name NOT GLOB '*..[/\]*'
2021-06-12
23:49 Post: Extract files from sqlar with a directory prepended (artifact: 4f099a2dda user: 6kEs4Majrd)

https://www.sqlite.org/sqlar.html

It shows the following command on how to extract an sqlar file.

sqlite3 example.sqlar -Ax

I'd like to extract the files but prepend the path with some directory name.

For example, if there is a file called myfile.txt, I'd like it to be extracted to mydir/myfile.txt instead of to ./myfile.txt. For files starting with '/', they are considered as illegal, errors should be printed and nothing should be extracted.

Could anybody show me the sqlite3 code to achieve this? If it is not possible to achieve this in raw sqlit3 code, can anybody show me the code in python asqw?

2021-04-03
17:05 Post: How to dump a table to a TSV file but escape characters properly (artifact: f68281a41d user: 6kEs4Majrd)

I can use the following command to dump a table into TSV format. But the problem is that when any field contains tab and newline characters, the output will be messed up. Instead, it is better to escape them as 't' and 'n', and escape '' as '\'.

sqlite3 -readonly -header -separator $'\t' dbfile "select * from mytable;"

As another mode, it may be better to escape all nonprintable characters.

Does anybody have a good way to do these? Thanks.

05:08 Post: About virutal table and FTS4 (artifact: 43cac36b7e user: 6kEs4Majrd)

I see VIRTUAL TABLE like the following. I don't quite understand how it works. My understanding is that it does not hold the real data. The data are from other tables. But for this specific case how to figure out where the data of each field is from?

sqlite3 -separator $'\t' getsploit.db 'select * from sqlite_master where type="table";'
table	exploits	exploits	0	CREATE VIRTUAL TABLE exploits USING FTS4(id text, title text, published DATE, description text, sourceData text, vhref text)
table	exploits_content	exploits_content	2	CREATE TABLE 'exploits_content'(docid INTEGER PRIMARY KEY, 'c0id', 'c1title', 'c2published', 'c3description', 'c4sourceData', 'c5vhref')
table	exploits_segments	exploits_segments	3	CREATE TABLE 'exploits_segments'(blockid INTEGER PRIMARY KEY, block BLOB)
table	exploits_segdir	exploits_segdir	4	CREATE TABLE 'exploits_segdir'(level INTEGER,idx INTEGER,start_block INTEGER,leaves_end_block INTEGER,end_block INTEGER,root BLOB,PRIMARY KEY(level, idx))
table	exploits_docsize	exploits_docsize	6	CREATE TABLE 'exploits_docsize'(docid INTEGER PRIMARY KEY, size BLOB)
table	exploits_stat	exploits_stat	7	CREATE TABLE 'exploits_stat'(id INTEGER PRIMARY KEY, value BLOB)

For FTS4, is it just to speed up the search?

So the virtual table could be created by CREATE VIRTUAL TABLE exploits (id text, title text, published DATE, description text, sourceData text, vhref text)? The only drawback is the search speed is much slower but the results should be the same whether FTS4 is used or not? Thanks.

2021-03-14
14:53 Post: How to insert control characters? (artifact: 1266dd6092 user: 6kEs4Majrd)

I want to insert control characters like TAB, CR or x01. I am not sure what is the correct way to do so. Could anybody let me know? Thanks.

sqlite3 dbfile <<-'EOF' create table test (id integer primary key, value text); insert into test (value) values('atb'); EOF

2020-09-21
22:44 Post: How does sparse files used by sqlite3? (artifact: be1129fd4c user: 6kEs4Majrd)

Many file systems support sparse files. Could anybody let me know whether sqlite3 uses such a feature? If so, how is it used? Thanks.

http://blogs.tulsalabs.com/?p=166

2020-08-16
04:22 Reply: How is a table with only primary key stored on disk? (artifact: 61866bcb9e user: 6kEs4Majrd)

Therefore the application of rudimentary grade school arithmetic will allow you to compute the maximum number of keys K that can be stored in the available space.

So K is not a fixed number? It is not clear from the manual as the manual does not define what "K" is. This looks like a problem that should be fixed in the manual. What if a string to save is too large for a page? Then K is less than 1? What does K<1 means?

create table t ( c0, c1, c2, c3 ... primary key ( key_column_list ), unique ( key_column_list ) ); create index i on t ( key_column_list );

Are the "keys" mentioned in the manual the same as the "keys" in SQL code? The word "keys" mean differently in different context.

In the above example code, what should be stored in a page? And what are the binary layout in a page?

01:19 Reply: How is a table with only primary key stored on disk? (artifact: bcd1547601 user: 6kEs4Majrd)

It says "An interior page contains K keys together with K+1 pointers to child b-tree pages."

How to make sure the K keys fit in a page? Can keys be strings so that there will be no guarantee that K keys will fit in a page?

Also what are the keys metioned in fileformat.html and the keys in the sense of SQL language? Thanks.

2020-08-15
22:30 Post: How is a table with only primary key stored on disk? (artifact: d70df22a8b user: 6kEs4Majrd)

I want to understand how database storage is implemented in sqlite3. In the simplest case, it should be a table with just one primary key field. Could anybody let me know how it is implemented in sqlite3?

https://en.wikipedia.org/wiki/Database_storage_structures

https://jvns.ca/blog/2014/09/27/how-does-sqlite-work-part-1-pages/

The above page says that sqlite3 is based on B tree. But isn't B+ tree better than B tree? Thanks.

https://www.javatpoint.com/b-plus-tree

Any details about the implementation can be explained?

2020-07-09
16:26 Reply: Can -wal -shm files be deleted if -wal file is empty? (artifact: 989cec670e user: 6kEs4Majrd)

Is there a way to suppress the output of sqlite3 databasefile.db "SELECT count(*) FROM sqlite_master;"?

I tried .output /dev/null; SELECT count(*) FROM sqlite_master;. But it is not working. Could you show me the correct way to suppress the output? (I don't want to use shell redirection. I want to use sqlite3's feature to suppress the output.)

16:24 Reply: Can -wal -shm files be deleted if -wal file is empty? (artifact: 820fe498df user: 6kEs4Majrd)

As I mentioned before, there is no other processing accesing the db. So they are equivalent?

14:22 Reply: Can -wal -shm files be deleted if -wal file is empty? (artifact: accb68153b user: 6kEs4Majrd)

It seems that when I run sqlite3 databasefile.db "SELECT count(*) FROM sqlite_master;", it just delete -shm and -wal file without touching the main db file.

Given -wal is empty, is that command guaranteed to be equivalent to manually deleting -wal and -shm? If not, under what condition they are not equivalent? Thanks.

14:18 Reply: Can -wal -shm files be deleted if -wal file is empty? (artifact: e60607d062 user: 6kEs4Majrd)

But wouldn't it cause the main db file to be out-of-synch? At this moment, the main db file are the same in the two locations.

13:15 Reply: Can -wal -shm files be deleted if -wal file is empty? (artifact: 23a91c3f2f user: 6kEs4Majrd)

See my other reply. It is due to syncing problems of the database files. I always only update the db at one location. There may also be improper shutdown of the machine involved (but I am not very sure). So it could be 1) update the db at location A, 2) synch (which will copy -wal and -shm to location B), 3) update the db at location B.

13:07 Reply: Can -wal -shm files be deleted if -wal file is empty? (artifact: 71de48e3fe user: 6kEs4Majrd)

The problem is that I have the -wal -shm files on two different locations which are synced. The -shm files are out of synce, but -wal files are both empty. And I am sure there are no running sqlite3 processes accessing the databases on either locations.

In this case, I am not sure that updating the database files using the command that you mentioned on two locations makes sense. So just deleting the -wal and -shm files should be sufficient?

2020-07-08
15:35 Post: Can -wal -shm files be deleted if -wal file is empty? (artifact: 2a8c51e0b8 user: 6kEs4Majrd)

I see some -wal -shm files left on my disk when there are no sqlite3 processes running. The -wal file is empty. In this case, I can just delete the -wal and -shm files. Thanks.

2020-06-21
22:42 Reply: What reading wal can result in `BusyError: database is locked`? (artifact: ca4409f86d user: 6kEs4Majrd)

I only add rows to the database when the database is updated in other processes. So the wal is not affected. In this case, is there a way to relieve this restriction so that I can read the state of wal?

08:36 Post: What reading wal can result in `BusyError: database is locked`? (artifact: 8b05f1adf7 user: 6kEs4Majrd)

I got the following error when I read the journal_mode (in python apsw).

    if c.execute('PRAGMA journal_mode;').fetchone()[0] != 'wal':
  File "src/cursor.c", line 1019, in APSWCursor_execute.sqlite3_prepare
  File "src/statementcache.c", line 386, in sqlite3_prepare
apsw.BusyError: BusyError: database is locked

Since I just read something from the database, why does it matter whether the database is locked or not? Thanks.

2020-06-20
21:57 Post: C API for holding shm file in cache memory on Mac OS X and Linux? (artifact: 772444b72b user: 6kEs4Majrd)

https://www.sqlite.org/walformat.html

I see this sentence on the above page. Does anybody know what is the underlying C API to achieve this on Mac OS X and Linux? Thanks.

"In fact, if there were a mechanism by which SQLite could tell the operating system to never persist the shm file to disk but always hold it in cache memory, SQLite would use that mechanism to avoid any unnecessary disk I/O associated with the shm file."

2020-06-14
15:32 Reply: How db access is handled by sqlite3? (artifact: eb659095e7 user: 6kEs4Majrd)

https://www.man7.org/linux/man-pages/man3/shm_open.3.html

If multiple processes access the same database, does sqlite3 share some memory among these processes using something like shm_open()? Thanks.

2020-06-13
21:48 Post: How db access is handled by sqlite3? (artifact: a93fe6b498 user: 6kEs4Majrd)

I see that there are at least two ways to access a file, i.e., mmap() vs fread/fwrite(). Which one is used by sqlite3 when it access a database file? Thanks.

https://stackoverflow.com/questions/30385635/why-use-mmap-over-fread

2020-06-05
14:43 Reply: How to not to overwrite a row if the key exists? (artifact: eae18c1b63 user: 6kEs4Majrd)

This is helpful. In my example code, the only constraint is on the primary key column name.

In this specific case,

INSERT or IGNORE INTO x VALUES (...);

the above should be better than the following?

REPLACE INTO x VALUES (...) ON CONFLICT DO NOTHING; INSERT or REPLACE INTO x VALUES (...) ON CONFLICT DO NOTHING;

(At least, the first command is shorter to type:))

14:00 Reply: How to circumvent "apsw.BusyError: BusyError: database is locked"? (artifact: 69e3bb00bd user: 6kEs4Majrd)

I got this error.

  File "/xxx.py", line 23, in <module>
    c.execute('PRAGMA journal_mode=WAL;')
  File "src/cursor.c", line 1019, in APSWCursor_execute.sqlite3_prepare
  File "src/statementcache.c", line 386, in sqlite3_prepare
apsw.BusyError: BusyError: database is locked

So the following code would also cause the same error?

if c.execute('pragma journal_mode=wal;').fetchone()[0] <> 'wal':
      raise apsw.Error('Cannot change database to WAL')

So this is better as the first if just read the journal_mode and would not cause apsw.BusyError? Since WAL mode is persistent, as it as long as it is set once, the first if will always be false then the rest pragma journal_mode=wal; statement will not be called. Hence, there will be much less chance to see apsw.BusyError?

if c.execute('pragma journal_mode;').fetchone()[0] <> 'wal':
   if c.execute('pragma journal_mode=wal;').fetchone()[0] <> 'wal':
      raise apsw.Error('Cannot change database to WAL')
13:56 Reply: How to circumvent "apsw.BusyError: BusyError: database is locked"? (artifact: 01fb49df7f user: 6kEs4Majrd)

I notice you've got a busy_timeout of 5... Thanks. I knew it. It was a typo.

13:03 Reply: How to not to overwrite a row if the key exists? (artifact: a0693a9e8a user: 6kEs4Majrd)

This becomes confusing.

"INSERT or IGNORE INTO" is on the diagram at the top of the following webpage. But I don't see it is explained in the text on that webpage. What does it mean?

https://www.sqlite.org/lang_insert.html

I don't want to insert a record if the name of the try-to-be-inserted record exists and insert the record if the try-to-be-inserted record does not exist. Is INSERT or IGNORE INTO x VALUES (...); the best to use for this case? Thanks.

05:15 Reply: How to circumvent "apsw.BusyError: BusyError: database is locked"? (artifact: 2787e066f9 user: 6kEs4Majrd)

What about changing journal_mode from WAL to default? Is it harmful?

For the script, I don't know what I should test to conditionally use PRAGMA journal_mode=WAL;. Do you know how the code should be changed?

import apsw
conn = apsw.Connection(sys.argv[1])
c = conn.cursor()
c.execute('pragma busy_timeout=5;')
c.execute('PRAGMA journal_mode=WAL;')
c.execute('BEGIN IMMEDIATE;')
try:
        c.execute('CREATE TABLE IF NOT EXISTS sqlar(name TEXT PRIMARY KEY, mode INT, mtime INT, sz INT, data BLOB)')
        for (name, len_data, zlib_data) in n2d:
                c.execute('REPLACE INTO sqlar VALUES(?, ?, ?, ?, ?)', [name, 0, 0, len_data, zlib_data])
        c.execute('COMMIT')
except:
        c.execute('ROLLBACK')
        raise
02:27 Reply: How to circumvent "apsw.BusyError: BusyError: database is locked"? (artifact: 147f135331 user: 6kEs4Majrd)

So I need to add the following line to both the writing python apsw program and the reading python apsw program showed in the two previous messages in this thread? Or I just need to add this line to one of the two programs? Thanks.

PRAGMA journal_mode=WAL;
2020-06-04
23:09 Post: How to not to overwrite a row if the key exists? (artifact: e43fdc7c65 user: 6kEs4Majrd)

I use the following sqlite3 code (in python apsw) to overwrite or insert a row in a database.

c.execute('CREATE TABLE IF NOT EXISTS sqlar(name TEXT PRIMARY KEY, mode INT, mtime INT, sz INT, data BLOB)')
c.execute('REPLACE INTO sqlar VALUES(?, ?, ?, ?, ?)', [name, 0, t, len_data, zlib_data])

If I want to only insert the row when the key name does not exist, what is the correct way to do it? Thanks.

18:25 Reply: How to circumvent "apsw.BusyError: BusyError: database is locked"? (artifact: 4c4a6382b8 user: 6kEs4Majrd)

Now, I have multiple write processes using BEGIN IMMEDIATE. Then, I access the db for reading only using the following code.

import apsw
conn = apsw.Connection(f, flags = apsw.SQLITE_OPEN_READONLY)
c = conn.cursor()
c.execute('pragma busy_timeout=5000;')
try:
	for x in c.execute('SELECT name FROM sqlar'):
		print(x[0])
except:
	print(sys.argv[1] % ("Failed to process '%s'." % f), file=sys.stderr)
	raise

I still get the same error.

apsw.BusyError: BusyError: database is locked

Do you know how to fix the problem?

12:01 Reply: How to circumvent "apsw.BusyError: BusyError: database is locked"? (artifact: 9dad5cb85f user: 6kEs4Majrd)

I use the following code (n2d contains the data to be written, n2d of different processes are not guaranteed to be different, they may be partially the same of completely the same). I don't understand how deadlock is caused and how to solved it? I basically just write to the db (except checking if the table is avaialble), so by rewriting the code the deadlock could be removed? Thanks.

import apsw
conn = apsw.Connection(sys.argv[1])
c = conn.cursor()
c.execute('pragma busy_timeout=2147483647;')
c.execute('BEGIN TRANSACTION;')
try:
	c.execute('CREATE TABLE IF NOT EXISTS sqlar(name TEXT PRIMARY KEY, mode INT, mtime INT, sz INT, data BLOB)')
	for (name, len_data, zlib_data) in n2d:
		c.execute('REPLACE INTO sqlar VALUES(?, ?, ?, ?, ?)', [name, 0, 0, len_data, zlib_data])
	c.execute('COMMIT')
except:
	c.execute('ROLLBACK')
	raise
02:11 Reply: How to circumvent "apsw.BusyError: BusyError: database is locked"? (artifact: c8e6f96d76 user: 6kEs4Majrd)

In my application, I can be sure that there will be a deadlock between two processes. In such a case, what can be done to prevent BusyError from occurring. Thanks.

00:13 Post: How to circumvent "apsw.BusyError: BusyError: database is locked"? (artifact: db4c917a7b user: 6kEs4Majrd)

I see this error.

apsw.BusyError: BusyError: database is locked

I've already used. So busy_timeout does not solve this problem.

pragma busy_timeout=2147483647;

What else can be done to circumvent this problem? Thanks.

2020-06-03
23:33 Reply: Is there a way to wait until the database is available? (artifact: b557e6542e user: 6kEs4Majrd)

Could you show the code that should be used at the SQL command level instead of the C level? Thanks.

23:31 Reply: Is there a way to wait until the database is available? (artifact: a1fd2899d8 user: 6kEs4Majrd)

The max signed 32 bit integer is 2147483647. So I should set it as the following?

pragma busy_timeout=2147483647;
22:09 Post: Is there a way to wait until the database is available? (artifact: 6e6acaef86 user: 6kEs4Majrd)

There is busy_timeout. But it must be a finite number. Is there a way to let sqlite3 wait until the database is available? Thanks.

pragma busy_timeout=10000;