SQLite Forum

Timeline
Login

17 forum posts by user andse-t610

2021-07-19
20:43 Reply: Tips for getting SQLite under Python to perform better with more RAM? (artifact: faf984d13f user: andse-t610)

I have the similar situation in my python web api app: many read, few write.

Here are my observations.

Page cache is private by default, so it lives only while connection is alive. So if your application opens a new connection for each new request, page cache won't help much, because each connection will start "cold" with an empty cache and read data from disk.

I've tested 3 approaches to speed up in such case:

  • use connection pool and private cache. Slightly worse then next variant in my tests.

  • use shared cache for all connections. At app start you create "master connection" to ensure the shared cache alive all app run time. So next db connections will not start "cold", but "warm" using previous cache data.

But keep in mind that shared cache is not recommended to use and in general it is intended for other cases. But it really works for me, sorry for the "harmful advice" :-)

In the extreme case (page cache can contain entire db and all db data loaded into cache) it becomes in-memory db with disk-sync.

So, the third approach:

  • use in-memory db if you have enough RAM, few writes and can reduce the requirements for persistence when app fails.

At app start create "master connection" to file::memory:?cache=shared, load data from disk using backup api. Next your app uses this in-memory db when read. On write you have problems - you have to write into in-memory db and disk db. I haven't done this part yet, but you can use triggers + attached disk-db (so you have single transaction, but only if your disk-db not in WAL-mode - else you have separate transaction for each db), or just write python code and hope that both db will be successfully updated (if not - you can just reload in-memory db from disk - and let the whole world wait...).

I have tested my app in read-only scenario with these modifications under heavy load - shared cache is faster, in memory-db is several times faster.

Since 3.36.0 you can create shared in-memory db in another way - file:/memdbname?vfs=memdb. Source code commit says this mode doesn't use shared cache. But i have not tested it vs :memory: yet. @drh advices use memdb, but the question is not explained in the docs yet.

By the way, there are big problems with updating the sqlite version of the python sqlite3 module. I can advice you pysqlite3 module - it allows you to compile a standalone python module using sqlite3 amalgamation of any required version. The interface is similar to the standard python sqlite3 module. It is hardly googled because shadowed by pysqlite repo, which is deprecated after including it's code in the python standard library.

P.S. Excuse me for my english

19:18 Reply: Tips for getting SQLite under Python to perform better with more RAM? (artifact: 8a5e64124d user: andse-t610)

If the underlying database file is updated by a different connection, the application page cache is discarded.

After db update, arr all caches in all other connections are discarded? Or only some dirty pages?

2021-07-06
15:47 Reply: shared in memory db: :memory:+shared cache OR memdb VFS? (artifact: bbb74ad37b user: andse-t610)

I will wait for the documentation update. Thank you!

15:10 Reply: shared in memory db: :memory:+shared cache OR memdb VFS? (artifact: 1083e36553 user: andse-t610)

Unfortunately no wal:

PRAGMA jorunal_mode=wal; PRAGMA journal_mode

returns memory for each in-memory variant.

15:06 Reply: shared in memory db: :memory:+shared cache OR memdb VFS? (artifact: e2a427b766 user: andse-t610)

I'm confused...

This forum post and mentioned there commit says:

This provides a way for threads to share an in-memory database without the use of shared-cache mode

where is the truth?

14:20 Edit: shared in memory db: :memory:+shared cache OR memdb VFS? (artifact: c1004073ca user: andse-t610)

Hello! I want to have in-memory db shared between multiple db connections.

There was one way before release 3.36.0 that is described in the in-memory db docs. That is shared cache:

file::memory:?cache=shared

or as i understand the same

file:memdbname?mode=memory&cache=shared

sqlite 3.36.0 release notes says:

The "memdb" VFS now allows the same in-memory database to be shared among multiple database connections in the same process as long as the database name begins with "/".

So now I can use

file:/memdbname?vfs=memdb

This way currently is not described in the docs (even nothing is found on request "memdb").

Which should I use? Can someone describe advantages and disadvantages of each variant?

14:01 Post: shared in memory db: :memory:+shared cache OR memdb VFS? (artifact: e0784d829e user: andse-t610)

Hello! I want to have in-memory db shared between multiple db connections.

There was one way before release 3.36.0 that is described in the in-memory db docs. That is shared cache:

file::memory:?cache=shared

or as i understand the same

file:memdbname?mode=memory&cache=shared

sqlite 3.36.0 release notes says:

The "memdb" VFS now allows the same in-memory database to be shared among multiple database connections in the same process as long as the database name begins with "/".

So now I can use

file:/memdbname&vfs=memdb

This way currently is not described in the docs (even nothing is found on request "memdb").

Which should I use? Can someone describe advantages and disadvantages of each variant?

2021-06-04
19:34 Reply: Broken logical NOT in ext/misc/regexp.c (artifact: d0b9108a0b user: andse-t610)

Great job! Thank you!

15:05 Edit: Broken logical NOT in ext/misc/regexp.c (artifact: 9104f0d9e7 user: andse-t610)

I have found that logical not operations like [^something], \W, \D don't work when used alone (without modifiers like + $ ^, except {}).

select 'abc' regexp '\W' ---> 1

select '123' regexp '\D' ---> 1

select 'abc' regexp '[^a-z]' ---> 1

select 'abc' regexp '\W{1,1}' --> 1

select 'abc' regexp '\W{1}' --> 1

but

select 'abc' regexp '\W+' ---> 0

select '123' regexp '^\D' ---> 0

select 'abc' regexp '[^a-z]$' ---> 0

select 'abc' regexp '\W{2}' --> 0

I suspect that something is wrong with the end-of-text event handling...

14:57 Post: Broken logical NOT in ext/misc/regexp.c (artifact: 3fa09346c1 user: andse-t610)

I have found that logical not operations like [^something], \W, \D don't work when used alone (without modifiers like + $ ^, except {}).

select 'abc' regexp '\W' ---> 1

select '123' regexp '\D' ---> 1

select 'abc' regexp '[^a-z]' ---> 1

select 'abc' regexp '\W{1,1}' --> 1

select 'abc' regexp '\W{1}' --> 1

but

select 'abc' regexp '\W+' ---> 0

select '123' regexp '^\D' ---> 0

select 'abc' regexp '[^a-z]$' ---> 0

select 'abc' regexp '\W{2}' --> 0

12:49 Reply: Hidden noCase bug in ext/misc/regexp.c in pRe->zInit prefix optimization (artifact: 8711fd9380 user: andse-t610)

Clearly. I think we'll get the same problems if we change the collation logic on the existing database (like here)

All clear. Thanks a lot!

11:06 Reply: Hidden noCase bug in ext/misc/regexp.c in pRe->zInit prefix optimization (artifact: 4a68fe4b04 user: andse-t610)

The function regexpi should be mentioned in the docs in lines 10-11

10:39 Reply: Hidden noCase bug in ext/misc/regexp.c in pRe->zInit prefix optimization (artifact: 46e5440d75 user: andse-t610)

Thank you for your lightning response! With such developers, sqlite is not in danger)

A question along the way - is there any reason not to use flag SQLITE_DETERMINISTIC when declaring functions regexp and regexpi?

what bad (or good or nothing) things can happen if I use it?

I have found that without this flag i can't use regexp in the partial index (as the docs says).

2021-06-03
12:16 Edit: Hidden noCase bug in ext/misc/regexp.c in pRe->zInit prefix optimization (artifact: 3511f8364f user: andse-t610)

Hello! I'm trying to modify regexp.c extension for my application. So I added 3-args regexp(re, text, noCase) function.

I'm noticed, that queries like

select regexp('abc', 'ABC', 1);

select regexp('ABC', 'ABC', 1)

doesn't match.

But select regexp('ABC', 'abc', 1) matches.

I found out that it is because of the prefix optimiation in re_compile in lines 668-691.

Upper, when calling re_subcompile_re at 651, the text of the regular expression is written in lowercase, so the prefix is stored in lowercase.

Then in re_match at line 217 the prefix is compared with text case-sensitive.

The simplest way to fix it - disable prefix optimization in noCase mode - modify line 676 from

if( pRe->aOp[0]==RE_OP_ANYSTAR ){

to

if( !noCase && pRe->aOp[0]==RE_OP_ANYSTAR ){


12:15 Reply: Hidden noCase bug in ext/misc/regexp.c in pRe->zInit prefix optimization (artifact: d58f873133 user: andse-t610)

Another approach - modify lines 216-217 in re_match.

But there are three problems:

  • first char optimization zIn[in.i]!=x is case sensitive always. Simplest- disable this optimization in noCase mode
  • strcncmp is case sensitive - we can use sqlite3_strnicmp in noCase mode
  • (!) how to determine that current regexp compiled with noCase flag?
10:27 Edit: Hidden noCase bug in ext/misc/regexp.c in pRe->zInit prefix optimization (artifact: 983b43ef8e user: andse-t610)

Hello! I'm trying to modify regexp.c extension for my application. So I added 3-args regexp(re, text, noCase) function.

I'm noticed, that queries like

select regexp('abc', 'ABC', 1);

select regexp('ABC', 'ABC', 1)

doesn't match.

But select regexp('ABC', 'abc', 1) matches.

I found out that it is because of the prefix optimiation in re_compile in lines 668-691.

Upper, when calling re_subcompile_re at 651, the text of the regular expression is written in lowercase, so the prefix is stored in lowercase.

Then in re_match at line 212 the prefix is compared with text case-sensitive.

The simplest way to fix it - disable prefix optimization in noCase mode - modify line 676 from

if( pRe->aOp[0]==RE_OP_ANYSTAR ){

to

if( !noCase && pRe->aOp[0]==RE_OP_ANYSTAR ){


2021-06-01
23:21 Post: Hidden noCase bug in ext/misc/regexp.c in pRe->zInit prefix optimization (artifact: 92d0d73305 user: andse-t610)

Hello! I'm trying to modify regexp.c extension for my application. So I added 3-args regexp(re, text, noCase) function.

I'm noticed, that queries like

select regexp('abc', 'ABC', 1);

select regexp('ABC', 'ABC', 1)

doesn't match.

But select regexp('ABC', 'abc', 1) matches.

I found out that it is because of the prefix optimiation in re_compile in lines 668-691. We can see that the prefix stored as is (no case modification).

Then in re_match at line 212 the prefix is compared case-sensitive.

The simplest way to fix it - disable prefix optimization in noCase mode - modify line 676 from

if( pRe->aOp[0]==RE_OP_ANYSTAR ){

to

if( !noCase && pRe->aOp[0]==RE_OP_ANYSTAR ){