SQLite Forum

Timeline
Login

41 forum posts by user casaderobison occurring on or before 2021-06-14 13:23:15.

More ↑
2021-06-14
13:23 Reply: SQLite error (7): failed to HeapReAlloc 465625658 bytes (8), heap=7A20000 (artifact: 2728d9471a user: casaderobison)

I had a program some years ago that did a very similar pattern, and the problem is that that version of FTS needs to create some really large blobs when merging FTS nodes. I don't remember all the details, but my solution at the time was to hand shard the data to keep the size of FTS tables smaller, then run multiple queries to gather the same data.

http://sqlite.1065341.n5.nabble.com/Segfault-during-FTS-index-creation-from-huge-data-td81465.html seems to be a link to a six year old thread on the same or very similar / adjacent topic.

2021-05-31
02:15 Reply: SQLITE_IOERR on `pragma journal_mode=wal;` (artifact: dbf77f0266 user: casaderobison)

I was hedging because it seems to me that if multiple processes open a remote database in WAL mode from a single host, and no other connections are open, then they'll all be sharing the same memory. That might "work" for sufficiently fuzzy values of "work".

But yes, given that remote file systems are never recommended, your point is well made.

01:29 Reply: SQLITE_IOERR on `pragma journal_mode=wal;` (artifact: 8185b5d049 user: casaderobison)

Is there any chance the beta user is trying to access the file over a network file system? I know the message states it uses EXT4, but if it is a typical system, there are any number of other file systems that might be mounted. Without knowing the fstab and path to the file, EXT4 isn't a guarantee.

If there is a chance of remote file system usage, that could be a problem. I don't know that it would return that exact error message, but WAL especially is intended to be used exclusively by processes on a single host due to shared memory usage.

2021-05-18
01:14 Reply: memory vs mmap (artifact: 5ec1dd8821 user: casaderobison)

I don't know enough about the implementation detail, but I was trying to show you a big reason why an identical code path in SQLite accessing "simple raw pointers" could have wildly varying execution speeds due to the decisions made in implementing that underlying mapping object.

It is my experience that an mmap using application will typically create one big mapping object for an entire file, then any access to that will access the underlying memory object. However, I believe that due to the design decisions when mmap was first added to SQLite, it is only used for reading (not writing) data (which is your use case) as long as the file was under some certain size. Subsequently, I seem to recall that problems were discovered with one or more platforms mmap implementations and it was "deprecated" (that might not be the right word, but it seems right).

The page cache may perform one big allocation then access individual pages within that, but there is no requirement that the pages be continuous. A mmap object will necessarily be a contiguous chunk of the file.

So effectively you have one big mmap object that represents a sequential range of pages of the file, or you have a page cache which represents many potentially random pages in many page size chunks. This means that the method of accessing a memory DB will be of necessity very different than the code that accesses a mmap version, even though they both use pointers.

00:26 Reply: memory vs mmap (artifact: d3012a5759 user: casaderobison)

As has been written, there are many ways the mmap can be implemented. One model might be:

Allocate one page of physical memory to the mapped file object. Any time you access a different address in the mapped file object, it will write that page back to disk and load a new page from disk to physical memory.

Another might be:

Allocate one page of physical memory for each page of the mapped file object. Do not preload any data. As access is made to each page of the mapped file, a page fault is generated that will read the data from the file to physical memory.

These will both have radically different performance profiles. The first will be thrashing that one page repeatedly (assuming one is accessing random locations within the mapped file object). The other will "quickly" allocate memory, but it will slow down reading random pages from the file as one continues to access random locations.

Ultimately, a bespoke algorithm that knows a lot about how the data will be accessed will potentially be more efficient than one that must handle generic workflows.

We don't know exactly how mmap is implemented, and it has to work regardless of page visitation order.

In the case of creating a memory database, it will allocate enough space for all the pages of the database and never need to fault a page to load it from disk (except in the case of swapping, perhaps, if too much memory has been allocated).

In no way can you ever assume that accessing malloced pages is ever going to be exactly the same as mmap, as there is a lot more at play in creating memory map objects.

2021-05-13
09:53 Edit reply: How to convert am/pm time to 24hrs time format (artifact: c07e6cb3a7 user: casaderobison)

Though I will extend my "intuition" by this example:

12:00:01 pm
12:00:00.1 pm
12:00:00.01 pm
12:00:00.001 pm
12:00:00.0001 pm
12:00:00.00[...]001 pm

This leads me to the intuitive understanding that as we get closer and closer and closer to exactly 12 noon from the right side, we are on the pm side of the line, so I think the 2008 change to make 12 pm be noon was the right move.

Given that "0.999..." is exactly equal to "1.0" (https://en.wikipedia.org/wiki/0.999...) it is also intuitive to me that "11:59:59.999... am" is equal to "12:00:00 pm".

But we're talking about time standards that are not universal constants, which is why we have leap days, leap seconds, daylight time, summer time, standard time, all sorts of different legal standards for them by various governments, time zones, epochs. None of it makes "sense" in that regard.

"11:59:59 am {tick} 12:00:00 am {tick} 12:00:01 pm" hurts my OCD.

Ultimately I agree with the idea that "exactly" noon cannot be AM or PM by the very meaning of the words. Some years ago I worked on software for a company that supported radio station programming. In building a format clock, I had a list of hours of the day that were labeled "1P", "2P", ... "11P", and I used "12M" as the hour that came between "11P" and "1A" just to avoid the ambiguity. "12P" or "12N" doesn't have as much of a problem in my mind since it occurs in the middle of the day, but it is always confusing trying to talk to people about which day "12A" belongs to (but really it is the same problem when referring to "12M"). Until 2019 ISO 8601 allowed using 24:00:00 to refer to the end of the day but now that is forbidden.

09:41 Edit reply: How to convert am/pm time to 24hrs time format (artifact: 75ce300a98 user: casaderobison)

Though I will extend my "intuition" by this example:

12:00:01 pm
12:00:00.1 pm
12:00:00.01 pm
12:00:00.001 pm
12:00:00.0001 pm
12:00:00.00[...]001 pm

This leads me to the intuitive understanding that as we get closer and closer and closer to exactly 12 noon from the right side, we are on the pm side of the line, so I think the 2008 change to make 12 pm be noon was the right move.

Given that "0.999..." is exactly equal to "1.0" (https://en.wikipedia.org/wiki/0.999...) it is also intuitive to me that "11:59:59.999... am" is equal to "12:00:00 pm".

But we're talking about time standards that are not universal constants, which is why we have leap days, leap seconds, daylight time, summer time, standard time, all sorts of different legal standards for them by various governments, time zones, epochs. None of it makes "sense" in that regard.

"11:59:59 am {tick} 12:00:00 am {tick} 12:00:01 pm" hurts my OCD.

Ultimately I agree with the idea that "exactly" noon cannot be AM or PM by the very meaning of the words. Some years ago I worked on software for a company that supported radio station programming. In building a format clock, I had a list of hours of the day that were labeled "1P", "2P", ... "11P", and I used "12M" as the hour that came between the two just to avoid the ambiguity. "12P" or "12N" doesn't have as much of a problem in my mind since it occurs in the middle of the day, but it is always confusing trying to talk to people about which day "12A" belongs to (but really it is the same problem when referring to "12M"). Until 2019 ISO 8601 allowed using 24:00:00 to refer to the end of the day but now that is forbidden.

09:35 Reply: How to convert am/pm time to 24hrs time format (artifact: 057b5fd250 user: casaderobison)

Though I will extend my "intuition" by this example:

12:00:01 pm 12:00:00.1 pm 12:00:00.01 pm 12:00:00.001 pm 12:00:00.0001 pm 12:00:00.00[...]001 pm

This leads me to the intuitive understanding that as we get closer and closer and closer to exactly 12 noon from the right side, we are on the pm side of the line, so I think the 2008 change to make 12 pm be noon was the right move.

"11:59:59 am {tick} 12:00:00 am {tick} 12:00:01 pm" hurts my OCD.

Ultimately I agree with the idea that "exactly" noon cannot be AM or PM by the very meaning of the words. Some years ago I worked on software for a company that supported radio station programming. In building a format clock, I had a list of hours of the day that were labeled "1P", "2P", ... "11P", and I used "12M" as the hour that came between the two just to avoid the ambiguity. "12P" or "12N" doesn't have as much of a problem in my mind since it occurs in the middle of the day, but it is always confusing trying to talk to people about which day "12A" belongs to (but really it is the same problem when referring to "12M"). Until 2019 ISO 8601 allowed using 24:00:00 to refer to the end of the day but now that is forbidden.

09:21 Reply: How to convert am/pm time to 24hrs time format (artifact: f0080df80a user: casaderobison)

All true. This is why I labelled it "intuition" instead of "logic". I really didn't mean to come off as talking down about others views (though clearly they are wrong when they do not conform to mine). #haha

In my mind, it makes more sense that the "am pm" designation goes with the hour of the day. That being said, it is already non-intuitive that 12 maps to 0 and 12 and 24.

Intuition and logic are different things, even though people tend to conflate them. Hence my sincere thanks for pointing out that my intuition doesn't match others and thus that I need to be more precise when communicating the "round hours" of "exactly" noon and midnight.

05:03 Reply: Unexpected output from the SELECT statement (artifact: c5382cf1b0 user: casaderobison)

If you type these statements:

  explain select * from v0 where (v1=20) or (v1=10 and v2=10);

  explain select * from v0 where (v1=20) union select * from v0 where (v1=10 and v2=10);

You'll see that it generates very different VM programs for the first than the last. The second explain output clearly shows that v2 is checked, but not in the first.

Someone with more knowledge about the code generator will have to speak as to why the code generator works that way at present and what was necessary to eventually fix it.

03:02 Reply: How to convert am/pm time to 24hrs time format (artifact: 46b5f1f98e user: casaderobison)

That is an interesting article. I had no idea there was that level of "confusion".

To me, it seems intuitive that 12:00:01 am is one second after midnight, and 12:00:01 pm is one second after noon. By extension, it would be silly to consider 12:00:00 am as being offset by more than one second from 12:00:01 am. Never underestimate the ability of people to foul things up.

I do agree that when using a 12-hour clock in prose, it is probably best to use "midnight" or "noon" instead of "12 am" or "12 pm". Given the unlikely nature of any event happening exactly at the stroke of the instant of midnight or noon (instead of being offset by at least some fraction of a second), I feel people infer a precision by the term "12 am" or "12 pm" that doesn't exist, just as they infer a precision in binary floating-point arithmetic.

Thanks for sharing that. It's good to know government and legal sources don't understand (or haven't historically understood) it "intuitively" the way I do so that I can stop assuming it is as obvious as I think it is.

2021-05-06
23:25 Reply: "Office Space"... I'm losing all those pennies (artifact: 352ae63a98 user: casaderobison)

I suspect the discrepancy has to do with the digits of precision that float has. Float has 24 bits of precision (23 for subnormal) which amounts to 7.22 decimal digits of precision. MariaDB is undoubtedly going the extra mile and performing a rounding step internally before the final conversion to integer. This might be done via a IEEE 754 rounding mode, or it could be something they do in a final analysis.

Where MariaDB is a dedicated process that knows its only reason for existence is to work with a database, it is free to make all the decisions about how to utilize the floating point environment.

SQLite is but one part of a larger process, and defers those decisions to the process. Depending on what platform you are on, it might be possible to set an IEEE 754 rounding mode that would give you the same rounding mode. Alternatively, you could do something like:

SELECT CAST(CAST('16.15' AS FLOAT)*100+0.5 AS INT)

Or use ROUND as was suggested previously.

The long story short is: If all variables are the same, you will get the same answer on both platforms. If anything varies, then you will not.

What platform are you using? Operating system, compiler, runtime library, and versions.

2021-05-04
10:09 Edit reply: "Office Space"... I'm losing all those pennies (artifact: c89aaaad86 user: casaderobison)

That is part of the "lite" in SQLite. In the beginning was the C provided floating-point functionality that was the first-order approximation for decimal mode. It was lite to use existing functionality rather than implement its own decimal numeric processing.

SQLite has never claimed to be conformant to any particular SQL standard. Few database systems fully conform to any standard.

Fortunately, there are solutions to the missing functionality for those who require it, though it is still in a non-standard format (using functions instead of operators, for example).

10:06 Reply: "Office Space"... I'm losing all those pennies (artifact: 551c8c66a2 user: casaderobison)

That is part of the "lite" in SQLite. In the beginning was the C provided floating-point functionality that was the first-order approximation for decimal mode. It was lite to use existing functionality rather than implement its own decimal numeric processing.

SQLite has never claimed to be conformant to any particular SQL standard. Few database systems fully conform to any standard.

Fortunately, others have provided solutions to the missing functionality for those who require it.

01:47 Reply: License of SQLite pikchr RR diagrams (artifact: 5194dba69c user: casaderobison)

My concern was more to do with SQLite documentation. I know the compilable source code is in the public domain, but I also know there are copyright-protected portions of the project, such as SEE and certain portions of the testing infrastructure.

I just wanted to be extra sure I wasn't taken that which was not available, and I would never expect the license of pikchr to govern every image it creates.

2021-05-03
18:18 Reply: License of SQLite pikchr RR diagrams (artifact: b09c446f1c user: casaderobison)

Thank you.

18:09 Post: License of SQLite pikchr RR diagrams (artifact: ce848e8bbb user: casaderobison)

I'm finally working to finish a project I started long ago, and one of the things it "needs" is pretty railroad diagrams to illustrate the syntax / grammar of the "language". Pikchr is of course the perfect solution to this.

Is the SQLite documentation (including pikchr code) public domain or is there any problem using and adapting those "examples" for my own use?

2021-04-13
01:30 Reply: Please avoid all use of the "LL" suffix for long-long integer literals (artifact: a0bea06001 user: casaderobison)

Or use a macro to qualify them on a platform basis:

#if defined(_MSC_VER)
#define LL(x) (x##I64)
#else
#define LL(x) (x##LL)
#endif

if (somevar == LL(123456789012)) { somecode }
2021-04-07
16:08 Reply: open db in read only; explicitly store diff as WAL ? (artifact: a39339732b user: casaderobison)

I think it has been determined that SQLite doesn't do what you're asking for. One possibility that comes to mind would be to write your own VFS. In this VFS it would treat the source database as read only and would "magically" write all changes to a delta file of some sort. A separate utility could be used to merge the delta file back into the original database.

2021-03-11
02:59 Edit reply: import mishandles table name (artifact: e3252f6a6b user: casaderobison)

You opened an in-memory database. That database is by definition transient (aka temporary) but it is known as "main".

So why not just import into a table named summary. Then if you want to do some manipulation of the data to go into a persistent file, just attach that file with a specified name:

attach 'somefile.db' as nottemp;

Then you can do all the data manipulation you want from the temporary database named main into the persistent database named nottemp.

insert into nottemp.sometable
select some-list-of-columns
from main.summary
where some-list-of-conditions

I used that very technique for a data import project last year at work, importing awful looking CSV into a memory-based database named main, then transformed the data into a nicer format in tables in a database that I attached from a file.

02:58 Reply: import mishandles table name (artifact: 3282a9b78a user: casaderobison)

You opened an in-memory database. That database is by definition transient (aka temporary) but it is known as "main".

So why not just import into a table named summary. Then if you want to do some manipulation of the data to go into a persistent file, just attach that file with a specified name:

attach 'somefile.db' as nottemp;

Then you can do all the data manipulation you want from the temporary database named main into the persistent database named nottemp.

insert into nottemp.sometable
select some-list-of-columns
from main.summary
where some-list-of-conditions

I used that very technique for a data import project last year at work, importing CSV into a memory-based database named main into tables in a database that I attached from a file.

2021-03-10
02:57 Reply: Type of the column (artifact: 9e08293836 user: casaderobison)

It's in the header file, but it looks like this:

#ifdef SQLITE_INT64_TYPE
  typedef SQLITE_INT64_TYPE sqlite_int64;
# ifdef SQLITE_UINT64_TYPE
    typedef SQLITE_UINT64_TYPE sqlite_uint64;
# else
    typedef unsigned SQLITE_INT64_TYPE sqlite_uint64;
# endif
#elif defined(_MSC_VER) || defined(__BORLANDC__)
  typedef __int64 sqlite_int64;
  typedef unsigned __int64 sqlite_uint64;
#else
  typedef long long int sqlite_int64;
  typedef unsigned long long int sqlite_uint64;
#endif
typedef sqlite_int64 sqlite3_int64;
typedef sqlite_uint64 sqlite3_uint64;

So the exact incantation to get 64 bit integers depends on the environment, and can be customized by defining SQLITE_INT64_TYPE and (optionally) SQLITE_UINT64_TYPE macros.

2021-03-09
15:51 Reply: Type of the column (artifact: 0058b3b6be user: casaderobison)

Other answers are excellent but maybe this will be helpful to you (or someone).

SQLITE_INTEGER doesn't tell you anything about the size of the integer. Only that it is integral. It could be stored in one of several ways, from 8- up to 64-bit (though not all variants are used internally; there is no such thing as a 10-bit integer within the SQLite storage model, for example).

If you truly don't know in advance what size of data you expect to get back from the database, you must assume it is the full 64-bit and use the int64 API.

If you want to know "in advance" what size of an integer you will need, you'll need something like the following pseudo-code:

#define EXTENDED_INT8    -1
#define EXTENDED_INT16   -2
#define EXTENDED_INT32   -3
#define EXTENDED_INT64   -4

#define EXTENDED_BETWEEN(v, l, h) ((l <= v) && (v <= h))

int extended_column_type(sqlite3_stmt* pStmt, int iCol)
{
  int type = sqlite3_column_type(pStmt, iCol);
  if (type == SQLITE_INTEGER)
  {
    sqlite3_int64 value = sqlite3_column_int64(pStmt, iCol);
    if (EXTENDED_BETWEEN(value, -128, 127))
      type = EXTENDED_INT8;
    else if (EXTENDED_BETWEEN(value, -32768, 32767))
      type = EXTENDED_INT16;
    else if (EXTENDED_BETWEEN(value, -2147483648, 2147483647))
      type = EXTENDED_INT32;
    else
      type = EXTENDED_INT64;
  }
  return type;
}

I've not tried to compile that so it might not be perfect, and there are other ways to do it which might be "better" based on some criteria, but I hope it is illustrative of how you can build your own type checker that provides finer granularity than what SQLite offers by default.

2021-02-28
20:29 Reply: Ability to: SELECT * EXCEPT(some,columns) (artifact: af461a9804 user: casaderobison)

Then a single string parameter with a quoted identifier list could be used. I think it is doable in some way within the context of the system without adding it to core SQLite.

2021-02-26
21:08 Reply: Ability to: SELECT * EXCEPT(some,columns) (artifact: 11a3c0c693 user: casaderobison)

I think the problem is that the column list as it exists is simply a concatenation of columns that are desired. A possibly qualified asterisk says "add all the columns". The proposal says "now go back and remove some of the columns I told you to add previously". While SQL engines tend to do this in the order the columns are defined in the DDL, I don't know that this is a hard requirement.

(Perhaps there should be a reverse_wildcard_columns pragma like there is a reverse_unordered_selects pragma for those who want to ensure they aren't depending on an implementation detail. I'm not recommending it, just a thought that occurred to me.)

There certainly are many things a SQL engine could do to make things easier for the interactive user, but SQLite is not an inherently interactive user tool. That it can be used that way is incidental to its embedded nature.

Anyway, I think enhancing the select list to allow column removal is not a great idea. Not that my opinion matters in the end.

If anything, maybe what would have value would be a table-valued function that takes a query followed by a list of columns to exclude from the final list. That could be done with SQL as it exists today (in as much as SQLite supports table-valued functions). The implementation of such is left as an exercise for the reader.

2021-02-20
03:52 Reply: I made an ico file to the tool sqlite3.exe (artifact: 6ab0ca589b user: casaderobison)

Good point. Gmail previews it, though only after filtering the data, harvesting anything that can be mined for profit, etc.

00:03 Reply: I made an ico file to the tool sqlite3.exe (artifact: c5b2312549 user: casaderobison)

Though it does show up in the email announcement of the post, so that's cool.

2021-02-10
17:29 Reply: Can parts of a database be protected for distribution? (artifact: 51a8b46116 user: casaderobison)

Depending on just how you want to structure it this is possible.

The first thought that comes to my mind (which doesn't mean it is a great idea) is to use asymmetric encryption of the data. Your program which writes the sensitive portions of the database includes the private key and encrypts the fields that should not be editable. The program which reads the sensitive portions of the database decrypts with the public key. Any non-sensitive data need not use any encryption.

Of course, this makes the data opaque to SQL as well, so if you want to be able to make queries based on the contents of the encrypted data, you are out of luck. Unless you build the decryption function into SQL instead of just doing that in your program logic.

The next idea, which would be less complicated, would be to add a column with a hashed value of the data, then check to ensure the data matches the corresponding hash. This will be easier for the user to circumvent but adds a tiny level of intricacy above and beyond simple SQL queries.

My last idea, for now, would be to write a custom VFS for SQLite that stores data in a non-standard way that makes it unusable with off the shelf SQLite compatible tools.

2021-02-01
20:46 Reply: sqlite3_exec stepping through multiple queries (artifact: 552acbf30d user: casaderobison)

If you included a "sentinel query" between the two, it could help you identify the transition (as long you know the sentinel uses a different schema than the other queries). For example:

select * from mytable where gender = 'M'; select 'end-1' as EOQ; select * from mytable where gender = 'F'; select 'end-2' as EOQ;

Just a thought. One could just as easily issue two separate sqlite3_exec calls to know when one is done and another starts.

2021-01-28
03:13 Edit reply: No more moderation? (artifact: 217a2f92d5 user: casaderobison)

It is simply to decrease friction. Many people object to having to provide their email and create or reuse a password for yet another forum, especially if they don't plan to be active. The point of the forum is to allow reports of defects (among other things), and the consensus seems to be that many reports would go unreported if there was any more friction than already exists.

03:12 Reply: No more moderation? (artifact: b9a739f0a2 user: casaderobison)

It is simply to decrease friction. Many people object to having to provide their email and create or reuse a password for yet another forum, especially if they don't plan to be active. The point of the forum is to allow reports of defects (among other things), and the consensus seems to be that the many reports would go unreported if there was any more friction than already exists.

2020-11-27
20:18 Reply: Insert Statement (artifact: 1078b9c7ce user: casaderobison)

Agreed. In theory, punctuation isn't a huge problem (aside from quotation mark itself) if you're quoting all your identifiers. Leading or trailing whitespace will be if you don't realize it.

20:00 Reply: Insert Statement (artifact: 057dd2e366 user: casaderobison)

I think it is often due to tooling that automatically creates quote delimited identifiers as a way of "future proofing" the query in case new keywords are added in the future. Alternatively, it may be due to people who are accustomed to such tooling generated identifiers following the same path even though they don't have to.

2020-10-09
20:33 Reply: SQLite temp file is removed but still kept by java process (artifact: 037ca913c1 user: casaderobison)

I think what people are trying to say is this:

We do not believe this is a bug in SQLite. No one is reporting such a problem, and given that SQLite is used in billions and billions of installations, it probably would have come up.

As a consequence of that, you should probably direct your question to the sqlite-jdbc team to see if they can help.

That being said, it certainly is possible that SQLite has some issue that no one but you has discovered up until now. Unfortunately, without a program that uses only the C API to reproduce the issue (not jdbc), it will be difficult or impossible to fix.

Please contact the jdbc project. It is very likely that they open files in the background as part of their jdbc to SQLite translation layer, and do not expect anyone to try to delete a file once it has been opened.

Note: https://github.com/xerial/sqlite-jdbc/issues/80 talks about deleting open files and jdbc not cooperating in some way. It was a simple Google search for "sqlite jdbc keeps deleted file open" and it was a few items from the top of the page. That does not mean it is exactly the cause of your problem, but it illustrates that other people have seen similar issues when using jdbc.

It is also possible that the fault is elsewhere, maybe some logic problem in your own code that you can't see at the moment. But I would check with the jdbc people next while continuing to review my own code.

2020-09-28
20:56 Reply: compiling extensions for Windows x64 with mingw-w64 (artifact: 20aae5738c user: casaderobison)

The joys of delayed moderation. Note: Not meant critically, just a statement.

19:16 Reply: compiling extensions for Windows x64 with mingw-w64 (artifact: 4b363d1707 user: casaderobison)

I'm not sure if "solstice" is a term I've just not heard as a common alias for the slash, or if it might be an autocorrect glitch, but its formal name as defined by Unicode "solidus". Similar sound. Just FYI. Also "reverse solidus". See Unicode chart.

2020-09-23
21:02 Reply: In-memory database from sqlite file (artifact: 59c20f67ff user: casaderobison)

You are probably correct that it would not be possible to accomplish when the database is locked for exclusive use. The reason things like .schema don't work in that context is because the database is locked, but the different SQLite instances know how to cooperate to indicate the locked status. Other processes (such a copying the file) don't know how to cooperate, and thus might get an inconsistent state of the database while it is being updated by another process.

If there is some data in the database that you need to access even while it is locked, it might make sense to have the process that locks the database for exclusive access be the one that is responsible for making a copy of the relevant data for third parties to use. Or find a way to not hold long running exclusive locks. What journal mode are you using? Perhaps it would be adequate to have one lock running write lock and everyone else could hold read locks on the data, in which case maybe wal mode would work (if all the processes are on the same computer).

2020-09-17
03:07 Reply: Trigram indexes for SQLite (artifact: 0bb5cce17f user: casaderobison)

You could definitely tokenize a string into three character overlapping chunks. One problem might be tracking the overlapping of strings. For example, if you had a string 'ABCDEF' and you searched for 'DEFABC', you would find 'ABC' and 'DEF'. You wouldn't find 'EFA' or 'FAB'. So your MATCH clause would probably need to be customized based on what you were searching for. So the FTS index could have all the trigrams, but matching would be a little more complex than a simple LIKE. However, it should be quite fast to find even with the more complicated query formatting, and probably much easier than adding actual trigram indexes to SQLite.

2020-09-16
23:49 Reply: Trigram indexes for SQLite (artifact: dc1a0b736f user: casaderobison)

Except a trigram sequence can speed up LIKE regardless of the position of characters in a column. FTS5 won't (I don't think) support substrings within a longer string, only prefix or exact matches of tokens. For example:

SELECT * FROM tab WHERE col LIKE '%cot%'

Will return 'cot', 'rural cottage', 'cottage cheese', and 'Scott' (among other possibilities). I don't think FTS5 would support that sort of a search, where cot might be embedded within a word as it is in my name.

2020-09-13
18:15 Reply: Is data in the OS buffers faster to read? (artifact: ea747011f1 user: casaderobison)

Imagine "oodles and oodles" in Carl Sagan's voice...

2020-09-03
00:19 Reply: 'vfs' (artifact: 0eac41e323 user: casaderobison)

Virtual File System could be renamed to some other name that resolves to the same TLA. "Virtual Function Structure" (since the VFS is a structure of function pointers) could be one.

As I understand it, the SQLite VFS was originally just a way to define the file system interface in an abstract manner. Only later did it take on non-FS apis.

Or it could be left alone. Just a thought.