SQLite Forum

Timeline

50 forum posts by user slavin occurring on or before 2021-08-20 13:04:50.

2021-08-20
13:04 Reply: Trying to "Open Database" Safari History.db file and get Error: "could not open safari database file reason: unable to open.. (artifact: fcc0327ba1 user: slavin)

trying to use this software

The SQLite team made only one piece of software a non-programmer would use. It's very technical, and it's unlikely you used it. On a Mac you would have to be running Terminal.app to get at it.
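(For the record, using that tool from Terminal.app looks something like the transcript below. The path is the usual location of Safari's history file on a Mac and may differ on your system, and I've copied the file first so Safari's own lock doesn't get in the way.)

$ cp ~/Library/Safari/History.db /tmp/History-copy.db
$ sqlite3 /tmp/History-copy.db
sqlite> .tables
sqlite> .quit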

I'm guessing you used SQLite Browser or something like that. Those programs are written by other people and the SQLite team can't support them. SQLite itself does not have error messages like 'unable to open database file'.

Can I ask why you're not using Safari to look at your Safari history ? That would seem the obvious way to do it. If I know why that isn't working I might be able to help.

Also, do you have the pages you need bookmarked, or just in your history ?

2021-08-18
14:50 Reply: random DB locks (artifact: cc4f5da458 user: slavin)

Bingo.

Virtual servers have a poor reputation for supporting file locking. Yours might be simulating full XFS support, but the XFS disk is not a real disk, it's part of the simulated hardware. And the communication between the virtual computer and the virtual storage may not provide full support for locking. Consequently you may be running into

https://www.sqlite.org/howtocorrupt.html#_filesystems_with_broken_or_missing_lock_implementations

In your case, you don't have two competing processes causing corruption. But you're getting lock failure anyway.

I am guessing that identical software to yours won't have this problem if it is run on a real computer with a real XFS drive inside it. But I have no way to prove it, and it's possible you have a different problem none of us have thought of. Do you have any way to test this ?

01:22 Reply: ADO.NET INSERT/UPDATE RETURNING statement question (artifact: a05aa153bb user: slavin)
sqlite> .mode table
sqlite> INSERT INTO "test"("name") VALUES ('johnny2') RETURNING id,name;
+----+---------+
| id |  name   |
+----+---------+
| 4  | johnny2 |
+----+---------+
sqlite> INSERT INTO "test"("name") VALUES ('johnny3') RETURNING "id","name";
+----+---------+
| id |  name   |
+----+---------+
| 6  | johnny3 |
+----+---------+ 

I suspect this bug, if it is a bug, is in System.Data.SQLite . You have posted useful version numbers, so someone familiar with it should be able to help.

2021-08-17
23:37 Reply: random DB locks (artifact: a94f483471 user: slavin)

Please run an integrity check on the database.

How are you doing the COMMIT ?

Is the database stored on the same computer that is doing the processing ? If not, how is the storage accessed ?

12:12 Reply: Is it possible to put the journal file to non-volatile RAM (artifact: 74be7f0953 user: slavin)

You can't do that, but there's probably a way to get the overall result you want.

What is the point in setting SQLite to store a journal file in memory, but then moving that memory to non-volatile memory ? Doesn't that have the same result as just leaving the journal file in the same folder as the database, which is the default ?

What result are you searching for ? And what kind of hardware are you using ? A normal desktop computer, a smartphone, or an embedded processor ?
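(For clarity, this is the setting I assume you mean; a quick sketch from the CLI:)

PRAGMA journal_mode;            -- show the current mode; DELETE is the default
PRAGMA journal_mode = MEMORY;   -- keep the rollback journal in RAM instead of a file
PRAGMA journal_mode = DELETE;   -- back to the default: a journal file next to the database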

2021-08-15
13:10 Reply: wal checkpointing very slow (artifact: 94c40d994a user: slavin)

[the following is oversimplified to make it easy to understand]

In WAL mode, until you reach a checkpoint, SQLite doesn't update the database. All changes are logged in the WAL file. When a checkpoint is processed, SQLite has to go through the log of changes, figure out which ones haven't been obsoleted by later changes, and make appropriate changes to the database file. Then it can start again with a new blank WAL file.

That's why the checkpoints are the bottleneck: really SQLite doesn't do much writing between checkpoints.

In your situation of many small unpredictable changes, it may be that WAL mode is not the best mode to use. You might be better off with

PRAGMA journal_mode = TRUNCATE

Perhaps you could try it.
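As a rough sketch of what I mean (untested on your setup; note the current mode first so you can put it back if it doesn't help):

PRAGMA journal_mode;              -- probably reports 'wal' at the moment
PRAGMA wal_checkpoint(TRUNCATE);  -- checkpoint and empty the WAL file before switching
PRAGMA journal_mode = TRUNCATE;   -- rollback journal, truncated rather than deleted after each commit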

02:32 Reply: pragma integrity_check; Error: disk I/O error (artifact: 52886c2567 user: slavin)

Yeah, but after taking the copy do an integrity check immediately to make sure it's not already corrupt.

We had a thread here from someone who insisted that their database kept getting corrupted. But it turned out that after the database file failed the integrity check they continued working with it, apparently thinking that SQLite automatically fixed all the problems integrity_check reported.
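Something like this, with made-up file names:

$ cp main.db backup.db
$ sqlite3 backup.db "PRAGMA integrity_check;"
ok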

2021-08-14
21:05 Reply: Converting from sqlite to sqlite3 (artifact: 95a8f14560 user: slavin)

As DRH wrote, you're using an API that has nothing to do with SQLite. I think you're using SQL Server API calls. You could not pass a setup string when you opened a SQLite1 database. And SQLite doesn't have API calls for transactions, you just use normal calls and pass them things like "BEGIN" and "COMMIT".
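To illustrate the transaction point: a transaction is just ordinary SQL passed through the same calls you use for everything else. Table and column names below are made up.

BEGIN;
INSERT INTO Orders (customer, total) VALUES ('smith', 42);
UPDATE Stock SET quantity = quantity - 1 WHERE item = 'widget';
COMMIT;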

So we can't tell if you mean you're converting SQL Server API calls to SQLite calls, or something else. Perhaps delete this thread and make another which tells us what you're doing more precisely.

13:02 Reply: pragma integrity_check; Error: disk I/O error (artifact: 229e2b5d93 user: slavin)

Page 35: unable to get the page. error code=266

and/or sometimes

Error: near line 1: disk I/O error

You appear to get different, random, errors each time you try an integrity check. Is there any chance that you have a real hardware failure here ? Your reports would all be explained if your storage device was developing a fault. You would start out getting occasional errors, and get more and more as the fault developed.

Can you run something irrelevant to SQLite that tests the partition's integrity ? 'chkdsk' or something ?

2021-08-13
13:06 Reply: pragma integrity_check; Error: disk I/O error (artifact: aba3e16947 user: slavin)

While checking that, also check that Mozilla isn't actually running while you're doing the integrity check. SQLite is meant to take care of that problem, but there's a chance that Mozilla assumes that it has sole access to that database file.

2021-08-12
15:57 Reply: Test ACID properties SQLITE (artifact: c09b3b60b6 user: slavin)

If you have a smartphone, a car, a SatNav device, a TV, a PVR, a security camera, a power smartmeter, a smartwatch, a games console, you have nine copies of SQLite. If you have a Windows/Apple computer too, you have another four or five copies.

Some of the less expected devices I have found SQLite in:

  • the machines in carparks (parking lots ?) which record entry/exit/payments
  • the hand-held devices some parking inspectors use to print tickets
  • a lathe which you can program with curve profiles
  • a hand-held gas detector used to spot leaks
  • the smartmeter connected to my power supply
15:37 Reply: Documentation: list of addons and extras (artifact: 9b1f6db6d6 user: slavin)

I think extensions are covered well, but they are a minority of what the people I am picturing want. And those people won't be looking at source code at that point, because they'd have to search the whole codebase, because they don't know what they're looking for yet.

My imagined users have an idea in their head of 'copy an active database' or 'quick word search' or 'interface with System.Data' or 'difference between two databases'. They don't know the name SQLite uses for it, or whether they're looking for API calls, a loadable library, a utility program, or a VFS. Just what they want done.

The page I'm thinking of would be "Here are links to 40 supported features you wouldn't stumble on when figuring out basic write/read/find.". And it would link to all of these

  • what unicode support is there ?
  • fast text search
  • System.Data and Android bindings
  • the backup API
  • the checksum shim
  • comparing two databases
  • carray
  • SQLite Database Analyzer
  • json support

even though they are different kinds of things. Because the person looking for features doesn't know enough to guess whether something is a VFS or a utility program. Someone who has just got into SQLite runs their eye down that page of links once. They might seize on something immediately, or a year later come up with a new requirement and remember that SQLite does it because they read it somewhere.

2021-08-11
17:10 Post: Documentation: list of addons and extras (artifact: 30bd112cfd user: slavin)

Twice in the last month I've looked for something in SQLite which is not mentioned in the API documentation, because it is an extra program, or a rarely used addon. Having read this forum (or the email list) for years, I know roughly what the feature is called, or a word that is probably mentioned in the documentation for it. I can use the website's search features to find that word.

But many newer users of SQLite may be looking for useful or interesting extensions/addons and not have that advantage. So I went looking for a web page which mentions all the extras. And I couldn't find one. The best source I found is to use

https://sqlite.org/docs.html

and reveal both 'extensions' and 'tools'. This gets me json1, the archiver, Android bindings, and FTS5. But not the Checksum shim. And the System.Data.SQLite bindings are listed in Programming Interfaces. It also doesn't mention the backup API, because that's part of the C API, yet people can miss it if they don't happen to be looking through the complete list of API function calls. And why would anyone do that when looking for what feels like an addon, not a basic SQLite feature ?

I'm thinking there could be useful things developed and supported by the SQLite team, which users would find if there was a list all on one page. Am I missing that such a list exists ? Or just being silly ?

2021-08-10
15:15 Reply: SQLite Binary Log support! (artifact: a0079eb66b user: slavin)

That is far easier for me to understand. Thank you. I hope other readers will now understand your questions and be able to answer them. I also understand why you cannot use the session extension.

I do have one caution for you. You can test several configurations by making a lot of changes as fast as possible. And you will get a different speed from each one and find out which is fastest. Good. However, this reflects only the situation where you are making a lot of changes as fast as possible. If you test a situation where the changes are happening less often, a different configuration can be faster. So your result may be correct, but not useful for any real situation.

Some reasons are that caching happens at several levels in your computer and storage subsystem, and some systems wait for inactivity, or for the cache to be full, before writing their cache. And your computer tries to write the cache at the same time as running operations in other processes. Things happen with different interlacing, in a different order.

So if your real application is not going to make a lot of changes as fast as possible, this may not be such a useful test.

00:32 Reply: SQLite Binary Log support! (artifact: e7e18e601d user: slavin)

Your post refers to a ton of things that SQLite doesn't need or use: Raft, Kafka, clusters, binary logs. They are tools which allow you to do something. But we don't know what that something is, which means that we don't know how to help you.

Can I ask what you're trying to do that SQLite doesn't do ? And can you explain it without referring to a tool or technique that works with another DBMS ? You've found the session extension. You know what it does. What doesn't it do that you want ?

2021-08-09
14:35 Reply: Why WAL file (-wal) may disappear? (artifact: eafdda128a user: slavin)

One test I can think of is to have your App close the SQLite database at some point in the procedure and reopen it again. Does this make the WAL file magically appear ? This is just for diagnostic purposes, to investigate the problem. You should not need to do this in production code.

Another test would be to run your test code on a desktop computer, rather than the Android hardware. Do you get the same behaviour ?

A recent question to this forum about the Android platform leads me to believe that Android does some sort of internal caching, and doesn't show files on disk until they get big (and the cache busts) or they are closed.

The person who started that thread was unable to reproduce their problem on their computer: on computer the WAL and shm files showed up when expected. On Android they were sometimes visible and sometimes not visible.

The App running SQLite ran correctly, accessing the data as expected, until it was killed. It's just that a directory listing didn't show any actual file. It caused a problem only when the App was terminated by losing power, or by being killed by the operating system. If the App was allowed to quit normally, the WAL file appeared at that time.

Since you are using a smart terminal and not a mobile phone, perhaps you won't have a problem with this. Your users are less likely to run out of power.

2021-08-08
12:41 Reply: Old Nabble SQLite mailing list site down? (artifact: 748e339cdb user: slavin)

There is a good argument for preserving all messages from the email forum intact, as text files (or possibly MBOX format), probably ZIPped. Those interested could download the whole thing, expand it, and search it on their equipment.

My problem was making the messages searchable via standard websearch tools, for people who don't check the dates on the results shown, and don't know that SQLite gets substantial changes very frequently. Though I like the suggestion upthread of including a highly visible message like "WARNING: This is an old answer and applies to an old version of SQLite. It may not apply to the version you're using.".

12:33 Reply: Not understanding a subquery - Some guidance? (artifact: 13b06bae21 user: slavin)

The LIMIT clause goes at the end. After DESC.

The way around you put it, SQLite would do the limit first, making it choose one random row of the results first, and only then apply DESC to the one row it has, which would do nothing.
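In other words, something like this (table and column names are guesses at what you have):

SELECT * FROM scores ORDER BY points DESC LIMIT 1;   -- sort first, then keep the top row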

2021-08-07
23:26 Reply: Not understanding a subquery - Some guidance? (artifact: 6dcc9cd5fc user: slavin)

Did you test the sub-select ? Does that give you the result you expected/wanted ?

13:30 Edit reply: Questions about data types and overflow bounds (artifact: 91d6da55c1 user: slavin)

If so, did SQLite forget to apply the strategy to PRIMARY KEY?

Not 'forget' as much as 'not care'. Values for INTEGER PRIMARY KEY fields are processed to make them useful for primary keys without the programmer having to deal with special cases. SQLite can do anything up to and including picking a random integer, as long as it keeps the column suitable for use as a primary key.

https://sqlite.org/faq.html#q1

If you want fine-detail control over the values in an INTEGER key column, don't define it as an INTEGER PRIMARY KEY, define it as INTEGER UNIQUE and set the values you want.
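A sketch of that alternative, with a made-up table:

CREATE TABLE gadgets (
    serial INTEGER UNIQUE,   -- you set these values yourself; SQLite won't invent them
    name   TEXT
);
INSERT INTO gadgets (serial, name) VALUES (1000000, 'first one');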

2021-08-06
15:32 Reply: Questions about data types and overflow bounds (artifact: b980380c87 user: slavin)

If so, did SQLite forget to apply the strategy to PRIMARY KEY?

Not 'forget' as much as 'not care'. Values for INTEGER PRIMARY KEY fields are processed to make them useful for primary keys without the programmer having to deal with special cases. SQLite can do anything up to and including picking a random integer, as long as it keeps the column suitable for use as a primary key.

https://sqlite.org/faq.html#q1

If you want fine-detail control over the values in an INTEGER column, don't define it as a primary key, define it as INTEGER UNIQUE and set the values you want.

15:30 Reply: Fail to calculate long expression (artifact: eec4141a1b user: slavin)

I haven't seen any such evidence. Different languages have differences in parsing numbers (how to handle '11' '11.' and '11.0' when looking at text) and differences in rules for division and modulus arithmetic. Every language is free to implement its own rules.

The original question includes at least one operation of division or modulus by 0. Division by zero is undefined in SQLite. SQLite returns NULL. Not a bug. Nothing to see here.
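For example, in the CLI:

sqlite> SELECT typeof(1 / 0), typeof(1 % 0);
null|null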

2021-08-05
14:12 Reply: Old Nabble SQLite mailing list site down? (artifact: 21266fcec9 user: slavin)

Nabble was never a part of the SQLite infrastructure. The organisation subscribed to the mailing list and made it available on the web, but it did this to numerous mailing lists on different subjects. If you feel it provides a useful service you might approach the people who run Nabble with your concerns.

As far as preserving all posts from the old mailing list, and making them available for search on the web, I'm not sure it's a good idea. They refer to increasingly antiquated versions of SQLite, and many of the answers are now simply not true. For instance, WAL mode was introduced in 2010. Many posts on the mailing list discuss solutions and workarounds which are now not needed, because WAL does it all simpler and better. There have been other significant changes to SQLite and surrounding technologies (e.g. solid state storage rather than spinning hard disks) and some old questions would now be answered very differently.

2021-08-02
17:45 Reply: Fail to calculate long expression (artifact: b1fd4562b5 user: slavin)

Verified in the CLI for SQLite version 3.36.0 2021-06-18 18:58:49 .

I removed some of that expression and it was still failing. Can you minimise it to the point where removing just one term, or one pair of brackets makes it succeed ?

14:54 Reply: Feature Request: Allow REAL to serve as rowid by having the B-Tree interpret its byte signature as a 64bit INTEGER. (artifact: fb8f5944bb user: slavin)

Can someone list the advantages of this ? Why would anyone want to do it ?

2021-07-30
12:39 Delete reply: Why data is lost in SQLite database with WAL mode on when connection is not closed properly? (artifact: 8de2d93344 user: slavin)
Deleted
12:39 Reply: Why data is lost in SQLite database with WAL mode on when connection is not closed properly? (artifact: fb1ef9c78c user: slavin)

Your file sizes suggest that while your App is running, Android is keeping changes to files it has open in cache, and is writing them to backing storage only when your App quits cleanly. If the App is terminated by power loss or being killed, the changes vanish. Changes might be written to backing storage if they bust the cache (grow bigger than available cache storage).

Obviously, this isn't good if your platform is meant to cope correctly with power loss. But it's so bad that it would seem other Android programmers would have discovered and documented the problem years ago. I don't see how Android can do this, given how many people let their mobile phones run out of power. But it's consistent with phone hardware: changes in memory are fast and cheap. Writing to backing (Flash) storage takes far more time, and eventually wears the memory out, requiring expensive replacement.

If you get the chance, try the experiment on the non-Android platform, as I described above.

2021-07-29
13:48 Reply: Why data is lost in SQLite database with WAL mode on when connection is not closed properly? (artifact: 29cc624e24 user: slavin)

I'm beginning to suspect this is an Android thing. Please try

A) Run your original App (or an adapted version of it) on a computer, rather than your Android platform. Kill it at the same place.

  1. Check to see whether opening that database on the same computer retrieves the 'lost' data.
  2. Do it again, but this time open the database after kill on your Android platform.

B) Then do it the other way around. Prepare the database and WAL file on your Android platform by killing the App as above.

  1. You already know that opening that database on the Android platform loses data.
  2. Do it again, but instead of opening it on Android, copy the database and WAL file to your computer and open it there.

2021-07-28
12:32 Reply: Error not given when referencing something that doesn't exist (artifact: c8d3ba13a9 user: slavin)

The fact that FOREIGN KEY definitions are not checked at CREATE TABLE time may be connected with database dumping and reloading. SQLite allows a definition like this because at the time you CREATE TABLE Pet, TABLE Owner may not yet exist.
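To see that with the table names above (a sketch; the column names are made up):

PRAGMA foreign_keys = ON;
CREATE TABLE Pet (
    id    INTEGER PRIMARY KEY,
    name  TEXT,
    owner INTEGER REFERENCES Owner(id)   -- Owner doesn't exist yet, and this still succeeds
);
-- the missing parent only matters when the constraint has to be enforced:
-- INSERT INTO Pet (name, owner) VALUES ('Rex', 1);   -- fails until Owner is created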

Various utilities dump tables in the order they are held in the database file, or in alphabetical order. Checking the parent tables as you define a child table would require a relation-chasing procedure to figure out which order to dump the tables in.

One might argue that, if the parent table does exist, the definition should be checked anyway. But this would introduce a 'sometimes works' behaviour, which might fool programmers into thinking it always works.

In terms of the INSERT command referring to a non-existent parent, if I understand your post correctly, 3.34.0 returned no error, but version 3.36.0 returns an error. This suggests a bug has been fixed.

Hope that helps.

2021-07-25
12:42 Reply: insert into view returning id returns null (artifact: 374eb39f6c user: slavin)

Working as expected, I think.

I wonder what would happen if bar was a TABLE instead of a VIEW, but still had the same TRIGGER, which still didn't have a RETURNING clause.

Or suppose, inside of your TRIGGER, there were two INSERT commands. What would you expect to happen ?

2021-07-22
15:07 Reply: Csv or Vsv parsing blob column (artifact: 582bc0286b user: slavin)

In that case you want a CSV-parsing library. And there isn't one inside SQLite. The CSV parsing done by the SQLite shell tool is built into the tool itself, not the SQLite library it calls.

2021-07-21
14:34 Reply: Csv or Vsv parsing blob column (artifact: dd6d723d7e user: slavin)

Can I ask for more clarification ? Your text

csv stored in a column using the existing csv and tsv extensions

doesn't ring any bells.

Are you storing an entire CSV file, encoded as a BLOB, in one field of a SQLite table ?

Are you using https://www.sqlite.org/csv.html or something else ?

2021-07-18
13:45 Reply: Does the column order in an insert statement impact the insert speed? (artifact: 36728ae68d user: slavin)

By a wild coincidence, someone blogged discussing a similar question today. They discuss a number of factors I hadn't thought of, including using prepared statements rather than raw SQL commands. Here it is:

https://avi.im/blag/2021/fast-sqlite-inserts/

13:40 Post: Inserting One Billion Rows in SQLite Under A Minute (artifact: 8f76af39ca user: slavin)

Inserting One Billion Rows in SQLite Under A Minute

https://avi.im/blag/2021/fast-sqlite-inserts/

" Recently, I ran into a situation where I needed a test database with lots of rows and needed it fast. So I did what any programmer would do: wrote a Python script to generate the DB. Unfortunately, it was slow. Really slow. So I did what any programmer would do: went down the rabbit hole of learning more about SQLite, Python, and eventually Rust… in my quest to get a 1B row database under a minute. This blog post is a summary of this fun and educational exercise. "

13:10 Reply: Does the column order in an insert statement impact the insert speed? (artifact: 8f60f59d30 user: slavin)

Your transaction size seems reasonable. You might experiment with 2000 rows and 50000 rows just to see if either is an improvement.

The fact that some values are TEXT and some are INTEGER doesn't matter. SQLite doesn't use fixed-sized data for either of them. But it was definitely good to mention it in your question.

SQLite stores all the values for a row together, in the order in which the columns are defined. So it has to construct a sequence of values for each row. I believe that it's fastest to construct the sequence in column order, which should mean that it's fastest to supply the values in column order in the INSERT command. However, it's possible that SQLite uses a linked list, in which case it might be a little faster to supply the values in reverse column order. If you're getting obsessive about speed, you might try this out.

But by and large SQLite execution time is almost always storage-bound. SQLite's processing is extremely optimised and very fast. But most of the time taken by INSERT is taken waiting for your storage subsystem to process reads and writes. Worry less about syntax and more about minimising the number of calls to storage.

One possibility, if this million-row table is useless unless complete, is to set one or more PRAGMAs to relax the ACID guarantees while the table is being constructed. Then you can set them back afterwards. These two PRAGMAs might change things (and they might not, depending on your setup !)

https://sqlite.org/pragma.html#pragma_synchronous
https://sqlite.org/pragma.html#pragma_wal_autocheckpoint

For each PRAGMA, find out what the setting is before changing it, and change it back when you've done your million rows. Test that out, and find out if it speeds things up. If it doesn't, don't do it.
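A rough sketch of what I mean, with the usual caveat that it may do nothing for you:

PRAGMA synchronous;         -- note the current value (2, meaning FULL, is common)
PRAGMA synchronous = OFF;   -- stop waiting for storage to confirm each write
BEGIN;
-- ... your million INSERTs here ...
COMMIT;
PRAGMA synchronous = FULL;  -- put back whatever value you noted above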

12:48 Reply: (Deleted) (artifact: f8abbfedbf user: slavin)

Your problem is with Python, not SQLite. The things you're doing work fine in SQLite. You will get better help asking in a Python forum.

2021-07-14
12:47 Reply: My Database is being deleted after each restart of my code. How can I fix it? (artifact: b39f12845f user: slavin)

Does your bot neatly close the database connection when it stops, or is it just crashing ?

When is the database file being deleted, if it is being deleted at all ? In other words, does the database file exist in storage

  1. while your bot is running ?
  2. after your bot has stopped ?

If you use the same bot to open a text file in the same folder, and append text to it each time it starts up, do you actually get one additional piece of text every time it starts up ?

12:43 Reply: How much would you trust a page-level checksum ? (artifact: ee790fb3d3 user: slavin)

Reining in the current mathematical discussion, I'm still interested in my original question. Would it take less code, or be faster, for SQLite to test for pointless re-writes at the page level rather than the field level as it does now ?

2021-07-12
14:05 Reply: Database corruption: database disk image is malformed (artifact: 64594dc46e user: slavin)

You've described the setup in some detail. This makes diagnosis easier. Thank you.

With literally billions of installations, and a lot of expertise here in this forum, we don't know of any bugs which cause systematic corruption at the low level you report. I would be surprised if you can find a way to make SQLite corrupt its own database in a way that triggers that error message.

Given that you're using Android, you're probably using a mobile device. (Yes, Android runs on TVs and such things, but that's a tiny proportion of platforms.) Mobile devices can lose power in the middle of sync() operations. They can also terminate processes and threads without warning. I suspect that to reproduce the errors your clients are seeing you would have to operate your devices in low-battery situations, or to have incoming phonecalls when the device is low on free memory. Without those, since I'm sure your software has no bugs in it, you will get no corruption.

Perhaps the most useful questions to ask your customers are things like "What was the last operation you did with the App before it reported that corruption ?" "After doing that operation, was the device shut down properly, or did it run out of power, or did you have to force-quit Apps, or was there an incoming phonecall ?".

One pattern of programming I've seen on mobile devices is that rather than keep a database connection open all the time an App is running, the App closes it when it's not expecting any database use. So, for example, if the user is on the homescreen, or choosing an operation, the database is closed. When the user picks a menu option for "List entries" or "Make a new entry" a connection is opened. If the connection is not used for 60 seconds, the App closes it.

13:50 Reply: Unstack one column to multiple (artifact: 1e5d2f1bcb user: slavin)

I'm impressed. And will have to learn more about USING.

Part of my post remains: that's a neat piece of programming. To come up with something like the above you need to be a programmer and to think like a programmer.

2021-07-11
15:26 Edit reply: How much would you trust a page-level checksum ? (artifact: dcaa5034c1 user: slavin)

Okay, folks. Thank you for your answers. The most significant aspects of your answers are that the checksum algorithm, like all good checksum algorithms, is sensitive to position. So swapping around two different octets, or the values of two different rows, changes the checksum. And that any attempt to fool the checksum algorithm would be ridiculously difficult without reading the raw database file.

So let us suppose I trust that a 64-bit checksum (either the one already used, or a better one) is good enough for my purpose. Now I ask the remaining question.

SQLite currently has code in it which stops it from writing records which haven't changed to the file. Suppose it checked for changes using a page-level checksum instead. Would that be faster, and/or lead to a smaller codebase, than the current checks which are carried out at the row/field level ? I'm prepared for answers like "Sometimes yes, sometimes no." or "Impossible to tell without writing the code, and we don't have the time. We look forward to seeing your results.".

One sacrifice would be that you'd have to take the time to calculate a checksum for every page written, and that database files would be a little longer. Another is that SQLite would occasionally have to write two pages instead of one, so SQLite would be slower. One advantage would be that since every page would have a checksum, it would be trivial to check those checksums every time a page was read, and thereby spot page-level corruption of a database file. Another is that, under normal circumstances, a page contains data for more than one row, so fewer checks would need to be done, so SQLite would be faster.

I don't know how complicated and time-consuming it is for SQLite to check for pointless rewrites. Or what proportion of pointless rewrites the code currently catches. Thoughts welcome.

14:22 Reply: How much would you trust a page-level checksum ? (artifact: 7d81b9b5a7 user: slavin)

Okay, folks. Thank you for your answers. The most significant aspects of your answers are that the checksum algorithm, like all good checksum algorithms, is sensitive to position. So swapping around two different octets, or the values of two different rows, changes the checksum. And that any attempt to fool the checksum algorithm would be ridiculously difficult without reading the raw database file.

So let us suppose I trust that a 64-bit checksum (either the one already used, or a better one) is good enough for my purpose. Now I ask the remaining question.

SQLite currently has code in it which stops it from writing records which haven't changed to the file. Suppose it checked for changes using a page-level checksum instead. Would that be faster, and/or lead to a smaller codebase, than the current checks which are carried out at the row/field level ? I'm prepared for answers like "Sometimes yes, sometimes no." or "Impossible to tell without writing the code, and we don't have the time. We look forward to seeing your results.".

One sacrifice would be that you'd have to take the time to calculate a checksum for every page written, and that database files would be a little longer. Another is that SQLite would occasionally have to write two pages instead of one, so it will take longer. One advantage would be that since every page would have a checksum, it would be trivial to check those checksums every time a page was read, and thereby spot page-level corruption of a database file. Another is that, under normal circumstances, a page contains data for more than one row, so fewer checks would need to be done.

I don't know how complicated and time-consuming it is for SQLite to check for pointless rewrites. Or what proportion of pointless rewrites the code currently catches. Thoughts welcome.

13:54 Reply: Unstack one column to multiple (artifact: d98a069d55 user: slavin)

Can't be done inside SQLite. You need to use either a scripting/programming language, or an editor which can count characters, probably one which understands regular expressions.

Of the things you wrote, the most complicated thing to deal with is "The number of fields for each record also varies.". If it wasn't for this, you could do everything you wanted in a spreadsheet program. Actually, even with it you can do what you want in a spreadsheet program which has macros, but if you didn't tumble to that solution already, you probably don't know how.

Sorry, if that file is long enough that you don't want to go through it manually, you're going to need to employ a programmer. That's what they're for.

2021-07-09
16:56 Post: How much would you trust a page-level checksum ? (artifact: 7452037e57 user: slavin)

I've been using a version of SQLite with the checksum VFS recently

https://www.sqlite.org/cksumvfs.html

and have been musing: how much would I trust a checksum ?

SQLite goes to some lengths to avoid writing data which hasn't changed. It does this on a row-by-row basis, and that involves extra processing. And the processing doesn't catch every possibility: the checks are performed only under certain circumstances. But the checksum VFS would allow SQLite to detect, at the page level, whether it's necessary to rewrite a page.

There are advantages for this: most databases have more than one row per page, so I think less processing would be required. Both tables and indexes are stored in pages, and this technique would allow SQLite to notice when a table page needed an update, but the index page did not. Or /vice versa/. But I don't know enough about how SQLite works internally to evaluate this. Or to know whether it would be necessary or convenient to read the checksum of a page being replaced. I'd appreciate comments.

And the most important thing: given the size of the checksum used, what's the risk that a change in a page's contents would leave the checksum unchanged ?

2021-07-08
13:29 Reply: Switching pragma synchronous to upgrade durability (artifact: 7a7bf07d8a user: slavin)

https://sqlite.org/pragma.html#pragma_wal_checkpoint

I think you want PRAGMA wal_checkpoint(FULL). I'm not entirely sure I understand your question because we don't use the term 'durable' here. Check it out and post again if I got it wrong.
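For reference, it returns one row telling you whether it worked (the numbers below are just an example):

sqlite> PRAGMA wal_checkpoint(FULL);
0|12|12

The three columns are busy, the number of pages in the WAL, and the number of those pages checkpointed.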

13:24 Reply: Shortcut to change a non changing update or upsert in a noop (artifact: 2f2836a5a3 user: slavin)

This too was discussed previously, but I can't find the post in question, or even remember whether it was on the old mailing list. What I remember is …

If SQLite already has the value to be overwritten in memory, because it was covered by the index used to find the row, it will check the new value against the one in memory to make sure that a change needs to be made.

If SQLite has already needed to read the old value from the table, in order to fetch enough of the row to figure out which rows satisfy the WHERE clause, it will check the new value against the one in memory to make sure that a change needs to be made.

It seems that a good way to optimise your code is to make sure that the column you're updating is included in the index used to satisfy the WHERE clause. In other words, don't write extra code, just craft your indexes carefully. And, of course, test out the change to make sure there's a worthwhile difference. Because if you save just a millisecond an hour, it's not worth the extra storage cost.
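A sketch of what I mean by crafting the index (table, column, and index names are all made up):

CREATE INDEX jobs_customer_status ON jobs (customer_id, status);

-- the index above satisfies the WHERE clause and also covers the column
-- being overwritten, so (per the description above) SQLite can compare the
-- old and new values without fetching the table row first
UPDATE jobs SET status = 'done' WHERE customer_id = 17;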

2021-07-07
12:20 Reply: Shortcut to change a non changing update or upsert in a noop (artifact: 3ee1edf823 user: slavin)

Can confirm, from reading discussions of this years ago.

Calls like changes() report the number of rows which satisfy the WHERE clause (and similar things which affect how many rows should be updated). They are not influenced by existing values in fields to be overwritten.

In order to make things fast, various parts of SQLite act to minimise the actual work done to file storage. They might be at the data level or the database page level, but you should not assume that file access is actually done (reading or writing). As an example of this, SQLite might not even bother to read existing values from a table if they are stored in a convenient index.

If you want to know how many rows of the database would actually get different values, add that restriction to your WHERE clause.
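For example, with a made-up table:

UPDATE tasks SET status = 'done' WHERE id = 5;
SELECT changes();   -- 1, even if status was already 'done'

UPDATE tasks SET status = 'done' WHERE id = 5 AND status <> 'done';
SELECT changes();   -- 0 when nothing really changed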

2021-07-03
14:54 Post: The Untold Story of SQLite (artifact: 980587188f user: slavin)

A little something for those interested in SQLite rather than trying to solve a problem connected with using it.

https://corecursive.com/066-sqlite-with-richard-hipp/

Radio interview with DRH describing how SQLite came to be, and how SQLite led to other projects like Lemon and Fossil.

Can be played on the web page, or in your podcast app. Full transcript on the web page.

12:35 Reply: cannot create tables in sqlitestudio than other apps can see (artifact: 38bb20d059 user: slavin)

A full path would have a number of folder names in it. Even if the backslashes were being eaten there would be many characters between the 'C:' and the filename.
