SQLite integrity check=ok but data corrupted
(1.1) By Art H (ahansen) on 2022-09-12 08:08:15 edited from 1.0 [link] [source]
I'm developing a C# WinForms app that uses SQLite. The DB is very small with 18 tables and is only 168KB on disk. As part of app start-up an integrity check is performed on the DB, the DB is Vacuumed and the file copied to the backup (using C# to simplify overwrite of existing file), and then an integrity check performed on the back. An log extract contains:
DB connection confirmed DB integrity check result = "ok" DB integrity confirmed App DB copied to DB Backup DB_BU connection confirmed DB_BU integrity check result = "ok" DB_BU integrity confirmed
I also do daily system backups using Macrium Reflect which is based on Windows VSS and have been doing so for years. I've been developing the app for about 9 months and never experienced a DB issue (referencing "How to corrupt a SQLite DB") until 3 days ago when data loaded into the app became partially corrupted. Following are log extracts of an in-app query.
What the data should be (drive info: S=size, F=free, U=used; sample): Name=C:\ S=107,374,178,304 F=43,544,764,416 U=63,829,413,888 Name=D:\ S=429,496,725,504 F=210,912,083,968 U=218,584,641,536 Name=E:\ S=500,106,784,768 F=270,165,630,976 U=229,941,153,792What I was getting: Name=C:\ S=-4,096 F=595,091,456 U=-595,095,552 Name=D:\ S=-4,096 F=458,686,464 U=-458,690,560 Name=E:\ S=S=1,890,578,432 F=-417,308,672 U=-1,987,080 192
Given negative values for drive space stats the data is obviously garbage; fields/columns designated as TEXT weren't affected but INTEGER types were which is puzzling given SQLite's "soft typing" nature.
Looking at the data in the DB manager (DBeaver) the values were correct and performing the identical query in the DB manager resulted in the correct data being displayed.
After restoring a backup from a week ago I'm again getting valid data in-app but would like to understand what the root cause(s) of the corruption could be.
Actions taken after the backup date: 1. restructured the base table:
DROP TABLE if EXISTS 'TmpTBL'; ALTER TABLE 'Found' RENAME TO 'TmpTBL';Column order was changed and new column addedCREATE TABLE 'Found' ( id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT, drvName TEXT, label TEXT, disk TEXT, part TEXT, type TEXT, size INT, free INT, used INT, profileDT TEXT );INSERT INTO 'Found' (drvName,label,disk,part,type,size,free,used) SELECT Name,label,disk,part,type,size,free,used FROM TmpTBL ;
2. Added an index (because of a SQLite "no-index" warning from Visual Studio)
CREATE UNIQUE INDEX idx_found_name ON 'found' (drvName,disk,part);Index doesn't reference impacted columns.
Any help understanding what I did wrong will be much appreciated (apologies for formatting; couldn't figure out how to do the SQL correctly).
(2) By Rowan Worth (sqweek) on 2022-09-12 08:21:11 in reply to 1.1 [link] [source]
As part of app start-up an integrity check is performed on the DB, the DB is Vacuumed and the file copied to the backup (using C# to simplify overwrite of existing file), and then an integrity check performed on the back.
How do you know nothing is modifying the DB while C# is copying it?
(4) By Art H (ahansen) on 2022-09-12 09:57:47 in reply to 2 [link] [source]
It's a "single user" DB; only this app knows of it's existence/has access to it and the copy is foreground with no background/async processing occurring during the <1 ms copy event.
(3) By Chris Locke (chrisjlocke1) on 2022-09-12 09:19:32 in reply to 1.1 [link] [source]
Looking at the data in the DB manager (DBeaver) the values were correct
This would imply the data in the database is correct, and your application is getting data and mangling it somehow... conversion from a long to an integer? While the value is an integer in SQLite, what is the type in c# ?
Have you tried assuming the value is text? (so reading the database value into a text variable).
(5.1) By Art H (ahansen) on 2022-09-12 11:49:43 edited from 5.0 in reply to 3 [link] [source]
That was certainly my initial interpretation as well (and I spent over a day trying isolate the "mangle point" but when I again got expected values after reverting the DB to a backup I started looking at the DB itself.
However, I believe I've found the cause. As a test I created an entirely new table with old table's definition & loaded it from the old table. Again I got the "corrupted" values in-app. Then I noticed I had defined the three numeric columns as INT which of course SQLite doesn't have an INT data class; it has an INTEGER data class. Changed that in the table CREATE statement and issue resolved.
I don't know what definition SQLite was giving those columns. According to the DB manager they were still categorized as numeric so maybe as REAL? It does beg the question tho -- given SQLite's "flexible/dynamic typing" -- why that was actually a problem and causes me to question my understanding of the flexible typing concept.
EDIT: Gave it a go with string & still got the "corrupted"/dynamically typed values. So it must have something to do with SQLite's under the hood magic. Interesting tho. Thanx.
(6) By Chris Locke (chrisjlocke1) on 2022-09-12 11:19:25 in reply to 5.0 [link] [source]
What API/reference/library is your c# app using?
I may be wrong (usually am) but SQLite doesn't really care what the column is defined as. Could be a banana for all it cares.
However, in your app when you say, 'I want field 10', it might check what the column is defined as to ascertain how to grab the data. I tend to just read into a table and get field X - it's up to me then what field type to put that data... rather than say, 'I want an integer from field 10'. That may not make that much sense... :/
Glad you nailed down the issue though.
(7.1) By Art H (ahansen) on 2022-09-12 12:12:56 edited from 7.0 in reply to 6 [link] [source]
I'm using System.Data.SQLite. I forget if that's the exact name it's it has in NuGet.
To be honest I've never been real comfortable with the flexible typing approach & certainly don't rely on it. I tend to have (as I've grown more experienced) more TEXT definitions & then -- as you appear to do also -- I rely on TryParse for date, time & boolean ...
(8) By Simon Slavin (slavin) on 2022-09-12 12:46:27 in reply to 5.1 [source]
If this really is what SQLite is doing, something is very wrong somewhere. SQLite itself recognises "INTEGER" and "INT" identically as both meaning the affinity INTEGER:
https://www.sqlite.org/datatype3.html#column_affinity_for_compound_views
I have never seen anyone point out any difference between behaviour between "INTEGER" and "INT".
Is it possible that DBeaver is causing the problem ? Instead of using that, could you use the SQLite Command Line Shell to do the same things and see if the problem goes away ? The SQLite Command Line Shell is written by the same team which develops SQLite itself. If it gets something wrong they can figure out exactly what the problem is.
https://www.sqlite.org/cli.html
Another useful thing might be to use the SQLite analyzer on your database file both before and after corruption and see if anything looks strange.
(9) By Art H (ahansen) on 2022-09-12 15:38:37 in reply to 8 [link] [source]
First off, I'm pretty much a novice in the C#/SQLite arena. Having said that, I did have the thought cross my mind that the DB manager was a potential culprit particularly given SQLite's installed base and the fact I'm using the "free" Community version of DBeaver.
I've never used the Command Line Shell. Is it similar to Windows' dos//cmd window?
I've just taken a quick look at the CLS link you provided, which incidentally shows defining a column as an int in its little example which lends credence to the DBeaver as culprit theory ...
One thing I didn't see was how to acquired the Shell, I presume because it's assumed I already have it. I acquired SQLite via NuGet through Microsoft's Visual Studio so I'm not sure what all is bundled in the Library but I don't recall any mention of a Shell. If you'd point me in the right direction for acquiring/accessing the Shell I'd be glad to run a quick test some time in the next couple of days.
(10) By Donald Griggs (dfgriggs) on 2022-09-12 16:16:50 in reply to 9 [link] [source]
Hi, Art,
From the front page of sqlite.org you'll see a "Download" button.
From that page, look for "Precompiled Binaries for Windows" The third item will be "A bundle of command-line tools..."
The sqlite3.exe file within that bundle is self-contained. From a windows command prompt, just enter sqlite3 followed by your database filespec.
(11.2) By Art H (ahansen) on 2022-09-14 05:38:31 edited from 11.1 in reply to 10 [link] [source]
Guess I was intimidated by all those "binary" titles when I checked the downloads page yesterday ...
Ran some tests using the Command Line Shell
3 scenarios
- existing DB, new table, leveraged data
- existing DB, new table, new data
- new DB, table & data 2 runs/scenario; one using INT & one using INTEGER for column type
Testing results: INT vs INTEGER field definition <<< existing DB, new table, leveraged data >>> sqlite> create table intTest (drvName text, size int, used int); sqlite> insert into inttest (drvName,size,used) ...> select drvName,size,used from drvdata; sqlite> select * from intTest; ┌─────────┬──────────────┬──────────────┐ │ drvName │ size │ used │ ├─────────┼──────────────┼──────────────┤ │ C:\ │ 107374178304 │ 63829413888 │ │ D:\ │ 429496725504 │ 218584641536 │ │ E:\ │ 500106784768 │ 229941153792 │ └─────────┴──────────────┴──────────────┘ IN-APP RESULTS intTest Result = "C:\" size=-4,096 used =-595,095,552 intTest Result = "D:\" size=-4,096 used =-458,690,560 intTest Result = "E:\" size=1,890,578,432 used =-1,987,080,192 --------------------------------------------------------------- sqlite> create table intTest_2 (drvName text, size integer, used integer); sqlite> insert into inttest_2 (drvName,size,used) ...> select drvName,size,used from drvdata; sqlite> select * from intTest_2; ┌─────────┬──────────────┬──────────────┐ │ drvName │ size │ used │ ├─────────┼──────────────┼──────────────┤ │ C:\ │ 107374178304 │ 63829413888 │ │ D:\ │ 429496725504 │ 218584641536 │ │ E:\ │ 500106784768 │ 229941153792 │ └─────────┴──────────────┴──────────────┘ IN-APP RESULTS intTest Result = "C:\" size=107,374,178,304 used =63,829,413,888 intTest Result = "D:\" size=429,496,725,504 used =218,584,641,536 intTest Result = "E:\" size=500,106,784,768 used =229,941,153,792 ================================================================== <<< existing DB, new table, new data >>> sqlite> create table intTest_3 (drvName text, size int, used int); sqlite> insert into inttest_3 (drvName,size,used) values ...> ('C:\',107374178304,63829413888), ...> ('D:\',429496725504,218584641536), ...> ('E:\',500106784768,229941153792); sqlite> select * from intTest_3; ┌─────────┬──────────────┬──────────────┐ │ drvName │ size │ used │ ├─────────┼──────────────┼──────────────┤ │ C:\ │ 107374178304 │ 63829413888 │ │ D:\ │ 429496725504 │ 218584641536 │ │ E:\ │ 500106784768 │ 229941153792 │ └─────────┴──────────────┴──────────────┘ IN-APP RESULTS intTest Result = "C:\" size=-4,096 used =-595,095,552 intTest Result = "D:\" size=-4,096 used =-458,690,560 intTest Result = "E:\" size=1,890,578,432 used =-1,987,080,192 --------------------------------------------------------------- sqlite> create table intTest_4 (drvName text, size integer, used integer); sqlite> insert into inttest_4 (drvName,size,used) values ...> ('C:\',107374178304,63829413888), ...> ('D:\',429496725504,218584641536), ...> ('E:\',500106784768,229941153792); sqlite> select * from intTest_4; ┌─────────┬──────────────┬──────────────┐ │ drvName │ size │ used │ ├─────────┼──────────────┼──────────────┤ │ C:\ │ 107374178304 │ 63829413888 │ │ D:\ │ 429496725504 │ 218584641536 │ │ E:\ │ 500106784768 │ 229941153792 │ └─────────┴──────────────┴──────────────┘ IN-APP RESULTS intTest Result = "C:\" size=107,374,178,304 used =63,829,413,888 intTest Result = "D:\" size=429,496,725,504 used =218,584,641,536 intTest Result = "E:\" size=500,106,784,768 used =229,941,153,792 ================================================================== <<< virgin DB, new table, new data >>> C:\Users\AppData\LocalLow\Proj_Data_dev>sqlite3 tmp.db SQLite version 3.39.3 2022-09-05 11:02:23 Enter ".help" for usage hints. sqlite> create table intTest_5 (drvName text, size int, used int); sqlite> insert into inttest_5 (drvName,size,used) values ...> ('C:\',107374178304,63829413888), ...> ('D:\',429496725504,218584641536), ...> ('E:\',500106784768,229941153792); sqlite> select * from intTest_5; ┌─────────┬──────────────┬──────────────┐ │ drvName │ size │ used │ ├─────────┼──────────────┼──────────────┤ │ C:\ │ 107374178304 │ 63829413888 │ │ D:\ │ 429496725504 │ 218584641536 │ │ E:\ │ 500106784768 │ 229941153792 │ └─────────┴──────────────┴──────────────┘ IN-APP RESULTS intTest Result = "C:\" size=-4,096 used =-595,095,552 intTest Result = "D:\" size=-4,096 used =-458,690,560 intTest Result = "E:\" size=1,890,578,432 used =-1,987,080,192 --------------------------------------------------------------- sqlite> create table intTest_6 (drvName text, size integer, used integer); sqlite> insert into inttest_6 (drvName,size,used) values ...> ('C:\',107374178304,63829413888), ...> ('D:\',429496725504,218584641536), ...> ('E:\',500106784768,229941153792); sqlite> select * from intTest_6; ┌─────────┬──────────────┬──────────────┐ │ drvName │ size │ used │ ├─────────┼──────────────┼──────────────┤ │ C:\ │ 107374178304 │ 63829413888 │ │ D:\ │ 429496725504 │ 218584641536 │ │ E:\ │ 500106784768 │ 229941153792 │ └─────────┴──────────────┴──────────────┘ IN-APP RESULTS intTest Result = "C:\" size=107,374,178,304 used =63,829,413,888 intTest Result = "D:\" size=429,496,725,504 used =218,584,641,536 intTest Result = "E:\" size=500,106,784,768 used =229,941,153,792 ==================================================================
(12) By Keith Medcalf (kmedcalf) on 2022-09-13 18:02:00 in reply to 11.1 [link] [source]
Something in your application is using the "declared column type" to decide what it is going to do. It appears that if the "declared type" is integer then the value is retrieved as the native 8-byte/64-bit signed integer that it is, however, when the "declared type" is int then that 8-byte/64-bit value is being truncated to 4-byte/32-bit binary value by discarding the upper 4-bytes/32-bits.
I've not used any of the wares (other than the C version of SQLite3) that you are using, so cannot tell you where this is happening. But it is exactly what is happening.
(13) By Art H (ahansen) on 2022-09-14 06:28:02 in reply to 12 [link] [source]
Aha! That fits with my observations/experience. The in-app target variables were originally what C# calls a "decimal' type, which is 64-bit. After determining the source data is always returning positive whole numbers I changed the target variable type to C# 32-bit integer type and that caused a SQLite parsing error when importing the data from the DB into the app. Changing the target variable type to a 64-bit integer resolved the issue. Truncation explains why the erroneous data is consistent which means the data isn't actually corrupted after all. The computer is just doing what it was told to do.
As you say, somewhere under-the-hood/bonnet in my "ware" chain "int" & "integer" are being interpreted as different entities & I strongly suspect it is in C# itself where "int" is the keyword for the 32-bit integer type. Still seems odd to me that the interface/importation application -- knowing the source is SQLite where "int" & "integer" are equivalent -- doesn't accommodate that ...
Thanx for this.