PRAGMA INTEGRITY_CHECK takes very long
(1) By Cecil (CecilWesterhof) on 2020-05-03 10:34:51 [link] [source]
I have a very large DB (4,2 GB) and when I do a: PRAGMA INTEGRITY_CHECK on it, it takes more as twenty minutes. Is this normal, or do I have a problem?
Normal. It has to read every piece of data in the table and its indexes, and do a lot of cross-referencing.
There's a different check which reports some integrity problems, but doesn't test as many things:
It's faster and may do enough for your purposes.
(3) By Cecil (CecilWesterhof) on 2020-05-03 16:01:41 in reply to 2 [link] [source]
OK, then I do not have to worry. The quick check is about two times as fast. ;-) Time is not very important: I run this script only once a day with cron. And the database that took so long is a database I do not use: it was used several years ago for some tests. But I was just wondering. For all my databases together (except this special one) checking the integrity takes (mostly) less as a minute. Especially in a batch that really is not a problem. One other point: I see quit big differences between runs. The fastest can be two times as fast as the slowest. Is this normal?
Are you certain the system is otherwise idle when you run your tests?
I mean, if you just pack this into the cron
%daily script, then you might be running in parallel with a
locate(1) DB update, for instance.
(5) By Cecil (CecilWesterhof) on 2020-05-03 17:22:43 in reply to 4 [link] [source]
Not 100%, but reasonable sure. This was seen when I ran them from the command-line to test.
You might like to use the Unix
time command, or whatever your operating system uses instead of it. Then instead of looking at the real time figure, look at user time and system time.
Real time can vary quite a lot for the reasons already pointed-out. But user time and system time should not.
(8) By Cecil (CecilWesterhof) on 2020-05-04 11:01:03 in reply to 6 [link] [source]
It is very strange indeed. Before I went to bed I started the command two times sequentially. The first gave: real 52m49.51s user 10m11.98s sys 6m37.97s perc 31.86 and the second: real 17m7.79s user 10m14.76s sys 6m5.03s perc 95.33 The first took three times as much real time, but the spend time is about the same. So in the morning I thought lets do it multiple times with: for i in $(seq 7) ; do After more as three hours it was still busy with the first one. So I cancelled it and started it anew and got: real 20m48.56s user 10m17.09s sys 6m13.42s perc 79.33 real 17m25.53s user 11m8.20s sys 6m15.04s perc 99.78 real 17m26.89s user 11m9.30s sys 6m15.26s perc 99.78 real 16m29.56s user 10m23.10s sys 6m4.68s perc 99.82 real 15m40.67s user 9m40.58s sys 5m58.60s perc 99.84 real 16m11.05s user 10m5.58s sys 5m55.47s perc 98.97 real 16m19.01s user 10m6.34s sys 5m58.08s perc 98.51 I am starting to wonder if there is a problem with my system. :'-(
I am starting to wonder if there is a problem with my system.
There is nothing wrong with your system.
In all of those runs, the same amount of work has to be done, hence user and sys time will be approximately the same.
In the first run,. the database is not in the OS filesystem cache (AKA disk cache), so a lot of time is spent waiting for I/O (read from disk).
In all subsequent runs a significant part of the database will be in the cache, especially often used database pages.
You can experiment with PRAGMA cache_size=...; but this will not make a lot of difference (I tried), because it only moves part of the caching from the OS disk cache to the SQLite page cache.
You can stop worrying.
-- Regards, Kees Nuyt
Of course whether all or part of the database is in the system cache is also relevant, and if it is not in the cache then what has to be evicted to make room also will have an effect. After all, not performing physical I/O is the best way to speed up I/O ...