SQLite Forum

Defragment a table ?

(1) By anonymous on 2020-08-20 09:27:14 [link] [source]

Hi,

Is there a way to "defragment" a table, without doing a full vacuum ?

After investigating a performance issue, it boiled down to "fragmentation" of a particular table in a rather large database (stored on an SSD).

My table "A" has the same number rows as another table "B", and table "A" has just two fields (an integer primary key and a integer value field), but its rows are modified heavily and regularly, with no particular pattern. Table "B" has 6 fields, it is a snapshot view of data in "A" joined with other tables, and is recreated as a whole periodically.

When running queries against "A", they are consistently about 3-4 times slower than the same queries against table "B", even with similar indexes, and even for a "select count(*)".

When, on the other hand, I export both tables to a blank DB, queries against table "A" become slightly faster.

Historically table "B" was created to alleviate the performance issue on "A", but it is quite costly to recreate, and it means the data presented is not as fresh as it could be.

I am not entirely sure what is fragmented exactly, and why/how it affects performance (storage layer is an SSD), any hints welcome!

Thanks

(2) By anonymous on 2020-08-20 11:52:47 in reply to 1 [link] [source]

I'd try to turn table "A" into a without-rowid table.

That way it's always index-organized, i.e. equivalent to a covering index.
You'll get slower inserts and updates (since the B-tree needs to be re-balanced),
but then all table accesses are equivalent to a covering index: they might be a
bit slower than the best-case access of a rowid table, but (much?) faster
than the fragmented worst-case access. A kind of middle road.
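
Something along these lines (a rough sketch only; the real column names were never posted, so "id" and "value" are placeholders):

create table A_new
(
   id    integer primary key,
   value integer
) without rowid;
insert into A_new select * from A;
drop table A;
alter table A_new rename to A;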

All the above is pure conjecture, but worth a try IMHO.
Please report back your findings either positive or negative.

PS: vacuum is all or nothing. Or try incremental auto-vacuuming instead?

(3) By David Raymond (dvdraymond) on 2020-08-20 13:21:31 in reply to 2 [link] [source]

A without rowid table isn't going to help things if the primary key is already an integer. In those cases I think it's even less efficient since there are a number of optimizations for rowid tables.

Autovacuum can actually increase fragmentation. It doesn't do any re-ordering; all it does is try to shrink the file by moving used pages from the end of the file to free pages towards the front, then truncating the file.

You could put A and B into separate database files and ATTACH them when they're both needed. Then A would be in its own file, which can be vacuumed on its own. Though that would of course require that there not be any foreign keys, triggers, etc. between the two tables.
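
A rough sketch of that setup (the 'a.db' file name and 'adb' schema name are just illustrative):

-- 'a.db' holds only table A; attach it alongside the main database
attach database 'a.db' as adb;

-- queries can then reference adb.A and main.B together as needed

-- VACUUM accepts a schema name, so table A's file can be re-packed on its own
vacuum adb;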

(4) By anonymous on 2020-08-20 13:36:10 in reply to 3 [link] [source]

"A without rowid table isn't going to help things if the primary key is already an integer"

I disagree with that statement. But the OP can find out for him/herself,
if s/he wants to :). I have no time to investigate myself ATM.

(5) By Keith Medcalf (kmedcalf) on 2020-08-20 15:55:13 in reply to 1 [link] [source]

Have you increased the cache size to a decent size for your "rather large" database?
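
For example (a negative cache_size is read as a size in KiB rather than a page count; the 2 GiB figure below is only an illustration):

-- give this connection roughly 2 GiB of page cache
pragma cache_size = -2097152;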

File fragmentation does not affect random block access times for files on an SSD since the block access time is constant (unlike rotating rust which has stepper delays (typically 1/3 stroke) plus rotational delay (typically 1/3 rotation) plus settling time (which may increase rotational delay by one rotation in the event of a large stroke or noisy servo positioner)).

It is more likely that as the table becomes more disorganized you have insufficient cache defined to hold the working set of the tree traversal data causing excessive I/O. Note that even I/O which is serviced by the "system cache" is considerably slower than access to the same page in "application storage" because access to the "system cache", although the data itself may be cached in RAM, requires accessing "supervisor mode" to perform the I/O operation and the transition itself can be quite expensive (compared to directly accessing a page already stored in user/process memory).

So the first thing I would do is look to see if the cache hit rate for the connection is adequate. See sqlite3_db_status (https://sqlite.org/c3ref/db_status.html) for the API you can use to retrieve connection statistics from which you can compute the applicable rates. Adequate in this case means that the hit rate for the "disorganized" state should be the same as for the "organized" state.

Note that this applies only to "read" queries. Updates will always require at least one I/O, so it should be expected for the hit rate to fall when mixing reads with writes ... especially as a "write" may change some of the access-path pages, thus requiring a cache miss on the next access if a change was made through a different context (connection).

(6) By anonymous on 2020-08-20 16:20:47 in reply to 5 [link] [source]

I have tried setting PRAGMA cache_size up to 5 GB, running only queries on those tables so as not to "pollute" the cache, and only reads; it did not have any effect on performance. Table A is about 400 MB when exported, table B is around 850 MB; the whole database, on the other hand, is around 350 GB.

When running against exported tables, queries against table A come out faster, and those involving full table scans are about twice as fast (as expected) compared to table B.

In the whole database, it's the opposite, any query against table A is quite slower than the same against table B.

Of note, when table A is updated, entries are routinely created/deleted, and when the integer values are updated, changes can span the whole 8-bit to 64-bit value range, and IIRC SQLite uses a more compact format for smaller integers.

Is it possible that this results in table pages that are effectively quite "empty" (having spilled over at some point), and thus use many more pages than when exporting the table ?

Table B is created all at once, so its pages would be as compact as possible.

(7) By Richard Hipp (drh) on 2020-08-20 17:46:46 in reply to 6 [link] [source]

Get or compile the "sqlite3_analyzer" (or "sqlite3_analyzer.exe") utility program. You can get precompiled binaries from the SQLite download page. This program is part of the "sqlite-tools-*" packages. Once you have the utility, run it against your database. It will generate information about how much space each of your tables is using, the average row size, the packing efficiency, and so forth. That information might provide new insight into what is going on.

If possible, post the output of "sqlite3_analyzer" here.

(8) By David Raymond (dvdraymond) on 2020-08-20 19:15:39 in reply to 4 [link] [source]

On the without rowid page there's this line down in section 4:

WITHOUT ROWID tables will work correctly (that is to say, they provide the correct answer) for tables with a single INTEGER PRIMARY KEY. However, ordinary rowid tables will run faster in that case. Hence, it is good design to avoid creating WITHOUT ROWID tables with single-column PRIMARY KEYs of type INTEGER.

(9.1) Originally by EricG (EriccG) with edits by Richard Hipp (drh) on 2020-08-20 22:39:01 from 9.0 in reply to 7 [link] [source]

Below are what I think are the relevant snippets from the analyzer: table A reports about 50% unused bytes in its pages, while table B reports just 0.3%... Table A's pages are also significantly non-sequential. I am currently using 4k pages (running under Windows); would using larger pages help ?

The database as a whole is at 4.6% unused bytes, which is quite good, and table A is the only one with such a ratio of unused bytes, but it is also the only really "dynamically updated" table; other tables are more like indexed logs or regularly regenerated views like table B.

I can send you the whole file by mail if needed.

Thanks


/** Disk-Space Utilization Report

Page size in bytes................................ 4096      
Pages in the whole file (measured)................ 90532790  
Pages in the whole file (calculated).............. 90528869  
Pages that store data............................. 90528868    99.996% 
Pages on the freelist (per header)................ 0            0.0% 
Pages on the freelist (calculated)................ 3922         0.004% 
Pages of auto-vacuum overhead..................... 0            0.0% 
Number of tables in the database.................. 45        
Number of indices................................. 23        
Number of defined indices......................... 14        
Number of implied indices......................... 9         
Size of the file in bytes......................... 370822307840
Bytes of user payload stored...................... 124189467598  33.5% 

*** All tables and indices ****************************************************

Percentage of total database......................  99.996%  
Number of entries................................. 14551461088
Bytes of storage consumed......................... 370806243328
Bytes of payload.................................. 285702356053  77.0% 
Bytes of metadata................................. 68107134984  18.4% 
Average payload per entry......................... 19.63     
Average unused bytes per entry.................... 1.17      
Average metadata per entry........................ 4.68      
Average fanout.................................... 175.00    
Maximum payload per entry......................... 4033      
Entries that use overflow......................... 7            0.0% 
Index pages used.................................. 516031    
Primary pages used................................ 90012830  
Overflow pages used............................... 7         
Total pages used.................................. 90528868  
Unused bytes on index pages....................... 228253065   10.8% 
Unused bytes on primary pages..................... 16768481486   4.5% 
Unused bytes on overflow pages.................... 17740       61.9% 
Unused bytes on all pages......................... 16996752291   4.6% 

*** All tables ****************************************************************

Percentage of total database......................  44.7%    
Number of entries................................. 4881591576
Bytes of storage consumed......................... 165647638528
Bytes of payload.................................. 124189475776  75.0% 
Bytes of metadata................................. 38496387527  23.2% 
Average payload per entry......................... 25.44     
Average unused bytes per entry.................... 0.61      
Average metadata per entry........................ 7.89      
Average fanout.................................... 333.00    
Maximum payload per entry......................... 4030      
Entries that use overflow......................... 0            0.0% 
Index pages used.................................. 121285    
Primary pages used................................ 40320033  
Overflow pages used............................... 0         
Total pages used.................................. 40441318  
Unused bytes on index pages....................... 63087082    12.7% 
Unused bytes on primary pages..................... 2898688143   1.8% 
Unused bytes on overflow pages.................... 0         
Unused bytes on all pages......................... 2961775225   1.8% 

*** All indices ***************************************************************

Percentage of total database......................  55.3%    
Number of entries................................. 9669869512
Bytes of storage consumed......................... 205158604800
Bytes of payload.................................. 161512880277  78.7% 
Bytes of metadata................................. 29610747457  14.4% 
Average payload per entry......................... 16.70     
Average unused bytes per entry.................... 1.45      
Average metadata per entry........................ 3.06      
Average fanout.................................... 126.00    
Maximum payload per entry......................... 4033      
Entries that use overflow......................... 7            0.0% 
Index pages used.................................. 394746    
Primary pages used................................ 49692797  
Overflow pages used............................... 7         
Total pages used.................................. 50087550  
Unused bytes on index pages....................... 165165983   10.2% 
Unused bytes on primary pages..................... 13869793343   6.8% 
Unused bytes on overflow pages.................... 17740       61.9% 
Unused bytes on all pages......................... 14034977066   6.8% 


*** Table A and all its indices **************************************

Percentage of total database......................   0.36%   
Number of entries................................. 61907152  
Bytes of storage consumed......................... 1347289088
Bytes of payload.................................. 476164577   35.3% 
Bytes of metadata................................. 336657088   25.0% 
Average payload per entry......................... 7.69      
Average unused bytes per entry.................... 8.63      
Average metadata per entry........................ 5.44      
Average fanout.................................... 193.00    
Maximum payload per entry......................... 13        
Entries that use overflow......................... 0            0.0% 
Index pages used.................................. 1702      
Primary pages used................................ 327226    
Overflow pages used............................... 0         
Total pages used.................................. 328928    
Unused bytes on index pages....................... 2709763     38.9% 
Unused bytes on primary pages..................... 531757660   39.7% 
Unused bytes on overflow pages.................... 0         
Unused bytes on all pages......................... 534467423   39.7% 

*** Table A w/o any indices ******************************************

Percentage of total database......................   0.23%   
Number of entries................................. 30953576  
Bytes of storage consumed......................... 847339520 
Bytes of payload.................................. 176578512   20.8% 
Bytes of metadata................................. 242331668   28.6% 
B-tree depth...................................... 4         
Average payload per entry......................... 5.70      
Average unused bytes per entry.................... 13.84     
Average metadata per entry........................ 7.83      
Average fanout.................................... 185.00    
Non-sequential pages.............................. 101735      49.2% 
Maximum payload per entry......................... 9         
Entries that use overflow......................... 0            0.0% 
Index pages used.................................. 1113      
Primary pages used................................ 205757    
Overflow pages used............................... 0         
Total pages used.................................. 206870    
Unused bytes on index pages....................... 2335945     51.2% 
Unused bytes on primary pages..................... 426093395   50.6% 
Unused bytes on overflow pages.................... 0         
Unused bytes on all pages......................... 428429340   50.6% 


*** Table B and all its indices ****************************************

Percentage of total database......................   0.41%   
Number of entries................................. 92780529  
Bytes of storage consumed......................... 1501921280
Bytes of payload.................................. 1093102743  72.8% 
Bytes of metadata................................. 405432075   27.0% 
Average payload per entry......................... 11.78     
Average unused bytes per entry.................... 0.04      
Average metadata per entry........................ 4.37      
Average fanout.................................... 295.00    
Maximum payload per entry......................... 21        
Entries that use overflow......................... 0            0.0% 
Index pages used.................................. 1241      
Primary pages used................................ 365439    
Overflow pages used............................... 0         
Total pages used.................................. 366680    
Unused bytes on index pages....................... 286291       5.6% 
Unused bytes on primary pages..................... 3100171      0.21% 
Unused bytes on overflow pages.................... 0         
Unused bytes on all pages......................... 3386462      0.23% 

*** Table B w/o any indices ********************************************

Percentage of total database......................   0.20%   
Number of entries................................. 30926843  
Bytes of storage consumed......................... 758759424 
Bytes of payload.................................. 538805686   71.0% 
Bytes of metadata................................. 217693793   28.7% 
B-tree depth...................................... 4         
Average payload per entry......................... 17.42     
Average unused bytes per entry.................... 0.07      
Average metadata per entry........................ 7.04      
Average fanout.................................... 358.00    
Non-sequential pages.............................. 10           0.005% 
Maximum payload per entry......................... 21        
Entries that use overflow......................... 0            0.0% 
Index pages used.................................. 516       
Primary pages used................................ 184728    
Overflow pages used............................... 0         
Total pages used.................................. 185244    
Unused bytes on index pages....................... 271807      12.9% 
Unused bytes on primary pages..................... 1988138      0.26% 
Unused bytes on overflow pages.................... 0         
Unused bytes on all pages......................... 2259945      0.30%

(10) By Keith Medcalf (kmedcalf) on 2020-08-20 23:17:21 in reply to 9.1 [link] [source]

So you have an index on the "other column" in table A? So something like:

create table A
(
   id integer primary key,
   oc integer
);
create index idxA on A(oc);

is index idxA unique?

(11) By Richard Hipp (drh) on 2020-08-20 23:35:04 in reply to 9.1 [link] [source]

This seems to support your theory. You can see that there is a lot more unused space in the A table. Only 20% is actually storing content, whereas the B table has a 71% storage efficiency. Vacuuming table A probably would make it run about 3x to 4x faster.

(12) By EricG (EriccG) on 2020-08-21 06:02:53 in reply to 10 [link] [source]

Yes, that's it, and idxA is not unique.

Pretty much all the text/blob data is stored (and de-duplicated) in specific tables; all the other tables thus only hold relatively few integer IDs, numeric values or timestamps.

I recently added two composite indexes on other tables to speed up some critical queries (20 GB for each index, which is why they had not been added before...), with those indexes table B becomes unnecessary in many queries, but this placed table A straight back into many critical queries...

After running the analyzer, I guess a couple of tables and indexes are headed for the denormalization chopping block; I had not realized the indexes had grown so large.

Thanks

(13) By EricG (EriccG) on 2020-08-21 07:46:52 in reply to 11 [source]

Thanks! What would be the best way to vacuum that table using SQL commands as of 3.33.0 ?

I was thinking of running something like this in a transaction:

create temporary table defrag as select * from A;
delete from A;
insert into A select * from defrag;
drop table defrag;

I tried it and it fixed the "unused bytes" ratio, but it apparently reused the just-released non-sequential pages for the "insert", so there is still a very high number of non-sequential pages (analyzer results below), and performance is still quite a bit lower than for table B.

Even on an SSD, the thousands of scattered random accesses take a toll, and it takes 6 seconds with a non-primed SQLite cache to do a "select count(*) from A" for instance (vs 400 ms when primed).

By comparison "select count(*) from B" takes about 400ms with a non-primed cache, and 200 ms when primed.

Would using larger database pages help ? Or would it just compound the problem ?

*** Table A w/o any indices ******************************************

Percentage of total database...................... 0.11%
Number of entries................................. 30963748
Bytes of storage consumed......................... 417886208 
Bytes of payload.................................. 176640955   42.3% 
Bytes of metadata................................. 240452642   57.5% 
B-tree depth...................................... 3         
Average payload per entry......................... 5.70      
Average unused bytes per entry.................... 0.03      
Average metadata per entry........................ 7.77      
Average fanout.................................... 334.00    
Non-sequential pages.............................. 63580       62.3% 
Maximum payload per entry......................... 9         
Entries that use overflow......................... 0            0.0% 
Index pages used.................................. 305       
Primary pages used................................ 101718    
Overflow pages used............................... 0         
Total pages used.................................. 102023    
Unused bytes on index pages....................... 154461      12.4% 
Unused bytes on primary pages..................... 638150       0.15% 
Unused bytes on overflow pages.................... 0         
Unused bytes on all pages......................... 792611       0.19%

(14) By Richard Hipp (drh) on 2020-08-21 09:47:56 in reply to 13 [link] [source]

Total pages used went from 206870 down to 102023. Surely that helped some?

(15) By EricG (EriccG) on 2020-08-21 10:24:11 in reply to 14 [link] [source]

Yes, it helped, in that queries against A are now about 2 to 2.7 times slower than their equivalents against B, while they used to be 3-4 times slower.

It is not entirely satisfying though, given that table A is almost half the size of table B, so I was hoping queries against A would be somewhat faster than against B (and this is the case when querying both tables exported to a new DB).

I guess I will have to move that table to a separate database file, so that it can be vacuumed regularly on its own.

Thanks!

(16) By Keith Medcalf (kmedcalf) on 2020-08-21 15:30:37 in reply to 15 [link] [source]

Don't forget to drop and re-create the index as well.

(17) By EricG (EriccG) on 2020-08-21 16:03:02 in reply to 16 [link] [source]

You mean in addition to the vacuuming ?

(18) By Keith Medcalf (kmedcalf) on 2020-08-21 16:15:17 in reply to 17 [link] [source]

No. Vacuum will re-pack and consolidate the index as well.

However, your code for "compressing" the table by rebuilding it reorganizes the table but not the additional index. So if the additional index is being used for anything in the query then those pages are still not efficiently packed.

You need to drop and recreate the index, or do a reindex, to cause it to be rebuilt and packed.
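
For example, using the idxA definition from earlier in the thread:

-- rebuild the existing index in place
reindex idxA;

-- or, equivalently, drop and recreate it
drop index idxA;
create index idxA on A(oc);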

Unless, of course, the queries you are talking about are only accessing table A through the rowid ... and not accessing the alternate index at all.

(19) By EricG (EriccG) on 2020-08-22 05:52:26 in reply to 18 [link] [source]

Ah, ok, thanks.

I thought emptying the table would be enough to have the index rebuilt.

(20) By Keith Medcalf (kmedcalf) on 2020-08-22 07:11:16 in reply to 19 [link] [source]

There is a difference between "building as you go" and re-organizing the index. Emptying the table and refilling it will result in the same non-optimally packed index as existed before (in that the data was not loaded in-order as happens when you do a create index or a reindex).

(21) By Lewis (baiwfg2) on 2022-01-06 02:36:00 in reply to 3 [link] [source]

Hi, David.

AFAIK, autovacuum returns pages back to the OS at the time of committing the transaction. So every time a transaction commits, it will check whether there are valid pages at the end of the file that can be relocated (swapped with free pages ?) and the file thus truncated, right ? It will not guarantee that every commit results in such truncation, right ?