SQLite User Forum

Why are UPDATEs faster when the WHERE key is sorted? How to improve performance?

(1.1) By g00ds (164896512) on 2022-06-27 09:32:12 edited from 1.0

Hi there! I have a large database and a file with an ID-Name pair on each line, where the ID corresponds to the primary key of a table in the database. I need to write all the names from the file into the database using the corresponding IDs, but only if the ID already exists in the table (do not insert a new record if there is no user with that ID; just UPDATE the existing one).

I've written a simple Python script for this purpose and noticed that it runs extremely slowly, much slower than an UPDATE should be (let alone an INSERT). ID is the primary key, so it should be indexed automatically, and that index should keep UPDATEs fast, right?

Then I started experimenting with the script and found that UPDATEs are much faster when the IDs are, for example, constant or incrementing. The script I used for the experiments is:

    import sqlite3
    import traceback
    import time
    import random
    
    conn = sqlite3.connect('database.sqlite')
    
    c = conn.cursor()
    c.execute('BEGIN')
    
    PERIOD_SEC = 3
    
    prev_tp = 0
    counter = 0
    
    try:
        while True:
            tp = time.time()
            elapsed_sec = tp - prev_tp
            if elapsed_sec > PERIOD_SEC:
                prev_tp = tp
    
                speed = counter / elapsed_sec
    
                print('Speed: %.02f/sec' % (speed))
                counter = 0
    
            c.execute('UPDATE users SET username = ? WHERE id = ?', ('random', 555))
            counter += 1
    
    
        # Note: never reached, because the loop above never exits.
        c.execute('COMMIT')
    except Exception:
        c.execute('ROLLBACK')
        traceback.print_exc()

The value of username doesn't affect performance, so let it always be 'random'. In the example above the ID is constant and equal to 555 (the exact value doesn't matter; the speed is the same for large IDs too). For this example the speed is (UPDATEs per second):

Speed: 376665.88/sec
Speed: 404738.08/sec
Speed: 404942.04/sec
Speed: 403681.24/sec

If I replace

c.execute('UPDATE users SET username = ? WHERE id = ?', ('random', 555))

with

c.execute('UPDATE users SET username = ? WHERE id = ?', ('random', inc))

where inc is increased by 1 after every UPDATE, then the results are:

Speed: 233977.17/sec
Speed: 229630.93/sec
Speed: 223424.88/sec
Speed: 218353.56/sec

It's worse, but it's still not so bad.

But... if I use a random ID

c.execute('UPDATE users SET username = ? WHERE id = ?', ('random', random.randint(0, 1000000000)))

then the results become disappointing:

Speed: 514.18/sec
Speed: 732.49/sec
Speed: 886.28/sec
Speed: 999.34/sec

More than 100 times slower! I also noticed that the bigger the range of IDs I choose, the slower it becomes, even though all the IDs in that range exist in the table.

I also tried using prepared statements (executemany in Python). It improves performance a little, maybe 4 times faster, but for random IDs it's still very slow.

So my questions are:

  1. Why does this happen? I think it may be related to caching, but I don't know if it's normal. In real situations IDs are close to random, and as far as I know the performance is usually not as bad as mine.

  2. How can I improve performance, especially if I have an unsorted list of ID-Name pairs? Are there other options besides sorting the list and only then writing to the database?

Thanks in advance!

(2) By Donal Fellows (dkfellows) on 2022-06-27 10:58:25 in reply to 1.1

You don't say what the schema is of the table that you're running the UPDATE against, or approximately how many rows are in that table.

(6) By g00ds (164896512) on 2022-06-27 19:07:18 in reply to 2

I posted the database schema and other details under Gunter Hick's comment.

(3) By Gunter Hick (gunter_hick) on 2022-06-27 11:35:07 in reply to 1.1

First, you may find that preparing a statement and binding parameters is very much faster than re-preparing the statement for each set of parameters.

Second, repeatedly updating the same row means that the data is already in the cache; updating in sequence means a high likelihood that the required pages are already in the cache, except when you move to the next page; updating in random sequence requires a larger set of working pages to reach the same speed as staying within a limited set of pages.

Third, you are providing way too little information (like the actual schema, the number of records in the table, the indices, the size of the data, ...) for any meaningful analysis.
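
Illustrating the first point (and the OP's own idea of sorting the IDs before writing), a minimal sketch, assuming a users table with id and username columns as used elsewhere in this thread; the input pairs are made up for illustration:

    import sqlite3

    # Hypothetical input: the (ID, name) pairs read from the OP's file.
    pairs = [(555, 'alice'), (42, 'bob'), (1000000, 'carol')]

    # Sorting by ID makes the UPDATEs walk the table's B-tree in order,
    # i.e. the fast "incremental" case from the measurements above.
    pairs.sort(key=lambda p: p[0])

    conn = sqlite3.connect('database.sqlite')
    with conn:  # one transaction for the whole batch
        # executemany() prepares the statement once and re-binds the
        # parameters for every row instead of re-parsing the SQL each time.
        conn.executemany('UPDATE users SET username = ? WHERE id = ?',
                         [(name, id_) for id_, name in pairs])
    conn.close()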

(5) By g00ds (164896512) on 2022-06-27 19:05:42 in reply to 3

I created a sample database with a simple schema to demonstrate the problem. The database is generated using this script: here it is. There are no indexes. In total the table contains 1.5 billion records, and the database file is 16.6 GB.

The schema (PRAGMA table_info output):

0|id|INTEGER|0||1
1|username|TEXT|0||0
2|score|INTEGER|0||0

I've written a simple script for testing UPDATE performance: Link. Uncomment the lines starting with "batch = [(123456" to choose which kind of ID to use: constant, incremental or random.

The results:

Constant ID:

Speed: 988930.78/sec (UPDATEs/sec)
Speed: 1114647.55/sec (UPDATEs/sec)
Speed: 972525.72/sec (UPDATEs/sec)
Speed: 980894.68/sec (UPDATEs/sec)
Speed: 1015316.88/sec (UPDATEs/sec)

Incremental ID:

Speed: 1045197.19/sec (UPDATEs/sec)
Speed: 956844.17/sec (UPDATEs/sec)
Speed: 972709.29/sec (UPDATEs/sec)

Random ID:

Speed: 23985.74/sec (UPDATEs/sec)
Speed: 25132.00/sec (UPDATEs/sec)
Speed: 25622.54/sec (UPDATEs/sec)
Speed: 26498.24/sec (UPDATEs/sec)
Speed: 27205.41/sec (UPDATEs/sec)

Random IDs are more than 50 times slower than the others. It seems to be related to the cache, but are there any ways to optimize it anyway? Creating the database takes less than 30 minutes, but updating values at random IDs will take at least 24 hours or even more.

Any ideas? Hope for your help!

(9) By Keith Medcalf (kmedcalf) on 2022-06-27 21:54:04 in reply to 5

Repetitive update of the same record aside (which should occur at the maximum speed that your computer is capable of attaining), I get the same results (half a million TPS) for the incremental and random IDs.

The difference may be in your table definition (which you still have not shown).

A table definition means the CREATE TABLE statement which was used to create the table.
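
One way to get it from Python (assuming the table is called users, as in your UPDATE statements):

    import sqlite3

    conn = sqlite3.connect('database.sqlite')
    # sqlite_master keeps the original CREATE TABLE text for every table.
    row = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'users'"
    ).fetchone()
    print(row[0] if row else 'no such table')
    conn.close()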

(10) By Keith Medcalf (kmedcalf) on 2022-06-27 22:05:54 in reply to 5

I would also note that every update to the database requires deleting and re-allocating the entire row, since you created the database with nothing other than the rowid and you "insert" the data when you "update" it (as you call it).

This is absolutely the most pernicious example, and you should expect the absolute worst possible performance (not representative of a functional system or database).
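
The generation script itself was not posted; a hypothetical load that would produce such rowid-only rows (column names taken from the PRAGMA output above, everything else assumed) might look like:

    import sqlite3

    conn = sqlite3.connect('database.sqlite')
    # Hypothetical reconstruction: only the INTEGER PRIMARY KEY is filled in,
    # so each row is little more than its rowid, and the first real UPDATE
    # then has to grow and re-allocate the whole row.
    conn.execute('CREATE TABLE IF NOT EXISTS users '
                 '(id INTEGER PRIMARY KEY, username TEXT, score INTEGER)')
    with conn:
        # 1,000,000 rows here for illustration; the OP's table has 1.5 billion.
        conn.executemany('INSERT INTO users (id) VALUES (?)',
                         ((i,) for i in range(1, 1_000_001)))
    conn.close()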

(11) By Gunter Hick (gunter_hick) on 2022-06-28 06:28:44 in reply to 5

Your initial load does not enter any data, so (nearly) every record's internal representation follows the pattern (int,NULL,NULL;<id>), where the first three elements are the "manifest" (the list of what is there) and everything after the semicolon is the payload (in this case, just the variable-length id).

Since this is an ordered load, you would typically end up with leaf pages 90% filled with adjacent records and internal pages 90% filled with page references in a balanced tree of minimal height.

1) update of the same record multiple times

The first update requires rewriting the single leaf page that contains the record, so that the longer (int,NULL,int;<id>,<score>) record gets squeezed in. Nothing else changes. The whole transaction is exactly 1 page long and gets written to disk about twice (journal and database).

The transaction needs to read H pages, where H is the height of the tree including the leaf pages. Total I/O H+2

2) update of adjacent records

The first N updates (where N is a function of the database page size) will require rewriting the same leaf page N times; since it is 90% full to begin with, it will not be able to hold the additional data, meaning it will be split into two pages; this adds an entry to the parent node and a page to the database file.

The next N updates will require rewriting the next leaf page N times; that page will also be split, adding another entry to the parent node.

In total, B/N original leaf pages (where B is the batch size), B/N additional leaf pages and some number of internal pages need to be written (twice).

So you will be reading and writing a little more than 2*B/N pages, total I/O 6*B/N pages

3) update of random records

Each update will require H pages to be read and 1 page to be written (twice), with some reads being satisfied by the cache. Some pages may be modified more than once, but probably not often enough to force a split.

Total I/O B*(H+2) pages, minus cache hits
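
To put very rough numbers on that accounting (H, N and B below are purely illustrative assumptions, not measured values):

    # Illustrative only: assumed tree height, records per leaf, batch size.
    H = 4            # height of the B-tree, including the leaf level
    N = 100          # records per leaf page
    B = 1_000_000    # number of UPDATEs in the batch

    io_same_row   = H + 2        # case 1: one row updated repeatedly
    io_sequential = 6 * B // N   # case 2: adjacent rows
    io_random     = B * (H + 2)  # case 3: random rows, ignoring cache hits

    print(io_same_row, io_sequential, io_random)   # 6 60000 6000000
    # The random case does roughly 100x the page I/O of the sequential case,
    # the same order of magnitude as the slowdown measured above.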

On a Linux system, you should be able to run the script under strace to count the number of actual I/O operations.

What you have not yet seen is the effect of the username field. This will change the internal format to (int,text(n),int;<id>,<name>,<score>), and an update of score requires skipping over the variable-length name field. You should consider putting TEXT (and BLOB) fields at the end of the record definition.
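
A sketch of that last suggestion, using the column names from this thread (your actual CREATE TABLE statement has not been posted, so this layout is an assumption; the new table name is just for illustration):

    import sqlite3

    conn = sqlite3.connect('database.sqlite')
    # Hypothetical layout: fixed-width INTEGER columns first, the
    # variable-length TEXT column last, so that accessing score does not
    # require skipping over username inside the record.
    conn.execute('CREATE TABLE IF NOT EXISTS users_v2 '
                 '(id INTEGER PRIMARY KEY, score INTEGER, username TEXT)')
    conn.close()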

(4) By Bill Wade (billwade) on 2022-06-27 11:58:52 in reply to 1.1

You don't ever commit in your infinite loop, so SQLite only needs to go to disk to:

  • Get something it doesn't already have in memory.
  • Free up memory (toss something it doesn't want to remember, or make room for something new).

For the first case, not counting the first time through the loop, it likely doesn't need to go to disk at all, and it goes through your loop in about 2.5e-6 seconds.

For the second case, it will want to read/write logically consecutive records in a tree. On most systems, sequential I/O is fast, but it still cuts your speed in half, to about 5e-6 seconds per iteration. Each write to disk covers dozens to hundreds of your updates, on average.

For the third case, you are accessing and modifying items more or less at random. On average, each disk I/O accomplishes slightly less than one update (the record itself sits on a leaf page, and additional I/O is needed for the interior pages of the tree).

Change your tests to do raw I/O. Make a big file, then:

  • Repeatedly read and write 4k at the same offset.
  • Repeatedly read and write 4k sequentially.
  • Repeatedly read and write 4k at random locations.

You'll see a similar progression in timings.
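
A rough Python sketch of such a test (file name, file size and block size are arbitrary assumptions; for a meaningful comparison the file should be larger than your RAM, otherwise the OS page cache hides most of the difference):

    import os, random, time

    PATH = 'bigfile.bin'
    FILE_SIZE = 1 << 30     # 1 GB here; use something larger than RAM in practice
    BLOCK = 4096

    # Create the big (sparse) file once.
    if not os.path.exists(PATH):
        with open(PATH, 'wb') as f:
            f.truncate(FILE_SIZE)

    def run(name, next_offset, iterations=20000):
        with open(PATH, 'r+b') as f:
            start = time.time()
            for i in range(iterations):
                offset = next_offset(i)
                f.seek(offset)
                data = f.read(BLOCK)
                f.seek(offset)
                f.write(data)
            f.flush()
            os.fsync(f.fileno())
        print('%-10s %.2f ops/sec' % (name, iterations / (time.time() - start)))

    run('same',       lambda i: 0)                                    # same offset
    run('sequential', lambda i: (i * BLOCK) % FILE_SIZE)              # sequential
    run('random',     lambda i: random.randrange(FILE_SIZE // BLOCK) * BLOCK)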

(7) By g00ds (164896512) on 2022-06-27 19:42:35 in reply to 4

So you think the only way to improve performance is to sort the IDs?

(8) By anonymous on 2022-06-27 20:22:01 in reply to 7

If your set of random IDs spans the whole key space, then it will surely slow down, as it will need to fetch a lot of data from disk.

That said, it might be beneficial to give it a large amount of cache. There will probably still be evictions, but you could get value from many requests hitting the cache. For that I would use mmap and set it to the highest reasonable value for your system, e.g. PRAGMA mmap_size = 4294967296; (4 * 1024 * 1024 * 1024 bytes) to create a 4 GB mmap.
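
For example, from Python (the sizes below are arbitrary; tune them to the RAM you have, and note that the effective mmap size may be capped by the library's compile-time limit):

    import sqlite3

    conn = sqlite3.connect('database.sqlite')
    # Map up to 4 GB of the database file into the process address space.
    conn.execute('PRAGMA mmap_size = %d' % (4 * 1024 * 1024 * 1024))
    # A negative cache_size is interpreted as KiB; this asks for roughly
    # 2 GB of page cache.
    conn.execute('PRAGMA cache_size = -2097152')
    conn.close()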

Hope that helps