SQLite Forum

slow import of CSV files
Login
This is extremely faked, but notice the difference between in-order insertion and out-of-order insertion:

```
>sqlite sample.db
SQLite version 3.34.0 2020-09-26 18:58:45
Enter ".help" for usage hints.
sqlite> create table x(x text primary key not null);
sqlite> .timer on
sqlite> pragma cache_size=1000;
Run Time: real 0.000 user 0.000000 sys 0.000000
sqlite> insert into x select random() from wholenumber where value between 1 and 10000000;
Run Time: real 197.973 user 36.953125 sys 160.046875
sqlite> .exit
>dir sample.db
2020-09-26  18:22       601,960,448 sample.db
>del sample.db
>sqlite sample.db
SQLite version 3.34.0 2020-09-26 18:58:45
Enter ".help" for usage hints.
sqlite> pragma cache_size=1000;
sqlite> create table x(x text primary key not null);
sqlite> .timer on
sqlite> insert into x select random() from wholenumber where value between 1 and 10000000 order by 1;
Run Time: real 14.937 user 17.078125 sys 2.843750
```

In the first case the insertion is in "random" order, meaning that the index structures must be continually updated and the B-Tree rebalanced as the insert progresses.

In the second case the insertion is "in-order" so the B-Tree is always built in-order with a minimum of rebalancing as the insert progresses.

Note that I have reduced the cache to itty-bitty (which is the default) since this will have an devastatingly huge impact on the amount of I/O performed (I have a cruise-ship-load of RAM for caching, so without this the difference is less noticeable).

Even with the sort (order by clause) the in-order insertion is more than 10 times faster than the out-of-order insertion.