SQLite Forum

memory vs mmap
Thanks for the technical information, but I don't understand why you expect variable results from the same data file, i.e. sometimes 2.5X slower than memory (in short runs) and sometimes 20X slower (in long runs).

Basically, if we use an input file with 1000 records in it and it takes 1 minute (while accessing the read-only 1.6 GB database on a network drive), then I expect that processing 10,000 records should take 10 minutes.

But that is not what we see. For a single job on a single machine, every time we process larger amounts of data it starts getting "exponentially" slower.
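To put numbers on that expectation, here is a minimal sketch of the linear-scaling arithmetic. The baseline (1000 records in 1 minute) is the figure quoted above; the function names are mine and purely illustrative, and any measured time you feed in would have to come from your own runs.

```python
def expected_minutes(records, baseline_records=1000, baseline_minutes=1.0):
    """Run time predicted by linear scaling from one baseline measurement."""
    return baseline_minutes * records / baseline_records


def slowdown(measured_minutes, records):
    """How many times slower a run was than linear scaling predicts."""
    return measured_minutes / expected_minutes(records)


if __name__ == "__main__":
    # 10,000 records should take ten times as long as 1000 records.
    print(expected_minutes(10_000))
```

A `slowdown` value that stays near 1 as the input grows means the job scales as expected; a value that keeps climbing with input size is the symptom described here.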

It is somewhat amusing that this word was used to report the problem instead of focusing on the numbers. The developer thought it was an odd complaint and said that yes, it will obviously increase exponentially as you give it larger input files. But when I was presented with the numbers myself and asked him what his expectations were (like above), he did in fact expect it to SCALE, and he agreed that "exponential" was the wrong word to use.

I then told him IT WASN'T SCALING AS EXPECTED. Only then did he actually start investigating the problem, and it was he who somehow thought to try putting the database in /tmp, which he somehow knew wasn't on the network.
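For anyone wanting to repeat that experiment, a rough sketch of the idea in Python follows. It assumes the database really is read-only and not being written while it is copied, and that the system temp directory is on local disk (true of /tmp on our machines, not necessarily on yours). The helper name `open_local_copy` is hypothetical, not anything from our actual job.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


def open_local_copy(network_db, local_dir=None):
    """Copy a read-only database to local disk and open the copy read-only."""
    # Default to the system temp directory (assumed to be local, not NFS).
    local_dir = Path(local_dir or tempfile.gettempdir())
    local_db = (local_dir / Path(network_db).name).resolve()
    # One sequential copy over the network instead of many small page reads.
    shutil.copy2(network_db, local_db)
    # mode=ro in a URI filename makes SQLite refuse any writes to the copy.
    return sqlite3.connect(local_db.as_uri() + "?mode=ro", uri=True)
```

The job then runs its queries against the returned connection exactly as before; only the location of the file changes.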

I had previously been telling him that disks are cached in memory, so it isn't an issue with disk speed. I didn't expect that in 2021 I would be in a commercial environment where we had uncached disks, which effectively bypasses decades of ridiculously high CPU clock speeds and ridiculous amounts of memory. I started using SMARTDRV.SYS in 1991 with MS-DOS 5.0. Within a few years, the OSes started doing the caching automatically and unconditionally, with no scope for user configuration, so that was the end of my interest in disk caching.

Until someone said "NFS3" and someone else said "delegations".