
memory vs mmap
> If network disks were massively slower than local hard disks, …it would be a well-known problem.

It *is* a well-known problem. Maybe not well-known to you and yours, but I can dig up references from the dawn of network-filesystem-based computing in the early 1980s for this, if you require it.

> The network team didn't believe that a local disk was faster than a network disk

If you're testing matching storage on both sides of the connection, then of course the local disk will not be any faster than the actual remote *disk*, but the remote disk isn't attached to the local CPU in the same way the local disk is. There is a whole *pile* of other considerations stemming from that fact:

1. Local NVMe latency is going to be orders of magnitude lower than ping time to the remote NAS on speed-of-light delays alone. The path between the NVMe connector and the CPU is maybe a few cm long. The Ethernet cabling, switch fabric, internal MAC-to-CPU link, etc. likely comes to tens of meters, so you're orders of magnitude off already. Ya canna change the laws o' physics, cap'n! [A foot of copper is a nanosecond](https://hackaday.com/2012/02/27/visualizing-a-nanosecond/), and at today's speeds, nanoseconds matter. (There's a back-of-envelope sketch of the propagation numbers after this list.)

    Once upon a time, speed-of-light delays only affected computer design at the extreme high end, but we've been at that point down at the low end, too, for a long time now. If you did nothing to a Raspberry Pi but double its physical size, scaling everything on that board evenly, it would *cease…to…work!*

2. The "stack" depth of a bare NVMe disk is a lot shallower than for

          Ethernet+IP+UDP+RPC+NFS+kernel+NVMe+kernel+NFS+RPC+UDP+IP+Ethernet

    Every packet of data to a typical remote NAS has to go clear down through the TCP/IP stack into the fileserver software, then back up that same long stack to return the result.

    And that's only for in-kernel network filesystems. If you're dealing with a userspace NAS technology like Samba, you have a kernel-to-userspace transition and probably a deeper file server code stack besides.

3. Cooperation delays aren't zero. You speak of having 10 readers, but at bare minimum that must cut per-reader bandwidth by 10 on the remote side, and it's only that low if there is no arbitration among the readers at all, so that every reader can proceed in parallel and the ÷10 factor comes only from the shared disk bandwidth.
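
To make point 1 concrete, here's a minimal back-of-envelope sketch. The distances are assumptions (roughly 5 cm of board trace to the local NVMe drive, roughly 30 m of cabling and switch fabric to a nearby NAS), and signal propagation in copper is taken as about 2/3 the speed of light; the exact figures don't matter, only the ratio.

```python
# Back-of-envelope propagation delay, nothing else: no controller, switch,
# or protocol-stack overhead is counted, so reality is worse for the network.
C = 299_792_458            # speed of light in vacuum, m/s
SIGNAL_SPEED = 2 / 3 * C   # rough signal speed in copper/fiber, m/s

def one_way_delay_ns(distance_m: float) -> float:
    """Pure speed-of-light propagation delay over distance_m, in ns."""
    return distance_m / SIGNAL_SPEED * 1e9

local_nvme_m = 0.05   # assumed: ~5 cm of trace from NVMe connector to CPU
nas_path_m = 30.0     # assumed: cabling + switch hops to a nearby NAS

print(f"local NVMe trace: {one_way_delay_ns(local_nvme_m):7.2f} ns one way")
print(f"network path:     {one_way_delay_ns(nas_path_m):7.2f} ns one way")
```

Even on propagation alone, that's a factor of several hundred, before any controller, switch, or protocol overhead enters the picture.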

Let's get concrete. I just did a ping time test here to a nearby NAS on a quiet gigabit LAN and got an average of 350 µs over 25 packets. The read latency on a good NVMe SSD is likely to be 10 times lower than that, and with the ping test, you haven't even *considered* disk access latency yet.

If your own tests tell you there's an order of magnitude or two of difference between two test cases, *believe the test result*; don't resort to argument-from-incredulity to talk yourself into believing that the test was wrong.
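
If you want to reproduce that kind of comparison yourself, here's a minimal sketch of a side-by-side read-latency test: time a pile of small random reads against a file on the local NVMe disk and against a same-sized file on the NFS mount, then compare medians. The paths are placeholders, and a rigorous version would use O_DIRECT and drop the page cache between runs, but even this crude form tends to show the same kind of gap.

```python
# Crude small-read latency comparison: local file vs. file on the NFS mount.
import os, random, statistics, time

BLOCK = 4096      # read size, bytes
SAMPLES = 1000    # number of random reads per file

def median_read_latency_us(path: str) -> float:
    """Median latency of SAMPLES random BLOCK-byte reads, in microseconds."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        samples = []
        for _ in range(SAMPLES):
            offset = random.randrange(0, max(size - BLOCK, 1))
            start = time.perf_counter_ns()
            os.pread(fd, BLOCK, offset)
            samples.append((time.perf_counter_ns() - start) / 1000)
        return statistics.median(samples)
    finally:
        os.close(fd)

# Placeholder paths: point these at same-sized test files on each device.
for label, path in [("local NVMe", "/var/tmp/testfile"),
                    ("NFS mount", "/mnt/nas/testfile")]:
    print(f"{label:10s}: {median_read_latency_us(path):8.1f} µs median")
```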

> The characteristics of the network disk is that in the beginning, maybe for a couple of minutes, it is about 2.5 times slower than a local disk

I'm stunned it's that low! I'd consider that wild success.

> over time it ends up being 20 times slower

*Just* a single order of magnitude speed drop? If after reading my arguments above you think that's unreasonable, go read them again. Point 1 alone shows you're getting an excellent result.

> it has no way of knowing if another system updated the file.

As others have shown, that's a solvable problem, but you can't solve it for free. All coordination I/O has to go between the nodes somehow, and that has a cost.
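
For illustration only, here's one way that detection *can* be done with SQLite in rollback-journal mode: the database header keeps a 4-byte file change counter at offset 24 that gets bumped on every committed write, so re-reading it tells you whether another node has modified the file. The path below is a made-up placeholder, NFS client caching can delay when the new counter value becomes visible, and the larger point stands regardless of mechanism: over NFS, every one of those checks is itself at least one network round trip, which is exactly the coordination cost that can't be made free.

```python
# Minimal sketch: poll the SQLite file change counter (offset 24 of the
# 100-byte database header, big-endian 32-bit) to notice writes from other
# nodes. Each call is a read of the file on the NAS -- at least one network
# round trip -- so the coordination isn't free.
import struct

DB_PATH = "/mnt/nas/app.db"   # hypothetical NFS-mounted database file

def change_counter(path: str) -> int:
    """Read the file change counter from the SQLite database header."""
    with open(path, "rb") as f:
        f.seek(24)
        return struct.unpack(">I", f.read(4))[0]

baseline = change_counter(DB_PATH)
# ... later, before trusting anything cached locally:
if change_counter(DB_PATH) != baseline:
    print("another node wrote to the database; local cache is stale")
```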

> the conclusion has been drawn that it is a problem with NFS3 and can't be changed.

NFS certainly has problems, but you can't charge all of the problems you're seeing to a network protocol alone. You can invent the bestest baddest speediest network file sharing protocol, and you'll still have the speed-of-light delays and coordination overhead among cooperating nodes.

TANSTAAFL.