Network filesystems for sqlite DBs

Firstly, iSCSI is not a filesystem but rather a block transport protocol.  Where is the filesystem located?  It does not matter whether the SCSI cable is a couple of inches or thousands of miles long, nor whether it is parallel, serial, or glass fiber; nor does it really matter whether the "transport" layer for the SCSI protocol is differential direct signalling, single-ended direct signalling, phase-shift or pulse-modulated signalling, or some other transport such as Ethernet, IP, IPX/SPX, DECnet, GPIB, TCP/IP, UDP/IP, postal mail, or secretaries and airplanes.  iSCSI simply transports blocks between a "storage device" and "a filesystem".

What is important is the filesystem: where it is located and whether it behaves correctly, not the means by which it moves blocks to and from the storage device.

Secondly, what you are seeing with respect to "RAID" is probably due to the fact that the filesystem layered on top of the RAID controller is either (a) choosing performance over reliability by "cheating", or (b) sitting on a professional-grade block-caching controller that costs quite a lot of money, with that high price buying "performance" while still remaining "reliable" (uncommon except on very expensive stand-alone RAID controllers).
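
One rough way to check whether a caching layer is "cheating" on durability is to time synchronous flushes.  Below is a minimal sketch, assuming a POSIX system and Python; the scratch file name is hypothetical and the millisecond rule of thumb only applies to rotating media:

```python
import os, time

# Rough probe: time how long fsync() takes after a small write.
# On rotating media a genuinely durable flush costs on the order of
# milliseconds; if fsync() routinely returns in a few microseconds,
# the write is probably only reaching a volatile cache.
path = "fsync_probe.tmp"   # hypothetical scratch file on the volume under test
fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
try:
    samples = []
    for _ in range(100):
        os.write(fd, b"x" * 512)
        start = time.perf_counter()
        os.fsync(fd)
        samples.append(time.perf_counter() - start)
    samples.sort()
    print("median fsync latency: %.3f ms" % (samples[len(samples) // 2] * 1000))
finally:
    os.close(fd)
    os.unlink(path)
```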

Thirdly, you do not say whether your NFS tests are actually distributed.  That is, each one of your "testbeds" must be executing on a separate machine from the others, and each of those separate from the machine "serving" the filesystem; otherwise your results are not actually testing arbitration of multiple distributed accessors against a remote filesystem.
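
For what it is worth, here is a minimal sketch of one such writer, assuming Python's built-in sqlite3 module and a hypothetical path on the shared mount; a copy of this would have to run on each separate client machine for the test to mean anything:

```python
import sqlite3, os, time

# One writer process; run a copy of this on EACH client machine,
# all pointing at the same file on the shared (e.g. NFS) mount.
DB_PATH = "/mnt/shared/test.db"            # hypothetical NFS-mounted path
WRITER_ID = os.environ.get("WRITER_ID", "unknown")

con = sqlite3.connect(DB_PATH, timeout=30)  # wait up to 30 s on busy locks
con.execute("CREATE TABLE IF NOT EXISTS log(writer TEXT, seq INTEGER, ts REAL)")
con.commit()

for seq in range(1000):
    with con:  # each iteration is its own transaction
        con.execute("INSERT INTO log VALUES (?, ?, ?)",
                    (WRITER_ID, seq, time.time()))

# Afterwards, check integrity and count rows per writer to see
# whether any commits were lost or the file was corrupted.
print(con.execute("PRAGMA integrity_check").fetchone()[0])
con.close()
```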

To summarize, the root of the issue for *all* systems which have multiple accessors of a file located on a "remote" filesystem is the overhead and work required to maintain consistency of the "remote view" of that file (and a directory is nothing more than a file containing directory entries, and a filesystem a directory file of other directory files) amongst multiple accessors located on different machines.  Maintaining this consistency with "full reliability" is extremely expensive and slow.  Because a "remote shared filesystem" is generally used to give ONE client access to ONE FILE at a time, not to arbitrate multiple access to the same file, most "remote filesystems" take lots of shortcuts and are ill-prepared to deal with multiple access to the same file -- that is, they assume the usage pattern will be single access per file, are optimized for that scenario, and are inherently unreliable (or very slow) when they must arbitrate multiple access to the same file (unless that access is read-only).
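
SQLite's default unix VFS arbitrates accessors with POSIX advisory byte-range locks, which is exactly the machinery many remote filesystems short-cut.  Here is a minimal probe, assuming a POSIX client and a hypothetical path on the shared mount; run it from two different client machines at once and see whether the second run is actually refused:

```python
import fcntl, os, sys

# Probe whether the mount honours POSIX byte-range locks, which
# SQLite's default unix VFS relies on to arbitrate multiple accessors.
# Run this on one machine, then run it again from a SECOND machine
# while the first is still holding the lock: the second run should
# report "lock refused".  If it reports "lock granted", cross-machine
# lock arbitration on this mount is broken.
PATH = "/mnt/shared/lock_probe"        # hypothetical path on the shared mount

fd = os.open(PATH, os.O_RDWR | os.O_CREAT, 0o644)
try:
    fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    print("lock granted -- holding it; press Enter to release")
    sys.stdin.readline()
except OSError:
    print("lock refused -- another holder exists and the lock is enforced")
finally:
    os.close(fd)
```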

Consider that when a "local filesystem" has to maintain a consistent view of a single file between multiple accessors, the "turn around" latency for doing so is relatively tiny -- it is local, after all -- and often measured in nanoseconds.  But to do the same thing over a network requires adding a "network turnaround latency" to and from each distributed node accessing the file, for each operation, to achieve the same result.  This is a "hard problem" and most solutions heavily trade "reliability" for "opportunism".
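
As a back-of-envelope illustration only (the latencies and lock counts below are assumptions, not measurements), the added round trip dominates as soon as every lock transition has to cross the wire:

```python
# Illustrative arithmetic only: assumed latencies, not measurements.
local_lock_us  = 2     # assumed cost of a local lock/unlock pair (microseconds)
network_rtt_us = 500   # assumed LAN round trip (0.5 ms)
locks_per_txn  = 4     # assumed lock transitions per write transaction

local_txn_us  = locks_per_txn * local_lock_us
remote_txn_us = locks_per_txn * (local_lock_us + network_rtt_us)

print("locking overhead per transaction, local : %6d us" % local_txn_us)
print("locking overhead per transaction, remote: %6d us" % remote_txn_us)
print("slowdown from lock traffic alone        : %6.0f x" % (remote_txn_us / local_txn_us))
```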

This is why just about every single database product requires the filesystem to be "local" or, if remote, then accessed "exclusively" by a single client, in order to work reliably.
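
SQLite exposes that trade-off directly: if one client truly has the remote file to itself, the documented locking_mode pragma lets the connection take the file lock once and keep it.  A minimal sketch, assuming Python's sqlite3 module and a hypothetical path, and only safe when nothing else touches the file:

```python
import sqlite3

# Only safe if this is the ONLY process, on any machine, touching the file.
con = sqlite3.connect("/mnt/shared/app.db")   # hypothetical remote path
con.execute("PRAGMA locking_mode=EXCLUSIVE")  # hold the lock for the connection's lifetime
con.execute("CREATE TABLE IF NOT EXISTS t(x)")
with con:
    con.execute("INSERT INTO t VALUES (1)")
con.close()
```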