sector flush atomic
(1) By sqlite3_preupdate_count (XiongZaiBingGan) on 2021-01-02 06:34:54 [link] [source]
If sector writes are linear at the disk driver layer, is an SQLite transaction still atomic?
Consider this case: when the page count is saved to the journal file (in the journal header), the page count is 4 bytes in size. Suppose power is lost after the first 2 bytes have been saved to the journal file but before the last 2 bytes are saved. When power is restored, the page count in the journal header holds only the first 2 bytes of the value; the last 2 bytes are missing. Rollback then rewrites only the number of pages given by that truncated count to the database file. Yeah, a critical error happens.
Is my understanding right?
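The scenario in this question can be made concrete with a small simulation. This is only a sketch of the hypothetical byte-level tear being asked about, not of anything SQLite does: the helper `torn_write`, the old field value of 0, and the new count of 70000 are all assumptions chosen for illustration. The field is packed big-endian, as the journal header's page count is.

```python
import struct

def torn_write(old: bytes, new: bytes, bytes_written: int) -> bytes:
    """Return what would be on disk if only the first `bytes_written`
    bytes of `new` landed and the rest still held `old` (hypothetical)."""
    return new[:bytes_written] + old[bytes_written:]

# Assumed values: the field held 0 before, and the true count is 70000.
old_field = struct.pack(">I", 0)        # 00 00 00 00
new_field = struct.pack(">I", 70000)    # 00 01 11 70

# Power is lost after 2 of the 4 bytes were written.
on_disk = torn_write(old_field, new_field, 2)   # 00 01 00 00
recovered_count = struct.unpack(">I", on_disk)[0]

print(recovered_count)  # 65536, not 70000
```

If such a tear could happen, rollback would see 65536 pages instead of 70000, which is the failure the question describes.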
(2) By Larry Brasfield (LarryBrasfield) on 2021-01-02 12:33:23 in reply to 1 [source]
While it is unclear what you mean by "critical error happens", I can say that you are overlooking critical aspects of block storage devices. One aspect is that at the smallest block level, writes are effectively of whole blocks at once. For non-volatile memory, this is because the block's group of bytes are set to the new value in a single operation. For spinning (or sliding) magnetic media, the group is concatenated with an error check code (sometimes called "CRC") such that the sequence of bits constituting the block will be read as erroneous unless the whole block with its check code is written. Another aspect is that "power loss" is not ordinarily an instantaneous event. (A sledge hammer crashing down upon a circuit board could be an extraordinary event producing an effectively instantaneous loss of power to critical circuits.) Ordinarily, after line power ceases or a battery becomes low enough to indicate imminent power loss, a system has a certain amount of stored energy downstream of the "lost" power source, and that stored energy is sufficient to complete one of those block write operations. (And, for spinning disc drives, there is enough to also move the read/write heads to a place where they will not scrape the magnetic medium as the disc slows to a stop.)
Hence, your worry about writes being abruptly terminated at the byte-by-byte level is overwrought, provided sledge hammers (or equivalents) are not involved.
(3) By sqlite3_preupdate_count (XiongZaiBingGan) on 2021-01-02 15:07:01 in reply to 2 [link] [source]
Thanks for your careful explanation. I get it. ^_^
(4) By Keith Medcalf (kmedcalf) on 2021-01-02 21:15:43 in reply to 3 [link] [source]
A partial write to a single sector or of a filesystem block (consisting of multiple sectors) is called a torn write.
Torn write handling is also described in the documentation at https://sqlite.org/atomiccommit.html for the standard journal mode. There are links on that page to how torn page / torn block errors are handled when using write-ahead logging.
SQLite3 basically makes no distinction between a torn write that renders a sector unreadable (because the hardware FEC and CRC data are corrupted) and a torn write that falls on a sector boundary within a filesystem block (which the hardware cannot detect); it takes precautions that are effective against both.
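One such precaution is a per-record checksum in the rollback journal, which lets rollback recognise a record that was only partly written. The sketch below illustrates the idea only: the record layout (page number, page data, checksum) follows the journal format, but the additive checksum, the names `make_record` and `record_is_valid`, the nonce values, and the page and sector sizes are assumptions for illustration and differ in detail from SQLite's real checksum.

```python
import struct

PAGE_SIZE = 1024   # assumed page size
SECTOR = 512       # assumed sector size

def checksum(data: bytes, nonce: int) -> int:
    # Stand-in checksum: nonce plus the sum of all data bytes, mod 2**32.
    return (nonce + sum(data)) & 0xFFFFFFFF

def make_record(pgno: int, data: bytes, nonce: int) -> bytes:
    # Record layout: 4-byte page number, page data, 4-byte checksum.
    return struct.pack(">I", pgno) + data + struct.pack(">I", checksum(data, nonce))

def record_is_valid(record: bytes, nonce: int) -> bool:
    data = record[4:-4]
    (stored,) = struct.unpack(">I", record[-4:])
    return stored == checksum(data, nonce)

# Stale bytes left in the file by an earlier journal (different nonce).
stale = make_record(7, b"A" * PAGE_SIZE, nonce=0x9999)
# The record the current transaction intended to write.
fresh = make_record(7, b"B" * PAGE_SIZE, nonce=0x1234)

# Torn write on a sector boundary: only the first sector of `fresh` landed.
torn = fresh[:SECTOR] + stale[SECTOR:]

print(record_is_valid(fresh, 0x1234))  # True
print(record_is_valid(torn, 0x1234))   # False
```

The torn record is readable at the hardware level, since every sector in it is intact, yet the checksum mismatch still exposes it, so rollback can refuse to copy it into the database.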
There is a capability for the underlying OS to indicate that it is tear-proof, but correctly implementing tear-proof storage is very expensive, so very few OSes or filesystems even bother to indicate it even if it is present.
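Assuming the capability meant here is the set of `SQLITE_IOCAP_ATOMIC*` device-characteristics flags that a VFS can report through its `xDeviceCharacteristics` method, the bitmask can be decoded as sketched below. The flag values are the ones defined in `sqlite3.h`; the function `atomic_write_sizes` is a hypothetical helper, not part of any SQLite API.

```python
# Flag values as defined in sqlite3.h.
SQLITE_IOCAP_ATOMIC = 0x0001      # writes of any size are atomic
SQLITE_IOCAP_ATOMIC512 = 0x0002   # aligned 512-byte writes are atomic
SQLITE_IOCAP_ATOMIC4K = 0x0010    # aligned 4096-byte writes are atomic
SQLITE_IOCAP_ATOMIC64K = 0x0100   # aligned 65536-byte writes are atomic

def atomic_write_sizes(dev_char: int) -> tuple[bool, list[int]]:
    """Hypothetical helper: return (all_writes_atomic, atomic_block_sizes)
    for a device-characteristics bitmask."""
    all_atomic = bool(dev_char & SQLITE_IOCAP_ATOMIC)
    # Bits 1..8 correspond to ATOMIC512 .. ATOMIC64K, i.e. 512 << (bit - 1).
    sizes = [256 << bit for bit in range(1, 9) if dev_char & (1 << bit)]
    return all_atomic, sizes

print(atomic_write_sizes(0))                      # (False, [])
print(atomic_write_sizes(SQLITE_IOCAP_ATOMIC4K))  # (False, [4096])
```

A mask of 0, meaning no atomicity guarantee is claimed, is the case described above where the OS does not bother to indicate anything.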