SQLite Forum

How much would you trust a page-level checksum ?
For Simon's original question:

"what's the risk that a change in a page's contents would leave the checksum unchanged ?"

I agree that the birthday paradox does not apply when you are considering only two pages (before and after). Accidental collisions would be exceedingly rare for any reasonable definition of a good 64-bit checksum. As others have pointed out, differing checksums reliably tell you that the pages are different, and if "exceedingly rare" isn't good enough for you when the checksums match, do the full byte-by-byte comparison to determine whether the pages are indeed the same.
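A minimal sketch of that "checksum first, full comparison only on a match" idea. The names checksum64 and pages_equal are made up for illustration, and BLAKE2b truncated to 8 bytes is only a stand-in for whatever 64-bit checksum is actually in use:

```python
import hashlib


def checksum64(page: bytes) -> bytes:
    # Assumption: BLAKE2b truncated to 8 bytes stands in for "a good 64-bit checksum".
    return hashlib.blake2b(page, digest_size=8).digest()


def pages_equal(a: bytes, b: bytes) -> bool:
    # Different checksums: the contents are definitely different.
    if checksum64(a) != checksum64(b):
        return False
    # Same checksum: almost certainly equal, but confirm with a full comparison.
    return a == b
```

In practice the checksums would be stored rather than recomputed on every call; the point is only that a mismatch is conclusive while a match is not.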

However, the subject had drifted a bit. Against malicious attackers, I'd worry that a 64-bit checksum isn't enough; I'd worry about any 64-bit "digital signature." The birthday paradox isn't directly applicable there, but it, along with knowledge of the checksum internals, is a tool for the attacker, and the smaller N gets, the more useful such tools become.

The birthday paradox really kicks in when you start comparing the checksum of every page in the database (or across databases), perhaps to reduce duplication. There the numbers start to get large enough to "reasonably" expect 64-bit checksums to match when the contents don't, even in the absence of bugs and malicious actors.
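To put rough numbers on that, here is a small sketch using the standard birthday approximation, p ≈ 1 - exp(-n(n-1)/2 / 2^bits). It assumes an ideal, uniformly distributed checksum and n pages with distinct contents; collision_probability is a name made up for this example:

```python
import math


def collision_probability(n_pages: int, bits: int = 64) -> float:
    """Approximate chance that at least two of n_pages distinct pages
    share a checksum, assuming an ideal uniform checksum of `bits` bits."""
    pairs = n_pages * (n_pages - 1) / 2
    # -expm1(-x) is 1 - exp(-x), computed accurately even for tiny x.
    return -math.expm1(-pairs / 2.0**bits)


for n in (10**6, 10**9, 2**32):
    print(f"{n:>12} pages: {collision_probability(n):.3g}")
```

Under those assumptions the chance is negligible for a million pages, roughly 2.7% for a billion pages, and about 39% at 2^32 pages, which is where all-pairs comparison starts to make accidental matches a realistic expectation.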

I probably extrapolated Simon's original post a bit too far. My apologies.