SQLite Forum

null character sorts greater than 0x7f and less than 0x80
I believe Tim was responding to your assertion, "The sequence C0 80 is perfectly valid UTF-8 encoding."

It does follow from a mechanistic application of the encoding rules for multi-byte sequences. But using 0xC0 0x80 to encode 0x00 violates the standard cited in the Wikipedia article, and it defeats the objective of keeping a one-to-one mapping between code points and UTF-8 encodings. The standard clearly calls for avoiding "overlong" encodings, in service of that unique-mapping objective, and 0xC0 0x80 clearly falls in that category.
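To make the distinction concrete, here is a minimal Python sketch. The helper names (naive_decode_2byte, is_overlong_2byte, strict_decodes) are my own, invented for illustration; they are not part of SQLite or of any library. The first helper applies the two-byte bit pattern mechanically with no overlong check, and the last one simply asks Python's built-in strict UTF-8 decoder whether it accepts the bytes.

```python
def naive_decode_2byte(b: bytes) -> int:
    """Mechanically apply the pattern 110xxxxx 10xxxxxx, with no overlong check."""
    if len(b) != 2:
        raise ValueError("expected exactly two bytes")
    lead, cont = b
    if (lead & 0xE0) != 0xC0 or (cont & 0xC0) != 0x80:
        raise ValueError("not a 2-byte UTF-8 bit pattern")
    # Five payload bits from the lead byte, six from the continuation byte.
    return ((lead & 0x1F) << 6) | (cont & 0x3F)


def is_overlong_2byte(b: bytes) -> bool:
    """A 2-byte form is overlong if its code point would fit in one byte (< 0x80)."""
    return naive_decode_2byte(b) < 0x80


def strict_decodes(b: bytes) -> bool:
    """Report whether Python's strict UTF-8 decoder accepts the bytes."""
    try:
        b.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    seq = bytes([0xC0, 0x80])
    print(naive_decode_2byte(seq))   # mechanical result of the bit pattern
    print(is_overlong_2byte(seq))    # whether a shorter encoding exists
    print(strict_decodes(seq))       # whether a conforming decoder accepts it
    print("\x00".encode("utf-8"))    # the encoding a conforming encoder emits
```

The mechanical decoder maps 0xC0 0x80 to code point 0, which is the sense in which the sequence "follows the rules". A strict decoder rejects it all the same, because U+0000 already has the shorter one-byte encoding 0x00.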

This point is unrelated to how people might feel about that standard.