SQLite Forum

TEXT values can contain arbitrary binary data
Login
> If you do not, SQLite3 believes and relies on the fact that you are telling the truth

While I understand that this is a stance a project might take, part of the point of my original post is that I think this is not properly documented, if it is the stance SQLite desires. (Vs. considering storing non-text in `text` columns a bug.) I think where the docs state,

> TEXT. The value is a text string, stored using the database encoding (UTF-8, UTF-16BE or UTF-16LE).

This is empirically not true: it simply isn't possible to store 0xff in UTF-8. If SQLite is literally storing 0xff <the rest of my query's string, in UTF-8>, I think that's okay, just that the docs should be more forthcoming that SQLite doesn't validate the data stored in it.

I also dug into, a bit, the docs around the actual storage format; they state,

> Value is a string in the text encoding and (N-13)/2 bytes in length. The nul terminator is not stored. 

"the text encoding" is the `PRAGMA encoding;` of the SQLite file, in my case, UTF-8. But, again, empirically, we seem to have managed to store something that isn't UTF-8 there.

> Similarly, if the API used to "retrieve" the value from SQLite3 specified a different encoding that the database encoding, then SQLite3 will translate the "value" for presentement to the user.

I want to make sure something else is clear, around this point. I submit queries to SQLite, which are text strings in some encoding. In the example I presented, the query is entirely ASCII, and thus also valid UTF-8. It ends up computing and storing a value that isn't UTF-8.

(The other stance being that this behavior is a bug. I personally think that's the far more useful stance, to a user of the library, as it catches errors and prevents further nonsense from occurring down the road. But I acknowledge that SQLite might disagree with that; but then I think the invariants, or lack thereof, should be clearly documented.)