Results of queries containing null bytes may be incorrect
(1.1) By Saki Takamachi (sakiot) on 2024-04-15 14:02:24 edited from 1.0 [source]
Hi, I came from php-src.
I posted the code to gist, so please take a look.
https://gist.github.com/SakiTakamachi/62368c37a355e58a19efb8d4aa802ab5
If null bytes are present, the resulting string does not match the number of characters obtained by LENGTH.
Also, sqlite3_snprintf doesn't seem to support null bytes, are there any plans to support it in the future?
There have been reports of problems with pdo_sqlite and null bytes, and I first reported them to upstream in order to consider how to deal with them on the php side.
https://github.com/php/php-src/issues/13952
Regards,
Saki
(2) By Gunter Hick (gunter_hick) on 2024-04-15 14:05:20 in reply to 1.1 [link] [source]
Emebdded NUL bytes are not supported for TEXT fields. Use BLOB type for things with embedded NUL characters.
(5) By Saki Takamachi (sakiot) on 2024-04-17 14:12:23 in reply to 2 [link] [source]
As you said, it worked fine with blob. thank you.
(3) By Simon Slavin (slavin) on 2024-04-16 12:17:39 in reply to 1.1 [link] [source]
SQLite is written in C. In C, a null byte is the terminator for a string. It says "That string you were reading stops here.".
It wouldn't be the right thing to do if SQLite was written now, but back then all the good stuff was written in C and every programmer would expect this behaviour. It cannot be changed without changes to the internals of SQLite, and some sort of PRAGMA that says whether you want old behaviour or new behaviour.
If anything, the behaviour of LENGTH with regard to strings containing a null character is at fault.
(6) By Saki Takamachi (sakiot) on 2024-04-17 14:14:35 in reply to 3 [link] [source]
Yes, that's right. However, as in the repro code I wrote, it is possible to retrieve the string itself, so I asked the question.
This doesn't mean that I want such a feature to be supported, I just wanted to hear SQLite's thoughts on this in order to decide on a php implementation policy :)
(9) By Gunter Hick (gunter_hick) on 2024-04-17 14:45:58 in reply to 6 [link] [source]
SQLite is very good at returning exactly the data you gave it, even if that data does not conform to what you promised it does, provided you do not attempt to interpret/convert it inside SQLite. It will happily return ISO encoded strings unchanged, unless you insist on converting between encodings. Likewise with embedded NUL bytes, unless you pass them to functions that expect text input, which will stop at the first NUL byte.
(4) By ddevienne on 2024-04-16 12:47:05 in reply to 1.1 [link] [source]
While SQLite is happy to store null bytes in text
-typed values,
i.e. you can read and write them fine, most functions, either
SQLite SQL functions or C function (like the printf()
in your code),
will stop at the first null byte. Therefore, as others have already pointed out,
it's typically not a good idea to have embedded nulls.
OTOH, octet_length()
is a recently addition that differs in that regard,
and looks at the record format's length (and text-encoding), so O(1), instead of scanning the text, O(N).
E:\oss\sqlite>.\sqlite3.exe
SQLite version 3.45.2 2024-03-12 11:06:23 (UTF-16 console I/O)
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite>
sqlite> select 'foo' || char(0) || 'bar';
foo
sqlite> select length('foo' || char(0) || 'bar');
3
sqlite> select octet_length('foo' || char(0) || 'bar');
7
sqlite> select printf('%s', 'foo' || char(0) || 'bar');
foo
sqlite> select hex('foo' || char(0) || 'bar');
666F6F00626172
(7) By Saki Takamachi (sakiot) on 2024-04-17 14:15:49 in reply to 4 [link] [source]
I learned about octet_length
for the first time. thank you.
(8) By Saki Takamachi (sakiot) on 2024-04-17 14:20:51 in reply to 1.1 [link] [source]
Personally, I don't really like using hacks on the php side to make it appear as if SQLite supports features that SQLite doesn't support.
Thank you for all the feedback.