CLI does not escape control characters in its output.
(1.1) Originally by siddharth with edits by Richard Hipp (drh) on 2025-02-24 11:05:35 from 1.0 [link] [source]
(↑ Title edited by the moderator for accuracy)
Hi,
sqlite3
parses escape sequences does not sanitize its output by escaping or omitting control characters found in database content (← edit by moderator) which allows a malicious actor to perform anti-forensics.
To prove that sqlite3
parses escape sequences does not suppress control characters in output (← edit by moderator), we can create a simple database that has a table named test
with a column named text
. We can then use the command below to insert the string containing the the escape sequence into text
. The reason why the string is inserted through a shell command is because the the literal escape character (\e
) needs to to be interpreted into the actual escape character (ASCII 27). The escape character can be copied and pasted, but using printf
is a little easier.
sqlite3 demo.db "insert into test values ('$(printf "\e[32mThis text is green.\e[0m")');"
As seen in the screenshot below, when we query the contents of the table, the text that was inserted is returned in the color green, indicating that sqlite3
parses escape sequences.
This vulnerability can be exploited for anti-forensics. Lets say a malicious actor compromises a site and creates a new administrative user named hacker
. They can hide this user from a forensics analyst or the server admin by creating another user who's username is the escape sequence for moving the cursor up twice (\e[2A
). This leads to the username hacker
being overwritten with the the next username in the table, when all usernames are listed. The proof of concept below demonstrates this.
The following code was used to create the table in the proof of concept:
sqlite3 database.db "insert into user values ('alice');"
sqlite3 database.db "insert into user values ('jonathan');"
sqlite3 database.db "insert into user values ('hacker');"
sqlite3 database.db "insert into user values ('$(printf "\e[2A")');"
sqlite3 database.db "insert into user values ('charlotte');"
sqlite3 database.db "insert into user values ('bob');"
This vulnerability can be fixed by simply replacing 0x1b
with \e
.
(2) By Stephan Beal (stephan) on 2025-02-22 17:00:08 in reply to 1.0 [link] [source]
"insert into test values ('$(printf "\e[32mThis text is green.\e[0m")');"
SQLite does not parse those. It emits them as-is to your terminal and your terminal is processing them. Try it on a terminal which does not support ANSI escape sequences, or simply redirect the output to a file, instead of a terminal, to confirm that.
(3.1) By Adrian Ho (lexfiend) on 2025-02-22 17:05:05 edited from 3.0 in reply to 1.0 [link] [source]
As seen in the screenshot below, when we query the contents of the table, the text that was inserted is returned in the color green, indicating that sqlite3 parses escape sequences.
Nope, it indicates that SQLite preserves string data as-is. It's your terminal that's colouring the SELECT output based on the embedded escape sequences, not SQLite (try using a web front-end or any of the SQLite GUIs to display the results of the same SELECT, and you'll see what I mean).
This vulnerability can be exploited for anti-forensics.
If "anti-forensics" are limited to manual SQL querying on the command line and visual scanning of results, sure. But such "anti-forensics" would be laughed out of business in short order.
This vulnerability can be fixed by simply replacing 0x1b with e.
SQLite stores whatever data your application gives it very very carefully, so that it can return the exact bytes you gave it when asked. Calling this a "vulnerability" makes about as much sense as calling a passenger jet a "weapon".
And guess what happens when you SELECT username FROM user ORDER BY username
, or perform any query that reorders the rows?
(4) By Spindrift (spindrift) on 2025-02-22 17:32:35 in reply to 1.0 [link] [source]
This vulnerability can be fixed by simply replacing 0x1b with e
You're suggesting that sqlite should mess with the data that you have asked it to store and not return the exact data that you presented?
That seems unwise.
(5) By siddharth on 2025-02-22 17:55:53 in reply to 4 [link] [source]
I understand replacing \0x1b
with \e
would be modification of the stored data. But if we take another example where we store \e[2J
in the username column, then the whole screen would be cleared upon returning the contents of the username column. Another example could be to store \e[8m
which makes preceding text invisible. This would lead to the user not being able to see whats in the table through the command line. They would then have to use a GUI tool to view and fix the table.
(6) By Stephan Beal (stephan) on 2025-02-22 18:01:38 in reply to 5 [link] [source]
then the whole screen would be cleared upon returning the contents of the username column.
That behavior is part of your terminal, not of the sqlite library or its CLI shell application. It is most definitely not a bug in SQLite, which makes every effort to give the user back precisely what they put into it. If it didn't, that would be a bug.
Working around this "problem" would require that the CLI shell app be able to parse ANSI escape codes in arbitrary user input. That Way Lies Madness.
(7) By Simon Slavin (slavin) on 2025-02-24 15:09:47 in reply to 5 [link] [source]
The sequences you're quoting like \e[8m
are ANSI escape sequences. You intentionally told SQLite to output an ANSI escape sequence, so that's what it did.
SQLite, which just stores data and retrieves it, doesn't do anything special because SQLite doesn't know what you're going to do with its data. Perhaps you want to count how many characters are in it. Perhaps its encrypted text and means nothing. Perhaps you want to send it to the Mars Explorer. Perhaps you are intentionally outputting an ANSI control sequence and want it to have an affect.
However, the terminal program you're using thinks it's special, analyses it, and does what it thinks you want done. If you want to disable that, tell the terminal program you're using to ignore ANSI control codes and/or ANSI colours. I doubt there's a terminal program which can't do that, but if there is, pick another.
To build similar functionality into the SQLite CLI would take a lot of work, since it does not currently understand ANSI control codes. And if this was done for the ANSI set it should also have to be done for Microsoft's Virtual Terminal sequences, ECMA-48, VT-52, C0 and C1 control codes and the whole godforsaken word salad that has grown up over the years.
(8) By Richard Hipp (drh) on 2025-02-24 15:45:17 in reply to 7 [link] [source]
To build similar functionality into the SQLite CLI would take a lot of work
Yes it is. But I've spent the weekend working on it. The current prototype is on the unistr branch. Please compile the tip of that branch and give it a try. Provide feedback.
With the proposed changes, the output of Siddharth's query like the following, by default:
␛[32mThis text is green.␛[0m
The U+001b character (ESC) gets mapped into U+241b for display. You can also add the "--escape ascii" command-line option to get a more traditional unix-like output:
^[[32mThis text is green.^[[0m
Or, you can use the "--escape off" command-line option so that no translations are made, and the text really will be green on a VT100-compatible device.
List Of Changes So Far
- Three control-character output modes in the CLI: symbol, ascii, off. Use the "--escape MODE" command-line option to select which on to use. "symbol" is the default.
- New SQL function: unistr(X) which works the same as in PostgreSQL (and SQL Server, and Oracle - MySQL lacks this function).
- New SQL function: unistr_quote(X) - works like the existing quote(X) with the addition that it escape control characters in a way that can be decoded by unistr().
- The alternate-form flag on "%#Q" and "%#q" conversions in format() and similar inserts "\u00XX" escapes for control characters.
- The --escape option is also available on the ".mode" command.
- The output of ".dump" now uses the new unistr() SQL function.
I'm still working through all the details
Questions:
- Is "unistr_quote(X)" the right name for that new function?
- Should the default escape mode by "ascii" instead of "symbol"?
- Are "symbol" and "ascii" the right names for the control-character escape modes?
- The CLI lets through TAB, NL, and CRNL without escaping. Should there be an alternative mode or modes that escapes those characters too?
(9) By ian wild (ianwild) on 2025-02-24 16:31:15 in reply to 8 [link] [source]
What (if anything) happens to the 8-bit control codes?
select char (0x9b) || '32mThis text is also green' || char (0x9b) || 'm';
(10) By Richard Hipp (drh) on 2025-02-24 16:46:13 in reply to 9 [link] [source]
It only translates U+0001 through U+001f. All other characters are passed-through as-is.
Depending on context, U+0009 (TAB), U+000a (NL), and U+000d,U+000A (CRNL) might also get passed through as-is. Isolated CRs that are not immediately followed by NL are escaped.
(11) By anonymous on 2025-02-24 17:34:32 in reply to 10 [link] [source]
I think that DEL
(0x7f
) can also be included in the conversion. It is a non-printing character that can't normally be seen.
␡ = U+2421 = UTF-8: 0xE2 0x90 0xA1 = UTF-16: 0x2421 = C octal escaped UTF-8: \342\220\241 = XML decimal entity: ␡
(12.1) By ian wild (ianwild) on 2025-02-25 10:58:45 edited from 12.0 in reply to 8 [link] [source]
...give it a try. Provide feedback.
a) To my old eyes, the "symbol" format is far less legible than "ascii". I find "^[
" is much easier to read at a glance than (what looks like) a faded backslash character.
b) By default, .dump
could now create files that older versions can't read, because they lack the unistr()
function. This could be fixed by judicious use of "escape off", but no existing script knows how to do that. Maybe the default should be "off" if stdout is not a tty? (Or maybe that opens a different can of worms.)
c) To fully address the issue that started this thread, something should really be done about the 8-bit CSI and friends. (Though, to be honest, I don't see the problem. sqlite
is no worse than awk
or cat
in this regard, and I would probably be tempted to add ".mode -escape off
" to my .sqliterc
file.)
d) I like the unistr()
function: it's much nicer than the ad hoc replace(...,'\n',char(10))
type of construct I occasionally need.
(Edit:)
e) Having both .mode ascii
and .mode -escape ascii
is likely to lead to some very confused bug reports.
(13) By Richard Hipp (drh) on 2025-02-25 19:03:01 in reply to 1.1 [link] [source]
The concern has now been addressed on trunk. A more detailed explanation of the changes can be seen at https://sqlite.org/draft/cli.html#ctrlchr.
(14) By Bo Lindbergh (_blgl_) on 2025-02-25 19:57:27 in reply to 13 [link] [source]
The symbol
mode assumes a UTF-8 terminal. If that isn't so, you get nonsense like â›
instead. The default ought to be ascii
.
(15) By mgr (mgrmgr) on 2025-03-02 19:53:34 in reply to 13 [source]
The unistr_quote(...)
and printf('%#Q',...)
functions return garbage for some characters like e.g. 'ä' (unicode 00E4):
sqlite> select unistr_quote(unistr('\u00E4')), unistr('\u00E4');
unistr('\u0003\u0004')|ä
same for ü, Ä, é, ç, ñ, â ... anything beyond pure ASCII ?
Therefore, .dump
basically destroys any text with accented letters, this is pretty bad.
On latest version from trunk as of 'now' (46479c2e30b9676e0fa8da117ba67f673671fb340c9bea38ece19a1b2371a57b)
(16) By Richard Hipp (drh) on 2025-03-02 20:30:39 in reply to 15 [link] [source]
Thanks for the report. Please try again with check-in b7c5ce84216cc7f5 or later.
(18) By mgr (mgrmgr) on 2025-03-02 21:38:51 in reply to 16 [link] [source]
looks good now, thank you.
Quick check for all possible unicode codepoints:
sqlite> select value, unistr_quote(char(value)) from generate_series(1,0x10fffd) where quote(char(value))<>unistr_quote(char(value));
1|unistr('\u0001')
2|unistr('\u0002')
3|unistr('\u0003')
4|unistr('\u0004')
5|unistr('\u0005')
6|unistr('\u0006')
7|unistr('\u0007')
8|unistr('\u0008')
9|unistr('\u0009')
10|unistr('\u000a')
11|unistr('\u000b')
12|unistr('\u000c')
13|unistr('\u000d')
14|unistr('\u000e')
15|unistr('\u000f')
16|unistr('\u0010')
17|unistr('\u0011')
18|unistr('\u0012')
19|unistr('\u0013')
20|unistr('\u0014')
21|unistr('\u0015')
22|unistr('\u0016')
23|unistr('\u0017')
24|unistr('\u0018')
25|unistr('\u0019')
26|unistr('\u001a')
27|unistr('\u001b')
28|unistr('\u001c')
29|unistr('\u001d')
30|unistr('\u001e')
31|unistr('\u001f')
sqlite>
(17.1) By Aask (AAsk1902) on 2025-03-02 20:49:48 edited from 17.0 in reply to 15 [link] [source]
UPPER & LOWER fail (are ignored) for accented characters, e.g.
select upper('éæ');--Unicode 233 & 230, respectively. Expected ÉÆ i.e. Unicode 201 198
upper('éæ')
-----------
éæ
(19) By mgr (mgrmgr) on 2025-03-02 21:42:42 in reply to 17.1 [link] [source]
this is documented behaviour:
The lower(X) function returns a copy of string X with all ASCII characters converted to lower case. The default built-in lower() function works for ASCII characters only. To do case conversions on non-ASCII characters, load the ICU extension.
(20) By Spindrift (spindrift) on 2025-03-02 21:45:49 in reply to 17.1 [link] [source]
This is a separate and off topic issue for this thread, but unless you have loaded the ICU extension, upper()
(and indeed lower()
) only work on ASCII
(21) By mgr (mgrmgr) on 2025-03-02 22:04:24 in reply to 13 [link] [source]
is this .dump
behaviour expected?
sqlite> create table a(a);
sqlite> insert into a values(char(13));
sqlite> .once default.sql
sqlite> .dump
sqlite> .mode --escape ascii
sqlite> .once ascii.sql
sqlite> .dump
sqlite> .mode --escape symbol
sqlite> .once symbol.sql
sqlite> .dump
sqlite> .mode --escape off
sqlite> .once off.sql
sqlite> .dump
leading to
INSERT INTO a VALUES(unistr('\u000d'));
for default.sql
, ascii.sql
and symbol.sql
but to
INSERT INTO a VALUES('^M');
for off.sql
?
That ^M in off.sql
seems strange.
(22) By Richard Hipp (drh) on 2025-03-02 22:18:41 in reply to 21 [link] [source]
What command are you using to view the "off.sql" file? Because I'm not getting a literal "^M" but rather a single 0x0d byte at that point, which is exactly what you would expect to get with "--escape off".
When I "cat" the off.sql" file from Linux I see:
BEGIN TRANSACTION; CREATE TABLE t1(x TEXT); ');ERT INTO t1 VALUES(' COMMIT;
The \r caused some overwriting. But when I use "cat -A" I see:
BEGIN TRANSACTION;$ CREATE TABLE t1(x TEXT);$ INSERT INTO t1 VALUES('^M');$ COMMIT;$
If you still believe that ".dump" is malfunctioning on your system, please provide additional details about what OS you are using and the exact sequence of commands you are running to generate the anomaly. Maybe also post a hex dump of the relevant portions of the off.sql file.
(26.1) By mgr (mgrmgr) on 2025-03-03 12:58:33 edited from 26.0 in reply to 22 [link] [source]
Oh, sorry for the noise. Off course you are right, hex dump confirms that there is a single byte 0x0d at that place.
I was using nano
for viewing, turns out it displays ^M for (isolated) 0x0d bytes.
(23) By siddharth on 2025-03-03 08:32:41 in reply to 13 [link] [source]
Thank you!
Would it be fine if I request a CVE for this vulnerability?
(24) By Richard Hipp (drh) on 2025-03-03 09:46:42 in reply to 23 [link] [source]
Would it be fine if I request a CVE for this vulnerability?
No. For context see https://sqlite.org/cves.html. To call this issue a "vulnerability" is really stretching the term. I can't stop you from reporting this as a CVE, but I will not encourage or approve of that action.
(27) By siddharth on 2025-03-03 13:03:16 in reply to 24 [link] [source]
I have to disagree with you about it being a stretch if we call this a vulnerability. According to MITRE, improper neutralization of escape, meta, or control sequences, is seen as a vulnerability. There is also a really good talk from DEF CON 31, where STÖK explains how issues like this can be a serious problem.
(29) By Richard Hipp (drh) on 2025-03-03 13:55:31 in reply to 27 [link] [source]
If you want to create a CVE, then you do whatever you want to do. Just don't expect my approval or cooperation.
But if you're going to report this as a vulnerability, then you should perhaps also write a CVE on the unix "cat" command and the Windows11 "type" command because they both handle ANSI escape codes exactly the same way as the "sqlite3" command did prior to recent changes.
(30) By Tim Streater (Clothears) on 2025-03-03 14:42:52 in reply to 27 [link] [source]
This is merely their opinion. Why should anyone pay any attention to it?
(25) By Stephan Beal (stephan) on 2025-03-03 09:47:52 in reply to 23 [link] [source]
Would it be fine if I request a CVE for this vulnerability?
Before considering doing so, please read this project's stance on CVEs.
Largely relevant here is the point that this "vulnerability" does not affect the library. It affects (for a given value of "affects") only the CLI shell. People tend to confuse that point when it comes to CVEs, causing much more widespread concern than is generally warranted.
(28) By siddharth on 2025-03-03 13:04:49 in reply to 25 [link] [source]
People tend to confuse that point when it comes to CVEs, causing much more widespread concern than is generally warranted.
That makes sense. But would highlighting that this is an issue with the CLI tool make it fine to request for a CVE?
PS: I'm unsure how to cc you in a reply to another comment, so I'll link it here: https://sqlite.org/forum/forumpost/b7306cdabf65c365