SQLite User Forum

Unicode Character in TEXT column
Login

Unicode Character in TEXT column

(1) By anonymous on 2022-03-30 12:56:57 [link] [source]

I am using the CLI; this query is either not inserting the second row of the first column or prompting me for the end of statement - typing ; does not resolve it.

DROP TABLE IF EXISTS tblCountry;

	CREATE TABLE tblCountry (
		Country TEXT,
		ISO2 TEXT,
		ISO3 TEXT,
		ISONum INT,
		PRIMARY KEY (
			Country,
			ISO2,
			ISO3,
			ISONum
			)
		);

INSERT INTO tblCountry (
	Country,
	ISO2,
	ISO3,
	ISONum
	)
VALUES ('Afghanistan','AF','AFG',4),('Åland Islands','AX','ALA',248);

The problem appears to be he Unicode character Å. The SQL works when I replace Å by A.

How do I insert unicode characters in a TEXT column (I'm using V3.38.2)?

(2) By Gunter Hick (gunter_hick) on 2022-03-30 13:21:14 in reply to 1 [link] [source]

Are you sure you are running on a console with UTF-8 character set? Maybe you are running under windows in ISO-8859-1?

What does hex('Åland Islands') return?

SQLite version 3.24.0 2018-06-04 19:24:41
asql> create temp table country(name text,iso2 text, iso3 text, isoNum int);
asql> insert into country values ('Afghanistan','AF','AFG',4),('Åland Islands','AX','ALA',248);
rows inserted
-------------
2
asql> select * from  country;
name                 iso2  iso3  isoNum
-------------------  ----  ----  ------
Afghanistan          AF    AFG   4
Åland Islands        AX    ALA   248

(4) By anonymous on 2022-03-30 13:37:48 in reply to 2 [link] [source]

I am using Windows & simply starting SQLite3.EXE without any start-up parameters.

SELECT hex('Åland Islands');

prompts me for the end of statement character.

In 22. Command-line Options I do not see an option for starting SQLite3.EXE with UTF-8 character set: presumably, there is a way. What is it?

(9) By Larry Brasfield (larrybr) on 2022-03-30 14:19:14 in reply to 4 [link] [source]

The CLI has no such option. It reads stdin to get interactive input, and writes either UTF-8 (or, on Windows, UTF-16 sometimesa) to stdout. It deals with UTF-8 internally, without exception. (The DB it wraps is another matter.)

Your problem, and the challenge to getting past it, is how to type or paste UTF-8 as input, apparently. The CLI does nothing for input except read it in and treat it as UTF-8 text. This is done exactly the same way for stdin as it is done when .read FILE is used. And, as we have seen, that works without character set issues.

[a. On Windows, when stdout is seen to be the console, the UTF-8 to be output is translated to what the Windows Console API demands. This is not germane to your problem, which is input going awry rather than output. ]

(10) By Larry Brasfield (larrybr) on 2022-03-30 14:28:40 in reply to 4 [link] [source]

A work-around for your strange platform is:
Run the SQLite CLI shell as an inferior shell process in Emacs running in GUI mode. You will want to pass the -interactive flag to sqlite3.exe if you like to see prompts in interactive use.

(3) By Larry Brasfield (larrybr) on 2022-03-30 13:35:09 in reply to 1 [link] [source]

Gunter has his finger on the issue.

If I create a proper, UTF-8 file with the content you posted, in one editor I use (mg, a micro-Emacs descendent), the last line renders as: VALUES ('Afghanistan','AF','AFG',4),('\303\202land Islands','AX','ALA',248); .
Then, after I .read it into the shell and query, I get: select * from tblCountry; Afghanistan|AF|AFG|4 Âland Islands|AX|ALA|248 .

There are highly platform/tool dependent reasons that your cap A with circumflex never made it to the either the SQLite CLI or the library it exposes. I think that to diagnose such issues, we would need to know: Your platform; what terminal (or console, etc.) you are using the CLI with; how you entered that text that is not reaching the CLI intact.

(5) By anonymous on 2022-03-30 13:49:47 in reply to 3 [link] [source]

When I save my script from NOTEPAD using Encoding UTF-8, the script works.

sqlite> .read e:/temp/sql.txt
Country        ISO2  ISO3  ISONum
-------------  ----  ----  ------
Afghanistan    AF    AFG   4
Åland Islands  AX    ALA   248

Is there a way to start SQLite3.EXE with UTF-8 encoding on Windows?

(6) By Simon Slavin (slavin) on 2022-03-30 14:01:54 in reply to 5 [link] [source]

One way to do it is to use the 'chcp' command:

https://docs.microsoft.com/en-us/windows-server/administration/windows-commands/chcp

This may or may not help with your problem. First I would issue that command without arguments to find out what the current setting is.

(7) By anonymous on 2022-03-30 14:10:01 in reply to 6 [source]

I get the following:

sqlite> .shell chcp
Active code page: 65001

chcp expect an integer:

sqlite> .shell chcp /?
Displays or sets the active code page number.

CHCP [nnn]

  nnn   Specifies a code page number.

Type CHCP without a parameter to display the active code page number.

What is nnn for UTF-8?

(8) By anonymous on 2022-03-30 14:13:00 in reply to 6 [link] [source]

A quick Google search, code page 65001 is UTF-8!

(I 'm back to not understanding my original query fails.)