SQLite User Forum

CSV files with multibyte character headers cannot be imported using the ".import" command (CLI).
Login

CSV files with multibyte character headers cannot be imported using the ".import" command (CLI).

(1) By anonymous on 2022-04-08 23:38:21 [link] [source]

Ver.3.38.2
CSV files with multibyte character headers cannot be imported.

■Data(UTF8)
あい,うえお
1,2
3,4

■Oparation
sqlite> .mode csv
sqlite> .import a.csv a
sqlite> .mode list
sqlite> select * from a;

■Result
?|う   <----- Header is corrupted
1|2
3|4

(2) By anonymous on 2022-04-09 14:43:54 in reply to 1 [link] [source]

It Looks to me as if this is not a header. Try with .header on

(3) By Mark Lawrence (mark) on 2022-04-09 15:55:03 in reply to 1 [link] [source]

Your "header" comment confuses things, but yes, it appears that the first row is not handled well by SQLite when treating it as a table definition.

$ cat a.csv
one,two
あい,うえお
1,2
3,4

$ cat b.csv
あい,うえお
1,2
3,4

This works fine:

$ sqlite3 :memory: '.import --csv a.csv a' '.dump'
CREATE TABLE IF NOT EXISTS "a"(
"one" TEXT, "two" TEXT);
INSERT INTO a VALUES('あい','うえお');
INSERT INTO a VALUES('1','2');
INSERT INTO a VALUES('3','4');

Importing the UTF8 line as a data row (not a header/column definition) also works:

$ sqlite3 :memory: 'create table b(three,four)' '.import --csv b.csv b' '.dump'
CREATE TABLE b(three,four);
INSERT INTO b VALUES('あい','うえお');
INSERT INTO b VALUES('1','2');
INSERT INTO b VALUES('3','4');

This does not:

$ sqlite3 :memory: '.import --csv b.csv c' '.dump'
CREATE TABLE IF NOT EXISTS "c"(
"�" TEXT, "う" TEXT);
INSERT INTO c VALUES('1','2');
INSERT INTO c VALUES('3','4');

(4.1) By Larry Brasfield (larrybr) on 2022-04-09 18:58:14 edited from 4.0 in reply to 1 [source]

Thanks for reporting this (and for the simple repro.) Fixed here.

I'm taking it on faith that taking your formerly beaten up header line as a test case has not put Asian profanity into the test suite.

(5) By anonymous on 2022-04-10 01:06:11 in reply to 4.1 [link] [source]

If Google Translate is correct:

あい,うえお

translates from (Japanese) to

AIUEO

... no profanity just greedy use of vowels

(6.1) By midijohnny on 2022-04-10 12:23:52 edited from 6.0 in reply to 5 [link] [source]

https://en.wikipedia.org/wiki/Hiragana Just the first line from the standard Japanese Hiragana chart. (The vowels, as pointed out by anonymous poster).

(7) By ddevienne on 2022-04-11 10:09:59 in reply to 4.1 [link] [source]

Nice. And interesting uses of nested printf(), alternate-forms as well as optional-precision for %s and %c substitutions.

printf('\"%w\"',printf('%!.*s%s', nlen-chop,name,suff))

Note though that I wish we had the same %q and %Q alternatives but for %w.
Would allow to write a cleaner printf('%W' instead of printf('\"%w\"'.

In fact, what I really would like is a foolproof way to output a name/identifier,
but in a more user-friendly way, adding the double-quotes ONLY IF NECESSARY.

(perhaps using the %N or %I notation, for Identifier or Nname,
although with lower-case forms that mean difference things, maybe not...)

Which means knowing about the quoting rules (why in Draft form only?) and keywords,
something complex to do outside the SQLite code, thus best built-in.

Having a %W that systematically adds the surrounding double-quotes would already be nice,
and symmetrical to %q and %Q, so logical to add IMHO.

And since I'm on the subject of printf(), I think %Q in addition to supporting the text and null types,
should also support the blob type, and add the leading x implicitly. This is a common pitfall, not knowing
how to output blobs as blob literals, it's by far my most popular question on StackOverflow in fact, from 13 years ago!
(could also be a separate and new %B for blobs only, but %Q is also consistent with the existing quote() function)

C:\Users\ddevienne>sqlite3
SQLite version 3.36.0 2021-06-18 18:36:39
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> create table table (id);
Error: near "table": syntax error
sqlite> create table "table" (id);
sqlite>

(8) By Stephan Beal (stephan) on 2022-04-11 13:42:04 in reply to 7 [link] [source]

Would allow to write a cleaner printf('%W' instead of printf('\"%w\"'

FWIW, in my fork of what used to (long ago) be the sqlite3 print code, i changed %Q to accept the ! flag so that %!Q tells it to quote with double-quotes instead of single-quotes. That would be a trivial, backwards-compatible change to sqlite, AFAIK.

(9) By ddevienne on 2022-04-11 15:07:12 in reply to 8 [link] [source]

Thanks! The blitz skin does it, and doesn't hurt my eyes like others I've checked.
Wasn't aware of that link, honestly a bit buried with the several dozens other links.
Might be useful to make it more prominently accessing from the main forum page perhaps.

Aaaah! Blitz makes links almost the same as surrounding text, so basically invisible to me...
One must happen to hover over them to see the underline. Too bad, it was exactly what I wanted on code-fragments...

(10) By anonymous on 2022-05-01 15:15:36 in reply to 1 [link] [source]

"the first input row is used to set the name of all the columns in the table"

In Ver.3.37.2, work well.

In Ver.3.38.3, not work well.

(11) By Larry Brasfield (larrybr) on 2022-05-01 20:27:32 in reply to 10 [link] [source]

"the first input row is used to set the name of all the columns in the table"

In Ver.3.37.2, work well.

In Ver.3.38.3, not work well.

The 3.38.3 patch release does not include the fix mentioned above, so this is unsurprising.

If your comment is about some other problem than the one reported by the OP, it is too vague to be helpful with respect to further improvement.