Importing csv Files
(1) By Greg (gjroos) on 2022-06-21 15:13:43 [source]
If I import a csv file using DB Browser for SQLite, any unicode character displays as �, which I can handle in my program. If I import the same csv file using sqlite3.exe, those characters are displayed as BLOB (binary), which my program cannot handle. My commands for sqlite3:
.mode csv
.import "C:\Testing\Markers.csv" Markers
.save "C:\Testing\Historical Markers.db"
.quit
I want to create a DB using VBScript. Is there anything I can add that will make sqlite3 give the same result as DB Browser for SQLite?
(2) By Larry Brasfield (larrybr) on 2022-06-22 15:11:16 in reply to 1 [link] [source]
You seem to assert that the sqlite3.exe shell will display unicode characters as BLOB when somehow imported. If that actually happens as your post vaguely suggests, it would be a bug. Can you please provide verbatim input to the sqlite3.exe shell which reproduces this behavior? That will make clear what you're talking about here, and should allow either: a diagnosis of what you are doing that can be done better in another way; or the misbehavior to be fixed.
Is there anything that I can add that will make sqlite3 give the same result as DB Browser for SQLite?
I cannot say until "the same result" is more specifically described. Further, I would not go off experimenting with "DB Browser for SQLite" on such a vague quest.
(3) By Greg (gjroos) on 2022-06-22 18:18:12 in reply to 2 [link] [source]
Larry, I am sorry that my explanation of my issue was not clear enough for you, so let me try again. I have a csv file that has Unicode data in it. After starting "DB Browser for SQLite" on my PC I select "File > Import > Table from CSV file...". Then I select the csv file that contains the Unicode characters from my PC file system. I get a dialog with the following fields: Table Name filled in with the table name taken from the csv file, Column names in first line - checked, Field separator - comma, Quote character - double quotes, Encoding - UTF-8, and Trim fields - checked. I select OK. The table is created and populated. When browsing the table's data, the Unicode characters are shown as a black diamond with a white question mark in it. My Android app changes any of these to a single quote. I then delete the database.

Then I open a command prompt on my PC. I go to the directory "C:\Program Files\DB Browser for SQLite\sqlite-tools-win32-x86-3380500" that has the sqlite3 executable and start it. The following commands are issued:

.mode csv
.import "C:\\Historical Marker\\Input\Markers.csv" Markers
.save "C:\\Historical Marker\\Historical Markers.db"
.quit

When I browse the table's data in "DB Browser for SQLite", the same Unicode fields show as a BLOB. The binary data for one is this:

0000  4c 69 6e 63 6f 6c 6e 92 73 20 46 61 72 65 77 65  Lincoln.s Farewe
0010  6c 6c 20 74 6f 20 53 70 72 69 6e 67 66 69 65 6c  ll to Springfiel
0020  64                                               d

So either I need to add something to the sqlite3 program or there is a bug.

BTW, I copied the new database file to my Android program and found out that my app can handle (read for text to speech) the Unicode data just fine, so I apparently don't have an issue anymore. But it would be nice to have the sqlite3 program behave the same as "DB Browser for SQLite".
(4) By Donald Griggs (dfgriggs) on 2022-06-22 19:38:58 in reply to 3 [link] [source]
Hi Greg,
I (at least) understand so much better with your rewording of your question.
Your question might better have been posed to a DB Browser forum, because it seems that it is DB Browser that has code in its import function that treats such characters differently on import than when it encounters them in an existing database.
SQLite is performing exactly as designed -- it's importing exactly what you provided it. It would upset a lot of folks if it behaved any other way.
If your issue hadn't become moot, you could have modified your script to perform changes you might like. For example, just after the ".import" (and assuming your text field was named "title") you could have:
UPDATE Markers SET title = replace(title, x'92', '''' );
If you had gobs of text fields, or if they might vary, you could even use readfile() and writefile() to read the entire csv file into a blob field of a temporary table, perform the UPDATE command, and write it back out before your .import command.
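The fix-up idea can be sketched with Python's sqlite3 module (a minimal illustration only, borrowing the Markers/title names from the thread; here the stray Windows-1252 byte shows up as code point U+0092 in a text value rather than as a raw blob byte):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Markers(title TEXT)")
# sample row containing the stray character (146 == 0x92)
con.execute("INSERT INTO Markers VALUES (?)",
            ("Lincoln\x92s Farewell to Springfield",))
# swap the stray character for a plain ASCII apostrophe, as in the
# UPDATE statement above
con.execute("UPDATE Markers SET title = replace(title, char(146), '''')")
title, = con.execute("SELECT title FROM Markers").fetchone()
assert title == "Lincoln's Farewell to Springfield"
```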
(5) By Kees Nuyt (knu) on 2022-06-22 21:41:15 in reply to 3 [link] [source]
0x92 is not UTF-8 for any kind of quote. It appears to be some kind of PostScript encoding.
I did not delve into the details, but this article on ASCII and Unicode quotation marks may give more insight.
--
Regards,
Kees Nuyt
(6) By Tim Streater (Clothears) on 2022-06-23 09:28:36 in reply to 5 [link] [source]
0x92 as a quote character is Windows-1252.
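Just to illustrate the encoding point in Python: the byte decodes cleanly as Windows-1252 but is not valid on its own as UTF-8, which is why it survives import as opaque binary.

```python
# In Windows-1252 (cp1252), byte 0x92 is the right single quotation
# mark, U+2019
assert b"\x92".decode("cp1252") == "\u2019"

# The same byte alone is not a valid UTF-8 sequence
try:
    b"\x92".decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
assert not valid_utf8
```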
(7) By anonymous on 2023-03-24 05:13:26 in reply to 1 [link] [source]
Not sure if I saw this answered elsewhere. When importing a csv under Linux, the data looks like this:

"data1","data2","data3"

Using the import commands

.mode ascii
.separator ";" "\n"

it comes out like this:

data1","data2","data3

where the first quote of the line and the last quote of the line are removed, but the rest remain. Is it the separator line that needs to be fixed?
(8) By Larry Brasfield (larrybr) on 2023-03-24 05:32:42 in reply to 7 [link] [source]
Is it the separator line that needs to be fixed?
Here is what the CLI says when asked for help on .import:
sqlite> .help import
.import FILE TABLE       Import data from FILE into TABLE
   Options:
     --ascii               Use \037 and \036 as column and row separators
     --csv                 Use , and \n as column and row separators
     --skip N              Skip the first N rows of input
     --schema S            Target table to be S.TABLE
     -v                    "Verbose" - increase auxiliary output
   Notes:
     * If TABLE does not exist, it is created.  The first row of input
       determines the column names.
     * If neither --csv or --ascii are used, the input mode is derived
       from the ".mode" output mode.
     * If FILE begins with "|" then it is a command that generates the
       input text.
So, I would say it's the (not shown) import line that is fix-worthy. Note in particular the command's --csv option.
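The difference the --csv option makes can be sketched in Python (an illustration of the parsing behavior, not the CLI's actual implementation): a csv parser strips the enclosing quotes, while a plain split on the separator, which is roughly what ascii mode does, leaves them in the data.

```python
import csv
import io

line = '"data1","data2","data3"'

# csv parsing strips the enclosing quotes, like .import --csv
assert next(csv.reader(io.StringIO(line))) == ["data1", "data2", "data3"]

# a naive split on the separator keeps them in the field values
assert line.split(",") == ['"data1"', '"data2"', '"data3"']
```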
(9) By Harald Hanche-Olsen (hanche) on 2023-03-24 08:58:59 in reply to 7 [link] [source]
As Larry said, it's hard to say without seeing the .import line that you used. What I find really odd is that you specify the column separator as a semicolon, while your data is comma separated. What do you expect to happen? Another oddity: the ascii mode does not treat quote characters specially, yet in your example, something happens to quotes.
Perhaps you could post a complete transcript showing what you did, and the result? Like the below, where I tried (and failed, as I fully expected) to replicate your result.
⬥ cat foo
"data1","data2","data3"
⬥ sqlite3
SQLite version 3.41.0 2023-01-26 11:12:47
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> create table foo(x,y,z);
sqlite> .mode ascii
sqlite> .separator ";" "\n"
sqlite> .import foo foo
foo:1: expected 3 columns but found 1 - filling the rest with NULL
sqlite> .mode qbox
sqlite> select * from foo;
┌───────────────────────────┬──────┬──────┐
│             x             │  y   │  z   │
├───────────────────────────┼──────┼──────┤
│ '"data1","data2","data3"' │ NULL │ NULL │
└───────────────────────────┴──────┴──────┘
(Note that the qbox output mode added the extra single quotes, to more clearly see the exact value being shown.)
(10) By anonymous on 2023-03-24 11:43:02 in reply to 9 [link] [source]
I got the ";" thing off a stackoverflow post. The poster said the character didn't matter, but just to put it in there. I was testing with the csv sample file on this page: https://www.irs.gov/businesses/international-businesses/irs-ffi-list-schema-and-test-files

The file has quotes around the data for each field, which are not a needed part of the data. Here is the exact code I used under Ubuntu Linux running under Windows 11 WSL:

.mode ascii
.separator ";" "\n"
.import /tmp/FFIListExample.csv IRS_FFI_LIST

And of course, now I can't replicate it... When I did it again, all the quotes remained...
(11) By Harald Hanche-Olsen (hanche) on 2023-03-24 14:21:19 in reply to 10 [link] [source]
Whoever wrote that stackoverflow post was dead wrong (or possibly partially right in whatever special circumstance the post was about). It's dangerous to rely on stackoverflow answers without proper understanding.
The shell has a nice help feature:
sqlite> .help separator
.separator COL ?ROW? Change the column and row separators
The file you tried on is a classic csv file. The easy way to import it, is like this:
sqlite> .import --csv FFIListExample.csv example
sqlite> .mode qbox
sqlite> select * from example limit 4;
┌───────────────────────┬──────────────────────┬──────────────────┐
│         GIIN          │         FINm         │    CountryNm     │
├───────────────────────┼──────────────────────┼──────────────────┤
│ '98Q96B.00000.LE.250' │ 'Test Bank One, LLP' │ 'France'         │
│ '98Q96B.00000.BR.826' │ 'Branch'             │ 'United Kingdom' │
│ '98Q96B.00000.BR.036' │ 'Branch'             │ 'Australia'      │
│ '98Q96B.00000.BR.076' │ 'Branch'             │ 'Brazil'         │
└───────────────────────┴──────────────────────┴──────────────────┘
That is when you import into a new table. To import into an existing table with three columns, use

.import --csv --skip 1 …

instead. .help import is your friend.
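The --skip 1 idea can also be sketched in Python's csv module, just to show what skipping the header row means when the target table already exists (the data here is illustrative, echoing the IRS sample file):

```python
import csv
import io
import sqlite3

data = ('GIIN,FINm,CountryNm\n'
        '"98Q96B.00000.LE.250","Test Bank One, LLP","France"\n')

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE example(GIIN, FINm, CountryNm)")

# drop the header line, like --skip 1, then load the rest
rows = list(csv.reader(io.StringIO(data)))[1:]
con.executemany("INSERT INTO example VALUES (?,?,?)", rows)
assert con.execute("SELECT CountryNm FROM example").fetchone() == ("France",)
```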
(12) By anonymous on 2023-04-03 00:04:30 in reply to 11 [link] [source]
Is there a recommended way to do this import using the Perl DBI?
(13) By Keith Medcalf (kmedcalf) on 2023-04-03 03:34:57 in reply to 12 [link] [source]
This is not the Perl forum. Questions regarding Perl should be directed there.