Need double quotes in csv export
(1) By Dutch (Dutch123) on 2022-07-18 10:41:46 [link] [source]
Hello! I'm using sqlite3.exe to export a DB to a CSV file with these parameters: .headers on, .mode csv, .separator ";". In the result I see double quotes around some fields; other fields have no quotes. How can I get double quotes around all fields?
(2) By Gunter Hick (gunter_hick) on 2022-07-18 11:26:43 in reply to 1 [link] [source]
Can you show some actual input and output? Maybe the double quotes you are seeing are field contents?
(3) By Dutch (Dutch123) on 2022-07-18 12:17:06 in reply to 2 [link] [source]
Is a picture okay? https://dropmefiles.com/HtkS1
(4) By Richard Hipp (drh) on 2022-07-18 13:14:09 in reply to 3 [link] [source]
No. Please use your words. No pictures.
(5.2) Originally by Dutch (Dutch123) with edits by Richard Hipp (drh) on 2022-07-18 13:39:50 from 5.1 in reply to 4 [link] [source]
CREATE TABLE input (ls TEXT, street TEXT, dom TEXT, kvart TEXT, ipu_num TEXT, description TEXT, cons_old REAL, oneS_id TEXT, phone TEXT, type_supply INT, ipu_blocked INT, tariff INT, data_base TEXT, date_verifi TEXT, date_last_cons TEXT)
CREATE TABLE cons (ls TEXT, ipu_num TEXT, type_supply TEXT, cons_new TEXT, cons_new_date DATETIME)
CREATE TABLE type_supply (type_supply TEXT, description TEXT)
Then comes the data.
Then I run this script:
.headers on
.mode csv
.separator ";"
.once Exported.csv
SELECT cons.ls as "Лицевой счет", input.street as "Улица", input.dom as "№ дома",
       input.kvart as "№ квартиры", cons.ipu_num as "№ счетчика", input.description as "Описание",
       input.cons_old as "Старые показания", cons.cons_new as "Новые показания",
       MAX(cons.cons_new_date) as "Дата подачи", input.oneS_id as "Код 1с", input.phone as "Телефон",
       input.type_supply as "Тип снабжения", input.ipu_blocked as "Прибор заблокирован",
       input.tariff as "Тариф", input.data_base as "База данных", input.date_verifi as "Дата поверки",
       'Telegram bot' as "Источник", 'none' as "Email",
       MAX(cons.cons_new_date) as "Дата внесения последних показаний", 'Обычный' as "Суточная зона"
FROM cons INNER JOIN input
  ON cons.ls = input.ls
 AND cons.ipu_num = input.ipu_num
 AND cons.type_supply = input.type_supply
 AND cons_new_date >= datetime('2022-07-01 00:00:00')
GROUP BY cons.ls, cons.ipu_num, cons.type_supply
ORDER BY cons.cons_new_date;
Then got result file Exported.csv :
"Лицевой счет";"Улица";"№ дома";"№ квартиры";"№ счетчика";"Описание";"Старые показания";"Новые показания";"Дата подачи";"Код 1с";"Телефон";"Тип снабжения";"Прибор заблокирован";"Тариф";"База данных";"Дата поверки";"Источник";Email;"Дата внесения последних показаний";"Суточная зона"
6880340;"30 лет Победы ул.";11;34;0102823860;"ГВС (688_34)";27,00000;30,0;"2022-07-18 14:30:55";000096081;;1;0;1;VDN-RCVTS;2026-09-09;"Telegram bot";none;"2022-07-18 14:30:55";"Обычный"
6880340;"30 лет Победы ул.";11;34;27815266;"ВХВ общий";224.333;250,0;"2022-07-18 14:30:55";041621;;2;0;1;"РЦ ВТС";2022-10-03;"Telegram bot";none;"2022-07-18 14:30:55";"Обычный"
6880340;"30 лет Победы ул.";11;34;6880340;"ЭЛН";;1000,0;"2022-07-18 14:30:55";;;3;0;18;TNS-energo;;"Telegram bot";none;"2022-07-18 14:30:55";"Обычный"
4250050;"М.Горького ул.";194;5;0102563414/010256341;"ГВС (425_5)";45,00000;55,0;"2022-07-18 14:31:16";000091291;;1;0;1;VDN-RCVTS;2026-02-01;"Telegram bot";none;"2022-07-18 14:31:16";"Обычный"
4250050;"М.Горького ул.";194;5;4250050;"ЭЛН";;30,0;"2022-07-18 14:31:16";;;3;0;18;TNS-energo;;"Telegram bot";none;"2022-07-18 14:31:16";"Обычный"
4250050;"М.Горького ул.";194;5;"48565961 / 48343637";"ВХВ общий";15.0;20,0;"2022-07-18 14:31:16";064715;;2;0;1;"РЦ ВТС";2028-01-04;"Telegram bot";none;"2022-07-18 14:31:16";"Обычный"
3210060;"М.Горького ул.";169;6;0103215605/010321561;"ГВС (321_6)";31,18000;33,0;"2022-07-18 14:31:27";000103377;;1;0;1;VDN-RCVTS;2027-02-05;"Telegram bot";none;"2022-07-18 14:31:27";"Обычный"
Then I see this:
6880340 - without quotes; "30 лет Победы ул." - with double quotes.
I need double quotes around all fields, like this: "6880340"
(6) By Richard Hipp (drh) on 2022-07-18 13:59:27 in reply to 5.2 [link] [source]
The "csv" output mode in the SQLite CLI omits quotes when they are not required by RFC 4180. I don't think there is a way to turn that feature off.
(7.1) By Ryan Smith (cuz) on 2022-07-18 17:59:39 edited from 7.0 in reply to 1 [link] [source]
CSV is a standard with roughly the following basic rules:
- All fields are written one after another without any quotes.
- Fields are separated by comma characters.
- Rows (records) are separated by Carriage Return (\r) / Line Feed (\n) characters.
- Exception 1: When a field contains within the field-text a Field or Record separator, or a Double-Quote character, then it must be enclosed with double-quote characters.
- Exception 2: When a field is enclosed with double-quotes, any double-quote characters within the field must be escaped by doubling up the character.
This means:
ABC | DEF --> ABC,DEF
AB,C | DEF --> "AB,C",DEF
A",C | DEF --> "A"",C",DEF
A\nC | DEF --> "A
C",DEF
That is the standard. What you are asking for is no longer CSV, it is something else, and so SQLite's CSV translator cannot give what you ask.
(8) By Dutch (Dutch123) on 2022-07-19 06:12:16 in reply to 7.1 [link] [source]
This helped me - thanks a lot
(9) By Ryan Smith (cuz) on 2022-07-19 10:25:13 in reply to 8 [link] [source]
While on the subject, you seem to be making a CSV parser, so let me add a note:
Try to make the separators variables, preferably strings.
For the CSV standard it's technically suggested to be a comma (hence the name), but most CSV creators and CSV interpreters use the "Field Separator" and "Record Separator" values from the OS "Locale" settings (especially on Windows).
Mostly this field separator is simply "," (comma), but it can be ";" (semicolon) for some countries, like the Netherlands.
The record separator is mostly "\n" (<LF>, Char(10)), but can also be "\r" or the combination "\r\n" (<CR><LF>, Char(13)+Char(10)), depending on the OS and the whims of the public.
Also, if you change the field separator to "\t" (<TAB>, Char(9)), most spreadsheets (even via copy-paste) and TSV (tab-separated values) interpreters will understand your data, and you will understand theirs.
Good luck!
(10) By Harald Hanche-Olsen (hanche) on 2022-07-19 10:43:00 in reply to 9 [link] [source]
I believe the semicolon ";" is the usual field separator in most European countries, because the comma is commonly used for the decimal marker here. So you definitely want to support that.
The record separator is particularly thorny, because csv files can have their line endings changed as they move between computers running different operating systems. So the default record separator should perhaps be one optional CR followed by a single line feed, corresponding to the regular expression "\r?\n"
(Or even "\r*\n", as I have encountered text files with multiple CRs in front of each LF, presumably caused by software trying to be helpful in translating line endings!)
I am not suggesting that you should allow general regexps as record separators. But the CR?LF ending could be worth treating specially, because it is so common.
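A tolerant reader along those lines might treat any run of CRs followed by an LF as one record boundary; a small sketch (split_records is a hypothetical helper, not part of any tool in this thread):

```python
import re

def split_records(text):
    # Treat zero or more CRs followed by LF as the record separator,
    # matching the "\r*\n" suggestion above; drop the empty trailing
    # record produced by a final newline.
    records = re.split(r"\r*\n", text)
    if records and records[-1] == "":
        records.pop()
    return records

print(split_records("a;b\r\nc;d\r\r\ne;f\n"))  # ['a;b', 'c;d', 'e;f']
```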
(11) By Andreas Kupries (andreas-kupries) on 2022-07-19 11:01:00 in reply to 10 [link] [source]
And from the depths of history note that there are designated separator characters in ASCII, and thus Unicode:
man ascii:
Oct Dec Hex Char Oct Dec Hex Char
────────────────────────────────────────────────────────────────────────
034 28 1C FS (file separator) 134 92 5C \ '\\'
035 29 1D GS (group separator) 135 93 5D ]
036 30 1E RS (record separator) 136 94 5E ^
037 31 1F US (unit separator) 137 95 5F _
From largest (file) to smallest (unit = field) structure.
Quick googling for "ascii record separator character usage" yields some more information.
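For what it's worth, using these control characters sidesteps quoting entirely, since they never appear in ordinary field text; a small illustration:

```python
# US (unit separator) delimits fields, RS (record separator) delimits
# records, per the ASCII table above; no quoting or escaping is needed
# because these control characters never occur in normal text.
US, RS = "\x1f", "\x1e"

rows = [["6880340", "30 лет Победы ул."], ["4250050", "М.Горького ул."]]
blob = RS.join(US.join(fields) for fields in rows)
parsed = [record.split(US) for record in blob.split(RS)]
print(parsed == rows)  # True
```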
(12) By Dutch (Dutch123) on 2022-07-21 11:49:26 in reply to 6 [link] [source]
How should a table field or its contents be changed so that it is guaranteed to be enclosed in double quotes during export?
(13) By Ryan Smith (cuz) on 2022-07-21 12:10:43 in reply to 12 [link] [source]
You can add any of a comma, double-quote or line-feed character to the data fields, which would all cause CSV quoting to happen. Of course the destination system has to be made to ignore those.
Why would you want this? Which system(s) are you using that do(es) not understand real CSV format?
Even if you are coding that yourself, there are a hundred examples out there (for most languages) of the implemented CSV algorithm.
It's such a trivial, low-level part of computing that it is hard to imagine a system that doesn't know how to correctly cope with CSV.
It's like going to a shop and asking for an individual check-out receipt for every item you buy, one item at a time. I mean sure it can be done, but all check-out points on Earth are designed to let you buy all your shopping in one go, make one payment and get one receipt. For this reason, all accounting software is able to handle purchases in the same groups with single receipts and payments. It's unthinkable that there exists a system that cannot handle that and requires individual items per transaction - and in the same way it is unthinkable that there exists a CSV file/data consumer that requires all fields to be Quoted - it's completely outside the norm and also just not sensible.
(14) By Dutch (Dutch123) on 2022-07-21 13:03:33 in reply to 13 [link] [source]
The destination system belongs to a different organization. Each field of the file must be in double quotes - this is that organization's requirement, and there are several such organizations. What you said is logical and obvious, but to make it a reality I would need to convince each of the organizations to change its system, and I'm sure they don't want to do that. So I can't change the destination system. The easiest solution I see is to modify my file to match the requirements of the destination system. I'll have to write a parser that inserts double quotes where there are none. This is terrible.
(15) By Gunter Hick (gunter_hick) on 2022-07-21 14:01:07 in reply to 14 [source]
This is not "terrible". It is actually quite easy to do. See https://sqlite.org/c3ref/exec.html All you need to do is to write a callback function for sqlite3_exec() that delimits and concatenates the result rows into text lines for output.
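Gunter's pointer is to the C API, but the same idea - walk the result rows yourself and format each one - can be sketched with Python's stdlib sqlite3 module (export_all_quoted and the table t are illustrative names, not part of the thread's schema):

```python
import sqlite3

def export_all_quoted(conn, sql, sep=";"):
    # Quote every field unconditionally, doubling embedded quotes --
    # the "always quoted" variant the destination system demands.
    def q(v):
        s = "" if v is None else str(v)
        return '"' + s.replace('"', '""') + '"'
    cur = conn.execute(sql)
    lines = [sep.join(q(col[0]) for col in cur.description)]
    lines.extend(sep.join(q(v) for v in row) for row in cur)
    return "\n".join(lines)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ls TEXT, street TEXT)")
conn.execute("INSERT INTO t VALUES ('6880340', '30 лет Победы ул.')")
print(export_all_quoted(conn, "SELECT * FROM t"))
# "ls";"street"
# "6880340";"30 лет Победы ул."
```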
(16) By Ryan Smith (cuz) on 2022-07-21 14:37:38 in reply to 14 [link] [source]
Which appointment system? Does it have a name? Which organizations?
Do they advertise that their system reads CSV? If so, either they are lying or you are wrongly assuming they support CSV when in fact they do not.
If they do not support CSV, then you are wrongly expecting SQLite to output whatever data format they do support. It's not CSV, and SQLite can't be expected to turn CSV into some arbitrary system's non-CSV format.
If they do claim to support CSV, then their systems are broken and you have a very good reason to "convince each of the organizations" to fix their systems, and they have good reason to listen.
> This is terrible
Your requested output is only one of an infinite number of arbitrary formats and adjustments to formats that can exist and that SQLite, along with most other systems, does not support or claim to support. For any of those you will need to write a custom parser/formatter if you wish to produce their specific output. You can say this is terrible, but that's just what all programmers face when trying to push square data into round holes - welcome to the struggle.
If all you want to do is read CSV from file and output it as always-quoted-values to the same file (or another file), I already have such a tool which you can have for free along with the code for it.
(17) By Dutch (Dutch123) on 2022-07-21 16:48:31 in reply to 16 [link] [source]
What do I need to do to get this tool? That would be great
(18) By Michael A. Cleverly (cleverly) on 2022-07-21 19:19:41 in reply to 12 [link] [source]
If you want something guaranteed to be enclosed in double quotes you'd best enclose it in double quotes yourself.
.headers on
.mode column
.once Exported.csv
SELECT '"' || replace(ifnull(cons.ls, ''), '"', '""') || -- as "Лицевой счет",
'";"' || replace(ifnull(input.street, ''), '"', '""') || -- as "Улица",
'";"' || replace(ifnull(input.dom, ''), '"', '""') || -- as "№ дома",
'";"' || replace(ifnull(input.kvart, ''), '"', '""') || -- as "№ квартиры",
'";"' || replace(ifnull(cons.ipu_num, ''), '"', '""') || -- as "№ счетчика",
'";"' || replace(ifnull(input.description, ''), '"', '""') || -- as "Описание",
'";"' || replace(ifnull(input.cons_old, ''), '"', '""') || -- as "Старые показания",
'";"' || replace(ifnull(cons.cons_new, ''), '"', '""') || -- as "Новые показания",
'";"' || replace(ifnull(MAX(cons.cons_new_date), ''), '"', '""') || -- as "Дата подачи",
'";"' || replace(ifnull(input.oneS_id, ''), '"', '""') || -- as "Код 1с",
'";"' || replace(ifnull(input.phone, ''), '"', '""') || -- as "Телефон",
'";"' || replace(ifnull(input.type_supply, ''), '"', '""') || -- as "Тип снабжения",
'";"' || replace(ifnull(input.ipu_blocked, ''), '"', '""') || -- as "Прибор заблокирован",
'";"' || replace(ifnull(input.tariff, ''), '"', '""') || -- as "Тариф",
'";"' || replace(ifnull(input.data_base, ''), '"', '""') || -- as "База данных",
'";"' || replace(ifnull(input.date_verifi, ''), '"', '""') || -- as "Дата поверки",
'";"' || replace(ifnull('Telegram bot', ''), '"', '""') || -- as "Источник",
'";"' || replace(ifnull('none', ''), '"', '""') || -- as "Email",
'";"' || replace(ifnull(MAX(cons.cons_new_date), ''), '"', '""') || -- as "Дата внесения последних показаний",
'";"' || replace(ifnull('Обычный', ''), '"', '""') -- as "Суточная зона"
AS '"Лицевой счет";"Улица";"№ дома";"№ квартиры";"№ счетчика";"Описание";"Старые показания";"Новые показания";"Дата подачи";"Код 1с";"Телефон";"Тип снабжения";"Прибор заблокирован";"Тариф";"База данных";"Дата поверки";"Источник";"Email";"Дата внесения последних показаний";"Суточная зона"'
FROM cons INNER JOIN input ON
cons.ls = input.ls
AND cons.ipu_num= input.ipu_num
AND cons.type_supply = input.type_supply
AND cons_new_date >= datetime('2022-07-01 00:00:00')
GROUP BY cons.ls, cons.ipu_num, cons.type_supply
ORDER BY cons.cons_new_date;
Obviously you could remove the ifnull() for any column that is guaranteed to never contain NULLs. And you don't need the ifnull(), let alone the replace(), on columns where you are selecting a literal string (as you do with 'none' for "Email"). I just did a simple mechanical transformation of your query to show you one way of doing it.
The result is a single column per row that contains your semicolon delimited data all concatenated together using your rather quixotic double quoting requirements.
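The quote-and-escape idiom at the heart of that query can be checked in isolation; a minimal sketch using Python's stdlib sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The per-column expression from the query above: turn NULL into an
# empty string, double any embedded quotes, then wrap in quotes.
quoted = conn.execute(
    """SELECT '"' || replace(ifnull(?, ''), '"', '""') || '"'""",
    ('He said "hi"',),
).fetchone()[0]
print(quoted)  # "He said ""hi"""
```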
(19) By Ryan Smith (cuz) on 2022-07-21 22:22:30 in reply to 17 [link] [source]
Simply send an email to me on: ryansmithhe at gmail dot com
I will reply with the tool.
(20) By Dutch (Dutch123) on 2022-07-22 11:16:42 in reply to 18 [link] [source]
Wow, this really solved my problem! Thank you! I tried something similar, but it didn't work because I didn't know the syntax - I had tried something like: '"' + cons.ls + '"' as "Personal account",