Support for Markdown formatted table output from a query

(1) By anonymous on 2020-05-28 18:52:00 [link] [source]

Markdown extended specification outlines syntax for formatting a table using pipe-char (for columns) and dash (for header row). While there are some aditional features and other specifications related to Markdown tables, for the most part the pipe-and-dash formatting is common.

I'd like SQLite to be able to generate Markdown-ready output from SQLite queries. Such output could be readily usable in reports, even can be inserted into GitHub Issues (which does support Markdown tables).

Currently, it's possible in the SQLite client to set the .separator to pipe to get the column formatting, however there's currently no way to generate dash-formatting for the table header (.header on).

It'd be useful to have some Markdown setting/mode, that would trigger |---|---| kind of formatting when (.header on), maybe (.header md ??).

Here's an example of a Markdown table formatting:

 id|type|value
 --|----|-----
  1| A  |A-val
  2| B  |B-val
  3| C  |C-val

In HTML it results in a table:

id	type	value
1	A	A-val
2	B	B-val
3	C	C-val

(2) By Richard Hipp (drh) on 2020-05-28 19:07:05 in reply to 1 [link] [source]

Markdown also allows you to insert raw HTML. So you could set the SQLite shell into ".mode html" and then copy/paste the raw table code into your Markdown document. Like this:

id	type	value
1	A	A-val
2	B	B-val
3	C	C-val

Click the "source" button on the header of this post to see the raw text, which is somewhat unreadable. But it does render nicely. And you have the option to add things like "border='1'" in the <table> to make it look a little nicer:

id	type	value
1	A	A-val
2	B	B-val
3	C	C-val

Steps to reproduce the above:

   CREATE TEMP TABLE t1(id,type,value);
   INSERT INTO t1 VALUES(1,'A','A-val'),(2,'B','B-val'),(3,'C','C-val');
   .mode html
   SELECT * FROM t1;

(3) By anonymous on 2020-05-28 19:27:30 in reply to 2 [link] [source]

Markdown also allows you to insert raw HTML

Well, then it's HTML, not Markdown... Markdown files with HTML tables would mandate rendering into HTML. It's preferable for Markdown files to remain readable in raw form too.

There's definitely difference in reading a raw Markdown text with table in the form:

 id|type|value
 --|----|-----
  1| A  |A-val
  2| B  |B-val
  3| C  |C-val

... compared to the text output from the .mode html:

<table>
<TR><TH>id</TH>
<TH>type</TH>
<TH>value</TH>
</TR>
<TR><TD>1</TD>
<TD>A</TD>
<TD>A-val</TD>
</TR>
<TR><TD>2</TD>
<TD>B</TD>
<TD>B-val</TD>
</TR>
<TR><TD>3</TD>
<TD>C</TD>
<TD>C-val</TD>
</TR>
</table>

(4) By Warren Young (wyoung) on 2020-05-28 21:39:09 in reply to 1 [link] [source]

This conversion is readily scripted. Here's one. Just pipe the SQLite output to that script.

There's another one also based on Node, but it rolls its own CSV parser with regexes, which makes me nervous.

Don't like Node? It should take you less than an hour to write something suitable in your scripting language of choice.

(5) By Richard Hipp (drh) on 2020-05-29 00:45:25 in reply to 1 [link] [source]

There is a prototype for three new output modes for the SQLite CLI on trunk. New modes are:

JSON
markdown
table (looks like MySQL output)

The JSON output seems to work well since JSON is a well-defined format. The other two modes are in limbo, though. Issues include:

How to display BLOB values
How to distinguish between NULL and the four-character string 'NULL'
How to display strings with special characters like \n
For markdown mode, should it escape characters that have special meaning to markdown, like |, [, ', and *

Even thought the table and markdown modes are currently on trunk, they might not survive there until the next release.

(6) By Warren Young (wyoung) on 2020-05-29 00:52:47 in reply to 5 [source]

How to distinguish between NULL and the four-character string 'NULL'

Why not just use the .nullvalue setting?

(7) By Larry Brasfield (LarryBrasfield) on 2020-05-29 00:57:44 in reply to 5 [link] [source]

At www.json.org, at Introducing JSON, the railroad diagram for 'value' (about 1/4 down the page) shows the special name 'null', literally without the quotes, to be the representation for null. IMO, this is more likely to be adopted as a convention than anything hokey such as special string values.

(8) By Larry Brasfield (LarryBrasfield) on 2020-05-29 01:14:51 in reply to 5 [link] [source]

A possible alternative for table output: The .once and .output commands get a -h option, similar in operation to the -x or -e options, except output is to an HTML file which, after the output completes, is opened by whatever application is designated to open .html files.

I did this with very few changes to the shell, other than to add stuff before and after the present html mode output that makes a pretty table display. (A 1 pixel border really improves it, to my taste.)

The motivation was to see if, for modern systems, there might be a simple alternative to fixed-width output that worked better than fiddling with .width. To my pleasant surprise, the two browsers I used, given no guidance on how wide the columns need to be, display quite nicely, with multi-line, wider columns where there is an absurd amount of text and narrower columns where the values fit. In fact, it looked better than dumping TSV into a spreadsheet.

(9) By anonymous on 2020-05-29 01:44:04 in reply to 5 [link] [source]

For markdown mode, should it escape characters that have special meaning to markdown, like |, [, ', and *

In list-mode the .separator ocurrence is not being escaped, when it's part of a value string. For example, string value that includes a comma, pipe-char, single-quote, double-quote.

In quote-mode (or SQL-mode), by default all strings are quoted between single-quotes, any in-string single-quote is escaped (doubled).

In csv-mode, an in-string column .separator causes the string-value to be double-quoted. An in-string double-quote is escaped (doubled).

I didn't check html-mode, as it's must be following the HTML escaping rules.

So, if Markdown is to be a separate mode, then for practical purposes the special in-string chars need to be escaped. The question is what is the full set of special chars to escape?

From the extended syntax link mentioned earlier:

Formatting Text in Tables

You can format the text within tables. For example, you can add links, code (words or phrases in backticks (`) only, not code blocks), and emphasis.

You can't add headings, blockquotes, lists, horizontal rules, images, or HTML tags.

Escaping Pipe Characters in Tables

You can display a pipe (|) character in a table by using its HTML character code (|).

Not sure why the second paragraph in the quote above got emphasized...

sqlite3
create table x(id, val);
insert into x(id, val) values(1,'text'),(2,',comma'),(3,'|pipe'), (4,'''s-quote'),(5,'"d-quote');

.mode list 
select * from x;

.separator ,
select * from x;
...

(10) By anonymous on 2020-05-29 21:13:06 in reply to 5 [link] [source]

Testing 5865d2f2d... WOW!

sqlite> .mode markdown
sqlite> select * from event limit 2;
| type |      mtime       | objid | tagid | uid | bgcolor | euser | user | ecomment |        comment         | brief |      omtime      |
|------|------------------|-------|-------|-----|---------|-------|------|----------|------------------------|-------|------------------|
| ci   | 2454303.0902662  | 1     |       |     |         |       | drh  |          | initial empty baseline |       | 2454303.0902662  |
| ci   | 2454304.00006944 | 181   |       |     |         |       | drh  |          | Setup webpag updates.  |       | 2454304.00006944 |

It's so nice to read results in CLI, I'd make it my default .mode!

The Markdown table rendering in Fossil is a different question, but the table above appears cleanly in GitHub.

Speaking of escaping any characters in strings: on GitHub it appears that only pipe needs escaping. Stray brackets, backticks, asterisks, quotes are properly shown as characters; when balanced, they are rendered accordingly to Markdown.

WOW!

(11) By anonymous on 2020-05-30 18:35:16 in reply to 5 [link] [source]

Of course there are other formats that may be wanted, although in many cases it can be done using SQL functions provided by extensions, but in some cases (such as the cases you listed here) cannot so easily be done with extension functions, so is good to add them as built-in formats, I suppose.

One case where an extension function could do is PostScript format (so it probably should not be built-in (especially considering that most users probably will not be programming in PostScript anyways), although it is possible that the user will want to customize which extensions are available by default in the command shell before compiling it, especially to implement a custom VFS; this would probably be a good idea, I think). It could be represented as a sequence of procedure tokens (one procedure per row), for example, one row could be: {//null 32 -5.6 (Hello \(\(| \n) <B2A700010000>} A null should be //null (the two slashes indicate to retrieve the value that the name refers to and enter that, rather than entering the name itself); the next two items are an integer and a floating number, and then a string, and then a blob. A string in PostScript is in parentheses; backslashes and unbalanced parentheses inside must be escaped, and the C escapes \b, \t, \n, \f, and \r are also supported, and other bytes are using octal escapes with three digits. The next one is a hex blob; in PostScript it is actually just another syntax for a string, and is treated in the same way (and PostScript strings are limited to 65535 bytes, although I think recent versions of Ghostscript increase this limit). A PostScript program could then load a file (or use %pipe% to execute SQLite and read the rows that way) and then use the token operator to extract each row; or something like [(filename.ps) run] could be used to load all of it into a single array at once. (If I need this, then I would probably write such an extension.)

In both JSON and PostScript though, there are limits to numbers; they can't represent all 64-bit integers. PostScript supports 32-bit integers and 32-bit floating point; JSON supports only 64-bit floating point. JavaScript does have big integers now (denoted by writing n at the end of the number), but this is not possible with JSON.

That is just one example; of course, there are many other possibilities too, and which ones are useful depend on your use of them, I suppose.

I do not know the answer of your first three questions, but for the fourth question, I should expect yes (at least by default; you may have an option to disable it), since otherwise it won't work. But, not using a lot of Markdown stuff, I wouldn't really know; perhaps someone else on here knows better than I do.