SQLite User Forum

doc: date/time functions are technically non-compliant with ISO 8601
Login

doc: date/time functions are technically non-compliant with ISO 8601

(1) By anonymous on 2022-05-07 06:58:59 [link] [source]

The documentation page "Date And Time Functions" (https://www.sqlite.org/lang_datefunc.html) mentions ISO 8601. It meantions that date and time values can be stored as "text in a subset of the ISO-8601 format". However, ISO 8601 does not permit space as a separator between date and time (ISO 8601-1:2019, ยง 5.4.1). This is an RFC 3339 extension. Perhaps this part should be reworded as "text in a subset of the ISO 8601 and RFC 3339 formats"?

The documentation page "Quirks, Caveats, and Gotchas In SQLite" (https://www.sqlite.org/quirks.html) notes that dates and times can be stored "[a]s a TEXT string in the ISO-8601 format. Example: '2018-04-02 12:13:46'." The example is non-compliant with ISO 8601 (but with RFC 3339) because it uses space as a separator. This example should be changed by replacing the space with 'T' or alternatively replacing "ISO-8601" with "RFC 3339".

In all mentions, it says ISO-8601, but the standard appears to be normally written as "ISO 8601" with a space instead of a hyphen.

(2) By Holger J (holgerj) on 2022-05-07 22:35:41 in reply to 1 [link] [source]

Still, the format with a space separator instead of the 'T' remains the most common format, not only in the context of database systems.

Maybe the documentation should mention the RFC as well.

(3.1) By Keith Medcalf (kmedcalf) on 2022-05-07 23:34:39 edited from 3.0 in reply to 2 [source]

However, the datetime functions do not require either ISO 8601 nor RFC 3339. It uses ISO 8601 and RFC-3999 in respect of requiring the date/time to be stored in "big endian" notation (and the requirement for the separators used between subcomponents and the digit and padding requirements of subcomponents). That is, biggestest unit first, followed, in order from left to right, by decreasing units, followed finally by an optional localization (timezone).

The format for a date is required to be YYYY-MM-DD. The format for a time is required to be HH:MM:SS[.fffffffffffffffff] (there is no limit on the number of fractional second digits, but only the value half-even rounded to 3 places (milliseconds) will be used internally by the datetime functions), and the localization as 'Z' or '+HH:MM' or '-HH:MM'.

Each component (date, time, timezone, in that order only) may be separated from the other component by any arbitrary number of spaces. The date component and the time component must be separated by at least one space or capital T. No separation is required before the localization (timezone).

Between the date and time components a capital letter T will be treated as a space (and there may be one or more of them, intermixed with space characters, to as much as you little heart desires -- lowercase t is an error). The timezone may be separated from the time by 0 to as many spaces as your little heart desires. The entire thing may have as many trailing spaces as your little heart desires (but not leading spaces).

That means that the following are the same:

'2022-05-07 16:58:30.234 -06:00'  
'2022-05-07    TT     TTTTT  16:58:30.234-06:00                 '  

and so on and so forth.

Obviously, if your expect the text format to sort correctly then you must format the text consistently and all text representations (being compared) must have the same localization (timezone).

Properly, you should have a collation defined which sorts the text representations using full ordering (preferable) or sort only by the julianday number (representations as text that do not all have the same localization cannot be sorted properly via a string sort).