Docs: julianday() returns a real number, not a string
(1.2) By Harald Hanche-Olsen (hanche) on 2022-01-26 10:34:45 edited from 1.1 [link] [source]
The docs for Date And Time Functions leaves the reader with the impression that
julianday(…) returns a string, same as
strftime('%J', ...). But a bit of experimentation reveals that it returns a real number instead.
That little bit of information may be worth documenting. It could be done by adding a few words (shown here in boldface):
The julianday() function returns the Julian day - the number of days since noon in Greenwich on November 24, 4714 B.C. (Proleptic Gregorian calendar) as a real number.
I have little doubt, though I haven't tested it, that
CAST(strftime('%J',…) AS REAL) would produce the exact same number, down to the least significant bit. But that should be unnecessary. Possibly, the example showing
strftime('%J',…) as being equivalent to
julianday(…) should also be amended to use this CAST expression.
(2) By Larry Brasfield (larrybr) on 2022-01-26 15:51:41 in reply to 1.2 [link] [source]
Having gone over that doc earlier, and seen it as a model of clarity, I was tempted to disagree with you that more clarity was needed. However, having read it again with the idea that it might mislead, or fail to overcome a common (but not required) interpretation of certain words, I have come around to your view. The phrase "day number" is too easily assumed to be a "whole day number" (or "an integer" in computer/mathematician-speak.) So, to ward off that not quite required, but natural interpretation (for some fraction of readers), I have made this change. I think that suffices, but will carefully consider opinions, and especially arguments, to the contrary.
Thanks for the tip.
(3) By Harald Hanche-Olsen (hanche) on 2022-01-26 16:33:10 in reply to 2 [link] [source]
That is a good change, but it is somewhat orthogonal to my concern, which is that
strftime('%J',…) are not exactly equivalent, as they return different types:
▶▶▶ create table tbl(date, type generated as (typeof(date))); ▶▶▶ insert into tbl(date) values (julianday('now')), (strftime('%J','now')); ▶▶▶ select * from tbl; date type ----------------- ---- 2459606.18498622 real 2459606.184986215 text
The difference probably doesn't matter in the vast majority of uses, but it is a difference.
But I agree that the doc is indeed very clear, with this itty-bitty reservation.
(4) By Larry Brasfield (larrybr) on 2022-01-26 21:14:40 in reply to 3 [link] [source]
Ok, I see your point now, although I could not easily grasp that you might expect anything but text from something named "str*". (I know you said it, but I read too fast when busy sometimes.)
I don't agree that the type of the return breaks the claimed equivalency. And I am not convinced that such difference as there is in return type is worth the doc dilution effect of explaining it, in part because words like 'number' and names like 'str%' nearly shout that they are not precisely equivalent.
If the results were in fact equivalent when converted to REAL or TEXT (one way or the other), I would agree that the difference probably does not matter.
However, I ran a little experiment, running the following repeatedly:
select x, y, x-y from (select julianday('now') as x, strftime('%J', 'now') as y); select x, y, x-y from (select strftime('%J', 'now') as x, julianday('now') as y);
This yields a result that makes me think some words on the difference could matter, to those naive enough to be comparing floating point numbers. The difference term always has one of 3 tiny values, set of which is centered on the origin. To get it to be 0.0 consistently, I have to say:
select x, y, x-y from (select strftime('%J', 'now') as x, printf('%17.9f',julianday('now')) as y);
I'm still mulling over whether mentioning these minutia is a net benefit. While doing so might be more technically correct, they will likely be a distraction for the vast majority of users who either just want to know what to use to get what they want and care little about numerical accuracy limitations of binary floating point values rendered into decimal form.
(5) By Harald Hanche-Olsen (hanche) on 2022-01-26 21:41:07 in reply to 4 [link] [source]
but I read too fast when busy sometimes
Me too, all the time.
I don't think the tiny difference in values is important enough to mention. But a user with an application processing loads of daytime values might want to avoid the conversion from real to string and back for performance reasons.
And here is one example where the differnt results could be important, if the result of the query is exported:
▶▶▶ select json_array(julianday('now'),strftime('%J','now')) daytimes; daytimes ------------------------------------- [2459606.4024086,"2459606.402408599"]
(Note the quotes around the second array element.)
(6.3) By Keith Medcalf (kmedcalf) on 2022-01-26 21:52:07 edited from 6.2 in reply to 4 [link] [source]
Those three results represent -1, 0 and +1 ULP of precision and represents the conversion discrepancy between the actual value (as returned by the julianday function as a double precision floating point) and the ASCII representation of the same value after conversion to and fro double precision floating point.
The current epsilon of juliandate('now') expressed as a double precision floating point number is 4.65661287307739e-10 days (0.0402331352233887 milliseconds).
The maximum difference is therefore about 40 microseconds and is really due to the conversion to and from ASCII TEXT base 10 representation.
It should be pointed out that iJD (the internal timestamp used by SQLite3) is only precise to the millisecond, the difference in value is entirely an aftifact of conversion and does not represent any actual difference in the source value.
(7) By Harald Hanche-Olsen (hanche) on 2022-01-26 22:36:10 in reply to 6.3 [link] [source]
I apologize for the final paragraph of my initial post. It has led the discussion astray, or at least not where I wanted it to go. To me, the important point is the type of the result, not whether or not it is some microseconds off.
(8.1) By Keith Medcalf (kmedcalf) on 2022-01-26 23:19:04 edited from 8.0 in reply to 7 [link] [source]
I can see your point. While it is indeed true that CURRENT_TIME, CURRENT_DATE, CURRENT_TIMESTAMP, date(...), time(...) and datetime(...) are merely aliases for the strftime function specifying a pre-bound "format string" and returning text, the julianday(...) function is not an overload of strftime(...) -- although there is a format code that can be used in a strftime format string to return a textual representation of the julian epoch day offset.
The julianday(...) function takes the same parameters (without a format string) as the strftime function -- however it returns a dirrectly calculated floating point value.
Conceptually, the '%J' format string to strftime returns the same number, just in ASCII TEXT (since that is what strftime does). The difference is, of course, that conversion between base-10 ASCII representation and base-2 binary floating point representation introduces "conversion error" and should be expected.
In other words, the Julian Epoch Offset is retrieved by the julianday(...) function. This same value can be "converted to text" by formatting the floating point result using whatever method the heart desires, including providing the appropriate formating codes to the strftime function, however, as everyone should be aware, conversion to and from ASCII base-10 print format and double precision base-2 floating point are not exact.
It would be a rational expectation that each conversion would introduce additional non-stochastic error.
It other words, I agree that the first paragraph in the documentation is misstated.
julianday(...) does not return the same value as strftime('%J', ...); nor is juliandate an overload of strftime (as are the other functions). It instead returns the accurately computed (within the bounds of hardware capability) Julian Epoch offset.
The same caveats apply to the unixepoch(...) function, which also returns a directly computed integer value and is not an overload of strftime -- although there is a format code for strftime that will produce similar text output.
(11.1) By Stephan (stephancb) on 2022-01-27 09:35:13 edited from 11.0 in reply to 8.1 [link] [source]
Investigating the "conversion error" reveals:
The precision of IEEE745 double precision (used for SQLite REALs in most/all? implementations) when representing a fractional Julian day corresponds to a bit less than 0.1 milliseconds for contemporary times (it is higher for times way back in the past and slowly detoriating towards the future). The julianday(...) functions accepts date-time strings having submillisecond precision, but seems to ignore the submillisecond part:
sqlite> SELECT julianday('2022-01-27 07:34:42.1239')-julianday('2022-01-27 07:34:42.123'); 0.0 sqlite> SELECT julianday('2022-01-27 07:34:42.124')-julianday('2022-01-27 07:34:42.123'); 1.16415321826935e-08
even though the double precision would be (just) enough for also taking into account the digit representing hundred microseconds. One could argue that the documention cites
%f fractional seconds: SS.SSS, i.e. no precision below millisecond.
The strftime(...) also accepts date-time strings with submilliseconds and returns them faithfully (it just needs to repeat the input):
sqlite> SELECT strftime('2022-01-27 07:34:42.1239'); 2022-01-27 07:34:42.1239
Just an observation, I'm not suggesting to change anything (perfect is the enemy of the good, Voltaire)
(14) By Harald Hanche-Olsen (hanche) on 2022-01-27 13:20:09 in reply to 11.1 [link] [source]
perfect is the enemy of the good, Voltaire
… who was quoting an old Italian proverb, Le meglio è l'inimico del bene, in which it is the better that is the enemy of the good.
I like the Shakespeare version, though:
Were it not sinful then, striving to mend,
To mar the subject that before was well?
Sorry for the off-off-topic tangent. To get back to the merely off-topic tangent, every floating point number, being a dyadic rational, has a finite decimal expansion. I would not dream of suggesting one should print that, but in a “better” world, one could print enough digits so that reading it back would yield the exact same number. I believe Common Lisp implementations strive to achieve this goal, possible at a loss of performance. There is a large literature on the efficient and exact printing of floating point numbers, apparently a tricky subject. Conversely, can one assume that reading the decimal representation of a number would result in the exact float that is closest to the given number, subject to appropriate rounding behaviour? I suspect not.
Everything in the previous paragraph is indeed the enemy of the current perfectly(!) good state of affairs, and striving to mend it seems sinful indeed.
(15) By Keith Medcalf (kmedcalf) on 2022-01-27 20:47:43 in reply to 11.1 [source]
The strftime(...) also accepts date-time strings with submilliseconds and returns them faithfully (it just needs to repeat the input):
sqlite> SELECT strftime('2022-01-27 07:34:42.1239');
In the same vein, strftime accepts are returns as a result what was given as the format string, if it contains no valid "formatting specifiers" -- therefore it would appear that it also accepts dates in "descriptive format" and returns them unmolested ...
select strftime('yapapalooza sex party');
All builtin date/time handling in SQLite3 is limited to millisecond precision (though the accuracy may be considerably more or considerably less -- that is an Operating System restriction and not due to SQLite3) because datetime values are stored and computed internally using an internally maintained "Offset from the Epoch" which is stored as an integer number of milliseconds since the Julian Epoch. It can be manipulated and presented in a multiplicity of formats however it can never express more accuracy than one millisecond in whatever format it is presented.
(9) By Larry Brasfield (larrybr) on 2022-01-27 00:35:13 in reply to 7 [link] [source]
I was mindful of a certain Aesop's fable ("The Man, The Boy, and The Donkey") while making this last1 change. However, in light of your comments, Keith's comments, and what I understand to be an accurate disclosure policy for API behavior, I decided that the additional changes about return type and its effects is worth the extra verbiage2. Please let me know if this falls short of your expectations for what the docs should accomplish. (This invitation goes to Keith also.) Please note that when converted to HTML, this renders prettily, so critiques should be about content and clarity rather than aesthetics.
At least, I hope it's the last for the present issue.
It can be a narrow path between glossing over occasionally relevant detail and verbosity that hinders comprehension of more commonly important points. The fact that the added verbiage was already in a footnote-like paragraph made this change easier to justify.
(10.1) By Harald Hanche-Olsen (hanche) on 2022-01-27 08:33:28 edited from 10.0 in reply to 9 [link] [source]
That’s good. And thanks for the bonus link! It’s been too long since I read any of Aesop’s fables.
PS. Glad to see the
unixepoch function too. I don’t see it in the current online docs, so it (or its mention in the documentation) must be a recent addition.
(12.1) By Warren Young (wyoung) on 2022-01-27 08:51:11 edited from 12.0 in reply to 10.1 [link] [source]
(13.1) By Harald Hanche-Olsen (hanche) on 2022-01-27 09:23:55 edited from 13.0 in reply to 12.1 [link] [source]
I don’t see it in the current online docs
It’s right here.
I see only the modifier
’unixepoch’ there, not the function.
I guess the changes to the source have not made it to the online docs yet, or did my browser cache an ancient version?
(16) By bucweat on 2022-01-28 04:18:22 in reply to 13.1 [link] [source]
This forum is great place to pick up little tidbits of info about SQLite...thanks to the patient folks who contribute here :-)
This gave me an excuse to poke at source tree in fossil for the first time, and if I'm interpreting correctly, the unixepoch() function was added 29 Nov 2021 two days after the creation of the 3.37 branch on 27 Nov 2021. So looks like unixepoch() function is a feature destined for upcoming release.
(17) By Keith Medcalf (kmedcalf) on 2022-01-28 05:52:56 in reply to 13.1 [link] [source]
There is a
draft version of the website that contains all the proposed updates in progress (I do not know what the regular update schedule is, but it is re-built from source periodically).
(18) By Vadim Goncharov (nuclight) on 2022-01-28 10:52:20 in reply to 6.3 [link] [source]
It should be pointed out that iJD (the internal timestamp used by SQLite3) is only precise to the millisecond
Could you please point to source where it is, as quick search on site show nothing about iJD? Do I understand correctly that SQLite uses Julian Day instead of Unix Time internally?
(19) By Richard Hipp (drh) on 2022-01-28 11:02:50 in reply to 18 [link] [source]
All of the SQLite date/time functions are implemented by a single source-code file src/date.c. There are comments in the code explaining how everything works.