bug revealed by strftime()
(1) By anonymous on 2024-08-19 04:38:24 [source]
Consider the following SQL statements:
select 'JD ' || 1867217.5 || ': ' || strftime('%F %T', 1867217.5);
select 'JD ' || 1867217.0 || ': ' || strftime('%F %T', 1867217.0);
select 'JD ' || 1867216.5 || ': ' || strftime('%F %T', 1867216.5);
select 'JD ' || 1867216.0 || ': ' || strftime('%F %T', 1867216.0);
select 'JD ' || 1867215.5 || ': ' || strftime('%F %T', 1867215.5);
select 'JD ' || 1867215.0 || ': ' || strftime('%F %T', 1867215.0);
select 'JD ' || 1830692.5 || ': ' || strftime('%F %T', 1830692.5);
select 'JD ' || 1830692.0 || ': ' || strftime('%F %T', 1830692.0);
select 'JD ' || 1830691.5 || ': ' || strftime('%F %T', 1830691.5);
select 'JD ' || 1830691.0 || ': ' || strftime('%F %T', 1830691.0);
select 'JD ' || 1830690.5 || ': ' || strftime('%F %T', 1830690.5);
select 'JD ' || 1830690.0 || ': ' || strftime('%F %T', 1830690.0);
The first six statements show dates and times around the leap day of AD 400, and the second six statements show dates and times of a century earlier:
JD 1867217.5: 0400-03-02 00:00:00
JD 1867217.0: 0400-03-01 12:00:00
JD 1867216.5: 0400-03-01 00:00:00
JD 1867216.0: 0400-02-29 12:00:00
JD 1867215.5: 0400-02-29 00:00:00
JD 1867215.0: 0400-02-28 12:00:00
JD 1830692.5: 0300-03-02 00:00:00
JD 1830692.0: 0300-02-29 12:00:00
JD 1830691.5: 0300-02-29 00:00:00
JD 1830691.0: 0300-02-28 12:00:00
JD 1830690.5: 0300-02-28 00:00:00
JD 1830690.0: 0300-02-27 12:00:00
Since SQLite exclusively uses the proleptic Gregorian calendar, it should never return a leap day in AD 300; the ”0300-02-29” dates should be returned as “0300-03-01” instead. Perhaps this is the result of using Meeus’ algorithm in the computeJD() function in src/date.c without checking for invalid Gregorian leap days. (Similar results happen every 400 years before this, i.e. for the years -0100, -0500, etc.)
(2.1) By Richard Hipp (drh) on 2024-08-19 11:11:10 edited from 2.0 in reply to 1 [link] [source]
The problem appears to be in conversion from JD into YYYY-MM-DD, in subroutine computeYMD(). The conversion from YYYY-MM-DD into JD in subroutine computeJD() appears to be correct.
Meeus's algorithm for Gregorian leap-year correction fails for JD before 1867216 (0400-02-29) due to problems with negative numbers. In fairness to Meeus, he never claims otherwise, and in fact he only uses the Gregorian leap-year correction for dates after JD 2299160 (1582-10-14).
In the following script, you can see that JD into YYYY-MM-DD conversions go wonky for negative years:
.mode box .width -11 -9 -11 -11 WITH RECURSIVE years(y) AS (VALUES(-4700) UNION ALL SELECT y+100 FROM years WHERE y<2500), marchfirst(d) AS MATERIALIZED ( SELECT CASE WHEN y<0 THEN format('-%04d-03-01',-y) ELSE format('%04d-03-01',y) END FROM years), jdnum(jd,d) AS MATERIALIZED (SELECT julianday(d), d FROM marchfirst) SELECT d AS 'date', jd AS 'JD', date(jd-0.01) AS 'day before', date(jd+1.0) AS 'day after' FROM jdnum;
(3) By Alan L (ludlovian) on 2024-08-19 11:17:19 in reply to 1 [link] [source]
What an intersting find!
Looking at the code in date.c, I suspect the bug is in computeYMD
.
Specifically line 470 has:
A = (int)((Z - 1867216.25)/36524.25);
If my rusty C knowledge is right, casting to an int will round down if the value is positive, but round up if negative.
In Meeus algorithm (page 63 of the second addition), he calls this variable alpha and it should be rounding down.
The difference only matters for dates below '0400-02-29 18:00:00' (Julian Day 1_867_216.25).
(4) By Bo Lindbergh (_blgl_) on 2024-08-19 11:57:11 in reply to 2.1 [link] [source]
static int ifloor(
double d)
{
int i=d;
return i-(i>d);
}
(Yes, I know. Far too terse!)
(5) By Richard Hipp (drh) on 2024-08-19 12:14:41 in reply to 3 [link] [source]
Changing that line to:
A = (int)((Z + 32044.75)/36524.25) - 52;
Helps, but there are still some issues that I haven't yet figured out.
(6) By Alan L (ludlovian) on 2024-08-19 12:46:46 in reply to 3 [link] [source]
If anyone is interested in Meeus book, there's an online copy here:
https://www.agopax.it/Libri_astronomia/pdf/Astronomical%20Algorithms.pdf
(7) By Richard Hipp (drh) on 2024-08-19 12:59:49 in reply to 1 [link] [source]
Working on video-chat with Alan L, I managed to get the proposed fix shown in check-in 00cae11fffaf50e2.
Apparently when Meeus says "INT()", he really means "FLOOR()". Probably it says as much somewhere in the text, that I missed. The difference occurs when the argument is a negative number. I made adjustments so that the computation always occurs on positive numbers now, even for proleptic gregorian calendar days. The issue does not come up for Meeus, since he only computes using the gregorian calendar after 1582, and the values are all positive then.
Please try this patch and report any anomalies.
(8) By Alan L (ludlovian) on 2024-08-19 14:36:12 in reply to 5 [link] [source]
Richard, The new date code works - or at least I cannot break it.
I have tried roundtrips (string to string, and JD to JD) for every day from JD 1.
And also checked the Feb-28 to Mar-1 gap is correctly 1 or 2 days from -4700 to 2500
(And I learnt how to compile from source, so today is a good learning day)
(9) By anonymous on 2024-08-19 17:20:23 in reply to 7 [link] [source]
Richard, thanks to you and Alan L for your efforts in devising and testing a fix so promptly.
It looks as though Meeus defined INT() as “FLOOR()” on page 60, though he didn’t call it by that name. His concern appears to have been distinguishing INT() from what he’d called TRUNC(). In C, (int) -7.83
is -7
[Meeus’ TRUNC()] rather than -8
[Meeus’ INT()].
I’m not sure that I could test your fix more thoroughly than Alan L already has, but I’ll have a go at it as time allows.
Unrelated to this bug, is there any interest in having SQLite’s strftime() supplemented with additional conversion specifiers from the current definition of C strftime() in IEEE Std 1003.1-2024? (By that definition, SQLite’s %k
, %l
, and %P
conversion specifiers are extensions to C strftime() just as much as SQLite’s %f
and %J
conversion specifiers are.)
On the SQLite Date and Time Functions documentation page, it might be clearer to describe strftime()’s %e
, %k
, and %l
conversion specifiers as “with leading space for single digit” rather than “without leading zero”. Also, I presume that the %H
and %k
conversion specifiers should have an upper limit of 23 rather than 24. (The same applies for the initial comment for strftimeFunc() in src/date.c.)
(10) By anonymous on 2024-08-20 09:41:27 in reply to 9 [link] [source]
As an aside: INT()
is indeed defined explicitly as a floor operation on page 60: “we will indicate by INT(x) the greatest integer less than or equal to x”.
But this behaviour is already implied in the introduction: “The few programs we give are in standard BASIC” (p. 2). All documentation of BASIC dialects I can find online specify INT()
as a floor operation (or remain ambiguous).
(11.1) By sbooth on 2024-08-20 17:38:47 edited from 11.0 in reply to 1 [link] [source]
There are other algorithms besides Meeus' available to compute Julian Day Numbers (JDN) and Julian Dates (JD), for example Richards, E.G. 2012, "Calendars," from the Explanatory Supplement to the Astronomical Almanac, 3rd edition, S.E Urban and P.K. Seidelmann eds., (Mill Valley, CA: University Science Books), Chapter 15, pp. 585-624.
The Richards' JDN interconversion algorithms use exclusively integer math. Here is some C code I wrote as proof-of-concept for the Gregorian calendar conversion:
#include <stdio.h>
#include <math.h>
int jdn_from_ymd(int year, int month, int day, int *jdn)
{
if(jdn == NULL)
return 1;
int Y = year;
if(Y <= -4716)
return 1;
int h = month - 2;
int g = Y + 4716 - (12 - h) / 12;
int f = (h - 1 + 12) % 12;
int e = (1461 * g + 0) / 4 + day - 1 - 1401;
int J = e + (153 * f + 2) / 5;
J = J - (3 * ((g + 184) / 100)) / 4 - -38;
*jdn = J;
return 0;
}
int ymd_from_jdn(int jdn, int *year, int *month, int *day)
{
if(year == NULL || month == NULL || day == NULL)
return 1;
int J = jdn;
if(J < 0)
return 1;
int f = J + 1401;
f = f + (((4 * J + 274277) / 146097) * 3) / 4 + -38;
int e = 4 * f + 3;
int g = (e % 1461) / 4;
int h = 5 * g + 2;
int D = (h % 153) / 5 + 1;
int M = ((h / 153 + 2) % 12) + 1;
int Y = e / 1461 - 4716 + (12 + 2 - M) / 12;
*year = Y;
*month = M;
*day = D;
return 0;
}
int jd_from_ymd(int year, int month, double day, double *jd)
{
if(jd == NULL)
return 1;
double day_integral;
double day_fraction = modf(day, &day_integral);
int jdn;
if(jdn_from_ymd(year, month, day_integral, &jdn))
return 1;
*jd = jdn - 0.5 + day_fraction;
return 0;
}
// The following two functions assume a day is exactly 86400 seconds
static double fractional_day_from_hms(int hour, int minute, int second)
{
return (hour / 24.0) + (minute / 1440.0) + (second / 86400.0);
}
static void hms_from_fractional_day(double fractional_day, int *hour, int *minute, int *second)
{
double hour_integral;
double hour_fraction = modf(fractional_day * 24.0, &hour_integral);
double minute_integral;
double minute_fraction = modf(hour_fraction * 60.0, &minute_integral);
double second_integral;
/*double second_fraction =*/ modf(minute_fraction * 60.0, &second_integral);
*hour = hour_integral;
*minute = minute_integral;
*second = second_integral;
}
int jd_from_ymdhms(int year, int month, int day, int hour, int minute, int second, double *jd)
{
return jd_from_ymd(year, month, day + fractional_day_from_hms(hour, minute, second), jd);
}
int ymdhms_from_jd(double jd, int *year, int *month, int *day, int *hour, int *minute, int *second)
{
if(year == NULL || month == NULL || day == NULL || hour == NULL || minute == NULL || second == NULL)
return 1;
double jdPlus12Hours = jd + 0.5;
int J = floor(jdPlus12Hours);
if(ymd_from_jdn(J, year, month, day))
return 1;
double day_integral;
double day_fraction = modf(jdPlus12Hours, &day_integral);
if(day_fraction < 0.0)
day_fraction += 1.0;
hms_from_fractional_day(day_fraction, hour, minute, second);
return 0;
}
void print_ymdhms_from_jd(double jd)
{
int year, month, day, hour, minute, second;
if(ymdhms_from_jd(jd, &year, &month, &day, &hour, &minute, &second))
return;
printf("%.1f -> %.4d-%.2d-%.2d %.2d:%.2d:%.2d\n", jd, year, month, day, hour, minute, second);
}
void print_jd_from_ymdhms(int year, int month, int day, int hour, int minute, int second)
{
double jd;
if(jd_from_ymdhms(year, month, day, hour, minute, second, &jd))
return;
printf("%.4d-%.2d-%.2d %.2d:%.2d:%.2d -> %.1f\n", year, month, day, hour, minute, second, jd);
}
int main(int argc, const char * argv[])
{
print_ymdhms_from_jd(1867217.5);
print_ymdhms_from_jd(1867217.0);
print_ymdhms_from_jd(1867216.5);
print_ymdhms_from_jd(1867216.0);
print_ymdhms_from_jd(1867215.5);
print_ymdhms_from_jd(1867215.0);
print_ymdhms_from_jd(1830692.5);
print_ymdhms_from_jd(1830692.0);
print_ymdhms_from_jd(1830691.5);
print_ymdhms_from_jd(1830691.0);
print_ymdhms_from_jd(1830690.5);
print_ymdhms_from_jd(1830690.0);
return 0;
}
The output is
1867217.5 -> 0400-03-02 00:00:00
1867217.0 -> 0400-03-01 12:00:00
1867216.5 -> 0400-03-01 00:00:00
1867216.0 -> 0400-02-29 12:00:00
1867215.5 -> 0400-02-29 00:00:00
1867215.0 -> 0400-02-28 12:00:00
1830692.5 -> 0300-03-02 00:00:00
1830692.0 -> 0300-03-01 12:00:00
1830691.5 -> 0300-03-01 00:00:00
1830691.0 -> 0300-02-28 12:00:00
1830690.5 -> 0300-02-28 00:00:00
1830690.0 -> 0300-02-27 12:00:00
There are also modifications to allow negative JDs.
If there is any interest in using Richards' instead of Meeus' method I'm happy to help out if desired.
(12) By anonymous on 2024-08-23 00:12:44 in reply to 7 [link] [source]
Richard, I’ve found no anomalies with your checked-in patch; as far as I can tell, it’s a complete fix for the bug.
Since the computeJD() and computeYMD() functions in src/date.c are based on Meeus’ algorithms, it seems as though adding support for the Julian calendar would be relatively straightforward. I might experiment with that, by adding support for two new modifiers for the date/time functions, “gregorian” (a no-op for dates/times in the Gregorian calendar) and “julian” (for dates/times in the Julian calendar).
If it does turn out to be a straightforward enhancement, then adding further modifiers for “style” support (i.e. for years that begin on a day other than January 1) would be the next step. “Style” support would also require additional conversion specifiers in strftime(), so that a date like George Washington’s birthdate could be formatted as “1731/2-02-11” (i.e. using dual dating with the “Annunciation” style in the Julian calendar; the Annunciation style began its years on March 25).
(13) By anonymous on 2024-08-23 00:32:16 in reply to 11.1 [link] [source]
As far as I can tell, Richard Hipp’s patch in comment 7 above has completely fixed the bug.