SQLite Forum

Bug converting date to ISO date
Login

Bug converting date to ISO date

(1) By anonymous on 2024-01-18 09:38:33 [link] [source]

Today I stumbled on a strange bug when converting a date to the ISO week number.

Here is the SQL-statement to reproduce the bug:

select
	(strftime('%W', substr(20210101, 1, 4) || '-' ||	substr(20210101, 5, 2) || '-' || substr(20210101, 7, 2) ) ) as week_53_2020,
	(strftime('%W', substr(20210102, 1, 4) || '-' ||	substr(20210102, 5, 2) || '-' || substr(20210102, 7, 2) ) ) as week_53_2020,
	(strftime('%W', substr(20210103, 1, 4) || '-' ||	substr(20210103, 5, 2) || '-' || substr(20210103, 7, 2) ) ) as week_53_2020,
	(strftime('%W', substr(20210104, 1, 4) || '-' ||	substr(20210104, 5, 2) || '-' || substr(20210104, 7, 2) ) ) as week_1_2021,
	(strftime('%W', substr(20210113, 1, 4) || '-' ||	substr(20210113, 5, 2) || '-' || substr(20210113, 7, 2) ) ) as week_2_2021,
	(strftime('%W', substr(20220101, 1, 4) || '-' ||	substr(20220101, 5, 2) || '-' || substr(20220101, 7, 2) ) ) as week_1_2022,
	(strftime('%W', substr(20220102, 1, 4) || '-' ||	substr(20220102, 5, 2) || '-' || substr(20210102, 7, 2) ) ) as week_1_2022,
	(strftime('%W', substr(20220103, 1, 4) || '-' ||	substr(20220103, 5, 2) || '-' || substr(20210103, 7, 2) ) ) as week_1_2022,
	(strftime('%W', substr(20220113, 1, 4) || '-' ||	substr(20220113, 5, 2) || '-' || substr(20210113, 7, 2) ) ) as week_2_2022,
	(strftime('%W', substr(20230101, 1, 4) || '-' ||	substr(20230101, 5, 2) || '-' || substr(20230101, 7, 2) ) ) as week_1_2023,
	(strftime('%W', substr(20230102, 1, 4) || '-' ||	substr(20230102, 5, 2) || '-' || substr(20230102, 7, 2) ) ) as week_1_2023,
	(strftime('%W', substr(20230103, 1, 4) || '-' ||	substr(20230103, 5, 2) || '-' || substr(20230103, 7, 2) ) ) as week_1_2023
;

produces:
week_53 week_53 week_53 week_1  week_1  week_1  week_1  week1   week_2  week_1 week_1  week_1
00	00	00	01	02	00	00	01	02	00	01	01

(2.3) By punkish on 2024-01-18 11:21:43 edited from 2.2 in reply to 1 [link] [source]

Deleted

(3.1) By Dan Kennedy (dan) on 2024-01-18 11:21:36 edited from 3.0 in reply to 1 [link] [source]

Running this SQL:

    WITH dates(d) AS (
      VALUES ('2121-01-01'), ('2121-01-02'), ('2121-01-03'), ('2121-01-04'), ('2121-01-13'),
             ('2122-01-01'), ('2121-02-02'), ('2122-01-03'), ('2122-01-13'),
             ('2123-01-01'), ('2123-01-02'), ('2123-01-03')
    )
    SELECT d, strftime("week=%W day=%u", d) FROM dates;

Produces:

    ┌────────────┬───────────────────────────────┐
    │     d      │ strftime("week=%W day=%u", d) │
    ├────────────┼───────────────────────────────┤
    │ 2121-01-01 │ week=00 day=3                 │
    │ 2121-01-02 │ week=00 day=4                 │
    │ 2121-01-03 │ week=00 day=5                 │
    │ 2121-01-04 │ week=00 day=6                 │
    │ 2121-01-13 │ week=02 day=1                 │
    │ 2122-01-01 │ week=00 day=4                 │
    │ 2121-02-02 │ week=04 day=7                 │
    │ 2122-01-03 │ week=00 day=6                 │
    │ 2122-01-13 │ week=02 day=2                 │
    │ 2123-01-01 │ week=00 day=5                 │
    │ 2123-01-02 │ week=00 day=6                 │
    │ 2123-01-03 │ week=00 day=7                 │
    └────────────┴───────────────────────────────┘

Which looks correct to me. Am I missing something?

Dan.

(4) By punkish on 2024-01-18 11:22:00 in reply to 2.3 [link] [source]

deleted because I misread the question :(

(5) By anonymous on 2024-01-18 12:18:16 in reply to 3.1 [link] [source]

Which is incorrect. The ISO calendar for 2020 has 53 weeks. Its a leap year.
So the first three days of January in the Gregorian calendar are still in week 53 in 2020.

It is a bit confusing. That's why the test declares it as week 53:

select
	(strftime('%W', substr(20210101, 1, 4) || '-' ||	substr(20210101, 5, 2) || '-' || substr(20210101, 7, 2) ) ) as week_53_2020,
	(strftime('%W', substr(20210102, 1, 4) || '-' ||	substr(20210102, 5, 2) || '-' || substr(20210102, 7, 2) ) ) as week_53_2020,
	(strftime('%W', substr(20210103, 1, 4) || '-' ||	substr(20210103, 5, 2) || '-' || substr(20210103, 7, 2) ) ) as week_53_2020,
	(strftime('%W', substr(20210104, 1, 4) || '-' ||	substr(20210104, 5, 2) || '-' || substr(20210104, 7, 2) ) ) as week_1_2021,
	(strftime('%W', substr(20210113, 1, 4) || '-' ||	substr(20210113, 5, 2) || '-' || substr(20210113, 7, 2) ) ) as week_2_2021,
	(strftime('%W', substr(20220101, 1, 4) || '-' ||	substr(20220101, 5, 2) || '-' || substr(20220101, 7, 2) ) ) as week_1_2022,
	(strftime('%W', substr(20220102, 1, 4) || '-' ||	substr(20220102, 5, 2) || '-' || substr(20210102, 7, 2) ) ) as week_1_2022,
	(strftime('%W', substr(20220103, 1, 4) || '-' ||	substr(20220103, 5, 2) || '-' || substr(20210103, 7, 2) ) ) as week_1_2022,
	(strftime('%W', substr(20220113, 1, 4) || '-' ||	substr(20220113, 5, 2) || '-' || substr(20210113, 7, 2) ) ) as week_2_2022,
	(strftime('%W', substr(20230101, 1, 4) || '-' ||	substr(20230101, 5, 2) || '-' || substr(20230101, 7, 2) ) ) as week_1_2023,
	(strftime('%W', substr(20230102, 1, 4) || '-' ||	substr(20230102, 5, 2) || '-' || substr(20230102, 7, 2) ) ) as week_1_2023,
	(strftime('%W', substr(20230103, 1, 4) || '-' ||	substr(20230103, 5, 2) || '-' || substr(20230103, 7, 2) ) ) as week_1_2023

(6) By Gunter Hick (gunter_hick) on 2024-01-18 13:07:57 in reply to 1 [link] [source]

You cannot compute ISO week numbers with the %W qualifier.

STRFTIME(3) states:

%u day of week, 1=Monday
%v day of Week, 0=Sunday
%U week number, first Sunday defines week 1
%W week number, first Monday defines week 1
%V ISO week number, week starts on Sunday, days vote on which year they want to belong to, so Wednesday casts the deciding vote.

(7) By anonymous on 2024-01-18 14:17:13 in reply to 6 [link] [source]

%V ISO weeknumber → ?

Firstly, the ISO week number always starts on Monday; therefore, Monday is day 1 and Sunday is day 7.

Secondly, the second statement(week=%V)will fail

  VALUES ('2021-01-01'), ('2021-01-02'), ('2021-01-03'), ('2021-01-04'), ('2021-01-13'),
		 ('2022-01-01'), ('2021-02-02'), ('2022-01-03'), ('2022-01-13'),
		 ('2023-01-01'), ('2023-01-02'), ('2023-01-03')
)
SELECT d, strftime("week=%V day=%u", d) FROM dates;

Thirdly, Monday defines the first day of the new year and the first week of the new ISO year.

Fourthly, this will produce the ISO day in the week.

  VALUES ('2021-01-01'), ('2021-01-02'), ('2021-01-03'), ('2021-01-04'), ('2021-01-13'),
		 ('2022-01-01'), ('2021-02-02'), ('2022-01-03'), ('2022-01-13'),
		 ('2023-01-01'), ('2023-01-02'), ('2023-01-03')
)
SELECT d, strftime("week=%W day=%u", d) FROM dates;

2021-01-01	week=00 day=5
2021-01-02	week=00 day=6
2021-01-03	week=00 day=7
2021-01-04	week=01 day=1
2021-01-13	week=02 day=3
2022-01-01	week=00 day=6
2021-02-02	week=05 day=2
2022-01-03	week=01 day=1
2022-01-13	week=02 day=4
2023-01-01	week=00 day=7
2023-01-02	week=01 day=1
2023-01-03	week=01 day=2

(8) By Gunter Hick (gunter_hick) on 2024-01-18 14:42:07 in reply to 7 [link] [source]

OK Thursday instead of Wednesday. Still does not fix that %W does not supply the ISO week number and that %V is not implemented in SQLite.

(9) By anonymous on 2024-01-18 15:06:55 in reply to 8 [link] [source]

So I need to write some code to achieve this.

Do you have any ideas on how to incorporate this in SQLite? I don't mean how to code an extension (I have written quite a few of them). For example, I could write a function like iso_date or datatime(@iso'now').

The last one is a nice challenge and only possible if I am able to overload the datetime function of SQLite. I know C doesn't support that.

(10) By Stephan Beal (stephan) on 2024-01-18 15:12:00 in reply to 9 [link] [source]

Do you have any ideas on how to incorporate this in SQLite?

You may not need to. i have it on good authority that %U and %V are currently being strongly considered for inclusion.

The last one is a nice challenge and only possible if I am able to overload the datetime function of SQLite. I know C doesn't support that.

C doesn't but sqlite permits overloading of functions. You would not be able to support the @iso prefix you demonstrate because that level would be handled by sqlite's tokenizer.

(11) By anonymous on 2024-01-18 15:44:03 in reply to 10 [link] [source]

I can wait a version or two. I will wait patiently for the version which will add full support for the ISO calendar.

(12) By Stephan Beal (stephan) on 2024-01-18 15:58:44 in reply to 11 [link] [source]

I can wait a version or two.

Just note that "being strongly considered" does not mean "will definitely be added"! It is under investigation, in any case.

This part is just speculation, but there are presumably reasons (now lost to history) why %U and %V were not added along with the rest of the formats years ago.

(13.1) By Harald Hanche-Olsen (hanche) on 2024-01-21 13:55:54 edited from 13.0 in reply to 9 [link] [source]

So I need to write some code to achieve this.

No need; it has already been done.

You might look at this thread (now obsoleted by the inclusion of these features in the source, see elsewhere in this thread) for an extensive discussion of the issue, with patches to add the desired functionality. Note that the source code has changed a bit over time, so you need to locate the appropriate patch in the trhead. Most likely the latest one, if you are compiling from a recent source.

Of course, official support for the %U, $v, and %G substitutions would be most welcom.

Edited to note the happy update of the official source

(14) By Holger J (holgerj) on 2024-01-18 16:47:05 in reply to 13.0 [link] [source]

I regard it as very important to note that besides %V also %G is crucial, without it %V is useless.

From the documentation of GNU C's strftime():

 %G  The  ISO 8601  week-based year (see NOTES) with century as 
     a decimal number.  The 4-digit year corresponding to the ISO
     week number (see %V).  This has the same format and value as %Y, 
     except that if the ISO week number belongs to the previous or next year, 
     that year is used instead.

(15.1) By Stephan Beal (stephan) on 2024-01-18 17:38:46 edited from 15.0 in reply to 14 [link] [source]

I regard it as very important to note that besides %V also %G is crucial, without it %V is useless.

See src:e1155d6a. The checkin comment mentions only %g, but the code also handles %G. (Edit: not true - my eyes just don't work right.)

(16.2) By Nuno Cruces (ncruces) on 2024-01-19 23:52:53 edited from 16.1 in reply to 1 [source]

If patching SQLite is acceptable, this patch should apply cleanly to the amalgamation, and/or date.c.

It's tested relatively extensively.

(17) By anonymous on 2024-01-20 06:30:14 in reply to 13.0 [link] [source]

To verify the SQLite computes the Julian date correctly, I visited the NASA website https://ssd.jpl.nasa.gov/tools/jdc/#/cd

SQLite computes the Julian date (← please note that it is a date!) correctly, with the roll-over to the next Julian date at noon of a Gregorian day (that is the calendar we use daily).

Keith Medcalf ((12) By Keith Medcalf (kmedcalf) on 2020-05-04 17:09:32 in reply to 11 [link] [source]) said: However, the concept is the same. "Julianday" has its 0 defined as Noon at the prime meridian on November 24, 4714 BC. The internal datetime structure stores the datetime as the offset (in ms) from Noon on November 24, 4714 BC at "some (local) meridian" which is neither specified nor stored.

The date November 24, 4714 BC will ring the creationists bell; it was the day the world was created. The reason for using this date way back has to do with the invention of the negative numbers. The reasoning for using this date (for which we have to thank Julius Scaliger) was that there is nothing before this date, and this avoided having to deal with negative numbers.

The SQLite version I am using is SQLite version 3.44.2 (compiled on 2023-11-24 11:41:44) and that is a pretty recent version that has no support for the ISO date. To prove my statement: select strftime('week=%W day=%u', '2021-01-01'); Gives as result week=00 day=5 which is incorrect if it is an ISO date; however the weekday is the correct ISO weekday. The correct answer is: The year is still 2020, week 53, day 5.

If you type: select strftime('week=%w day=%u', '2021-01-01');

The result is week=5 day=5 and this makes no sense. It was luuk (By luuk on 2020-05-03 08:28:45 in reply to 1 [link] [source]) who explained what %V and %w should do.

However, the ISO week numbers are from 1 to 53 (for the leap year) using the %V. And one could keep the %W for a semi-ISO week system with weeks from 0 to 52.

(18) By anonymous on 2024-01-20 06:34:47 in reply to 16.2 [link] [source]

You must include tests for leap years, such as the 3th of January 2021 and the 1st of January 2000.

(19) By Nuno Cruces (ncruces) on 2024-01-20 09:07:56 in reply to 18 [link] [source]

The test cases are from Wikipedia, and they cover 1980 which is a leap year, but yes I can always add more.

(20) By Nuno Cruces (ncruces) on 2024-01-20 09:11:41 in reply to 15.1 [link] [source]

One less patch for me to support. Great!

(21) By Nuno Cruces (ncruces) on 2024-01-20 10:39:07 in reply to 15.1 [link] [source]

I wish you could consider this patch too, or the (reduced functionality) version in this post: https://sqlite.org/forum/forumpost/e7a939e074

It prevents repeated "utc" and "localtime" modifiers from shifting the time repeatedly, recognizes "now" and "unixepoch" as UTC (so they work correctly with the modifiers) and, the more complex version, stores a useful timezone in p->tz when converting to "localtime" (which you could use to extend strftime to print the timezone).

I know it's external code, but as much as legally possible, I place it under the public domain. Feel free to adapt it.

(22) By luuk on 2024-01-20 16:34:54 in reply to 17 [link] [source]

docs should be changed too (when this is correct), they currently say: %W week of year: 00-53

https://www.sqlite.org/lang_datefunc.html#:~:text=with%20Sunday%3D%3D0-,%25W,week%20of%20year%3A%2000%2D53,-%25Y

(23) By Richard Hipp (drh) on 2024-01-20 16:41:04 in reply to 18 [link] [source]

The date4.test script checks the %V and %G results against libc for every one of the 24,858 days between 1970-01-01 and 2038-01-19, inclusive. They give the same result in all cases.

So, either SQLite is correct, or libc is wrong.

(24) By luuk on 2024-01-20 16:44:03 in reply to 19 [link] [source]

'%u' and '%V' do not seem to work

use this source on the latest version to test: https://dbfiddle.uk/sMjxkbg_

(DBFIDDLE is on 3.39, but I tested on 3.43.2 (Windows))

D:\TEMP>sqlite3
-- Loading resources from C:\Users\Luuk/.sqliterc
SQLite version 3.43.2 2023-10-10 12:14:04
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> .headers on
sqlite> with dates as (
(x1...>   select strftime('%Y-%m-%d','1979-01-01') as d
(x1...>   union all
(x1...>   select date(d,'+1 day') from dates where d<=strftime('%Y-%m-%d','2023-01-10'))
   ...> select
   ...>   d,
   ...>   strftime('%W',d) as W,
   ...>   strftime('%V',d) as V,
   ...>   strftime('%u',d) as "u_1=Monday",
   ...>   strftime('%w',d) as "w_0=Sunday"
   ...> from dates
   ...> WHERE strftime('%W',d) in ('00','53')
   ...> ;
d           W   V  u_1=Monday  w_0=Sunday
----------  --  -  ----------  ----------
1979-12-31  53                 1
1980-01-01  00                 2
1980-01-02  00                 3
1980-01-03  00                 4
1980-01-04  00                 5
1980-01-05  00                 6
1980-01-06  00                 0
1981-01-01  00                 4
1981-01-02  00                 5
1981-01-03  00                 6
1981-01-04  00                 0
1982-01-01  00                 5
1982-01-02  00                 6
1982-01-03  00                 0
1983-01-01  00                 6
....
2014-01-01  00                 3
2014-01-02  00                 4
2014-01-03  00                 5
2014-01-04  00                 6
2014-01-05  00                 0
2015-01-01  00                 4
2015-01-02  00                 5
2015-01-03  00                 6
2015-01-04  00                 0
2016-01-01  00                 5
2016-01-02  00                 6
2016-01-03  00                 0
2017-01-01  00                 0
2018-12-31  53                 1
2019-01-01  00                 2
2019-01-02  00                 3
2019-01-03  00                 4
2019-01-04  00                 5
2019-01-05  00                 6
2019-01-06  00                 0
2020-01-01  00                 3
2020-01-02  00                 4
2020-01-03  00                 5
2020-01-04  00                 6
2020-01-05  00                 0
2021-01-01  00                 5
2021-01-02  00                 6
2021-01-03  00                 0
2022-01-01  00                 6
2022-01-02  00                 0
2023-01-01  00                 0
sqlite>

(25) By Stephan Beal (stephan) on 2024-01-20 18:29:59 in reply to 24 [link] [source]

'%u' and '%V' do not seem to work ... (DBFIDDLE is on 3.39, but I tested on 3.43.2 (Windows))

%u and %V were only added two days ago. See src:e1155d6a.

(26) By Richard Hipp (drh) on 2024-01-20 18:32:54 in reply to 24 [link] [source]

Maybe try out your tests on https://sqlite.org/fiddle, which has all of the latest enhancements. Report any problems you find.

(27) By Harald Hanche-Olsen (hanche) on 2024-01-20 20:17:37 in reply to 17 [link] [source]

(Warning: This is completely off topic on this forum, but likely of interest to some. Stop reading now if you wish.)

The date November 24, 4714 BC will ring the creationists bell; it was the day the world was created. The reason for using this date way back has to do with the invention of the negative numbers. The reasoning for using this date (for which we have to thank Julius Scaliger) was that there is nothing before this date, and this avoided having to deal with negative numbers.

This account seems somewhat garbled to me. According to Wikipedia, and I think I have seen it elsewhere, the choice of that date is based on three cycles: The solar cycle: 28 years, as the Julian calendar repeats itself every 28 years; the metonic (or lunar) cycle: 19 years, as the dates of the phases of the moon repeat (approximately!) every 19 years; and, most cute of them all, the indiction cycle: 15 years, as taxes in ancient Rome was reassessed every 15 years (!). Noting that the numbers 28, 19, and 15 are pairwise mutually prime, their least common multiple is 28×19×15=7980, so all the cycles repeat every 7980 years (a Julan period).

It turns out the three cycles were all at their beginning together in the year 4713 BC, and so Julian day 0 was January 1, 4713 BC on the proleptic Julian calendar. Which becomes November 14, 47414 BC in the proleptic Gregorian calendar. (“Proleptic” here means you just extrapolate the respective calendars, leap years and all, into the past.)

Nowhere, as far as I can tell, was any notion of the creation of the world involved in this. I am well aware that some creationists have settled on a date roughly 6000 years in the past, however. Possibly, some of them might have locked onto Julian day 0, but I know of no evidence that creationist ideas inspired the choice of Julian day numbering.

But for sure, the desire to avoid negative numbers may well have inspired the choice of such an early date for the epoch. Even if the world existed before then, there were likely no recorded astronomical observations earlier thatn 4714 BC.

Oh, and if you can believe Wikipedia, it was Joseph Scalinger who came up with Julian days, not his father Julius. And he is supposed to have named it so not after his father, but from the Julian calendar. I have not dug deeper, but I too used to think it was named for Julius Scalinger.

(28) By Richard Hipp (drh) on 2024-01-20 22:29:16 in reply to 22 [link] [source]

That fix was made even before you posted. See https://sqlite.org/draft/lang_datefunc.html

(29) By Nuno Cruces (ncruces) on 2024-01-20 23:29:35 in reply to 23 [link] [source]

I'll retire my patch as soon as 3.46 is released. :)

(30) By anonymous on 2024-01-21 06:31:00 in reply to 27 [link] [source]

A bit of history, although it may seem off-topic, may shed some light on how time is measured and it is an interesting rabbit-hole.

The timescale offered by Jospehb Scaliger has nothing to do with the Gregorian calendar that was introduced on October the 15th of October 1582. The calendar replaced the old Julian (Roman) calendar. The document the “Inter Gravissimas pastoralis” (https://thelatinlibrary.com/gravissimas.html) stipulates:

Postremo, quoniam partim ob decem dies de mense octobris anni MDLXXXII (qui correctionis annus recte dici debet) exemptos, partim ob ternos etiam dies quolibet quadringentorum annorum spatio minime intercalandos, interrumpatur necesse est cyclus litterarum dominicalium XXVIII annorum ad hanc usque diem usitatus in Ecclesia Romana, volumus in eius locum substitui eumdem cyclum XXVIII annorum, ab eodem Lilio, tum ad dictam intercalandi bissexti in centesimis annis rationem, tum ad quamcumque anni solaris magnitudinem, accommodatum; ex quo littera dominicalis beneficio cycli solaris, aeque facile ac prius, ut in proprio canone explicatur, reperiri possit in perpetuum.

If you read the papal decree “Inter Gravissimas pastoralis” from Pope Gregorius you will discover that the reason for replacing the Julian calendar was:

  1. Determine the correct date for the spring-equinox (this had also a religious reason!)
  2. The correct date for the 14th moon directly after the spring-equinox (the religious reason is; determine the correct date for Easter)
  3. Easter will be celebrated on the first Sunday after the 14th moon.

So the 4th of October 1582 was followed by the 15th of October 1582. You will also discover the rule that we skip the leap year every century rollover, except if the century modulus 400 is zero. The leap year wasn’t skipped in 1600 and 2000. But was skipped in 1700, 1800, and 1900.

Mr. Scaliger, a Calvinist (protestant) had a strong dislike for papal meddling, and he was not the only one! England and all of her American colonies adopted the Gregorian calendar in 1750. That is the reason we celebrate Christmas with Saint Nicolas, for which we have to thank the Dutch (← this is a different rabbit-hole).

Yes, mr. Scaliger also used the solar cycli (← Latin origin) for his timescale, that was also had a strong religious motive to determine the date when God created the world and that is based on the bible. That is why this timescale rings the creationists bell. The 6,000 years that is something that rings another bell. It was mentioned during the Monkey trials and it was a correction on what Scaliger originally had proposed.

There is also a timescale that uses the birthday of Jesus as the point of origin. This timescale introduced the anno domini nostri Jesu Christi (in the year of our Lord Jesus Christ) which is abbreviated to A.D. and ante anno domini nostri Jesu Christi (in the year before of our Lord Jesus Christ) which is abbreviated to B.C.

And yes, Joseph Scaliger named the timescale Julius to honor his father.

So 4717 B.C. is a different timescale than Julianday 0. It was Julius Cæsar that reformed the old Roman calendar in a strict solar calendar in about 46 B.C. The month July refers to Julius Cæsar and August refers to Augustus and that is why those month have the same length. The new year originally started on the first of March. That is why February is a bit odd, which is also a fossil of the calendar reforms proposed by Numa Pompillus.

And to get back on topic: That is why the ISO calendar was introduced because it is almost free of religious influences.

It is time to get out of this rabbit-hole …

(31) By anonymous on 2024-01-21 06:36:17 in reply to 19 [link] [source]

There are exceptions in the calendar.

For example, if you want to check the algorithm to compute the date for Easter, you have to test for the exceptions. So, always test for the date of Easter 1943 and 1954.

Easter in 1943 was on the 25th of April and in 1954 was on the 18th of April. Two exceptions.

The year 2000 is also an exception. We didn't skip the leap year.

(32) By anonymous on 2024-01-21 08:34:10 in reply to 16.2 [link] [source]

There is a difference between % and / and the use of div and ldiv. The operators % and / depend on the implementation, and div or ldiv do not depend on the implementation.

Assuming the C89 standard, let's analyze this code:

ISO week date specifiers.

https://sqlite.org/forum/forumpost/73d99e4497e8e6a7

--- sqlite3.c.orig +++ sqlite3.c @@ -1373,6 +1373,29 @@ static void strftimeFunc( sqlite3_str_appendchar(&sRes, 1, c); break; }

  • case 'V': /* Fall thru */
  • case 'G': {
  • DateTime y = x;
  • computeJD(&y);
  • y.validYMD = 0;
  • /* Adjust date to Thursday this week:
  • The number in parentheses is 0 for Monday, 3 for Thursday */
  • y.iJD += (3 - (((y.iJD+43200000)/86400000) % 7))*86400000;
  • computeYMD(&y);
  • if( cf=='G' ){
  • sqlite3_str_appendf(&sRes,"%04d",y.Y);
  • }else{
  • int nDay; /* Number of days since 1st day of year */
  • i64 tJD = y.iJD;
  • y.validJD = 0;
  • y.M = 1;
  • y.D = 1;
  • computeJD(&y);
  • nDay = (int)((tJD-y.iJD+43200000)/86400000);
  • sqlite3_str_appendf(&sRes,"%02d",nDay/7+1);
  • }
  • break;
  • }

change nDay to a long which prevents a typecast later on. The statement nDay = (int)((tJD-y.iJD+43200000)/86400000) can be rewritten as: nDay = ldiv((tJD - y.iJD + 43200000), 86400000)

Since nDay is of type long (or int if you typecast the result) we have no use for the remainder, and therefore we use ldiv.

Let's analyze the part that determines if the Gregorian date 2021 January 5th is the new ISO year 2021 or still in week 53 of the ISO year 2020. Using fiddle:

.headers on .width 10 .mode column

-- compute ISO date for 2021 January 5th JD=2459219.5 and is stored in y(?) select (2459219.5 + 43200000)/86400000 as 'step 1', ((2459219.5 + 43200000)/86400000) % 7 as 'step 2', 3-((2459219.5 + 43200000)/86400000) % 7 as 'step 3', 86400000 * (3-((2459219.5 + 43200000)/86400000) % 7) as 'step 4' ;

Gives this result: 0.52846318 0.0 3.0 259200000.0

I fail to understand this, and it looks to me as a problem to determine the ISO year rollover.

(33) By Harald Hanche-Olsen (hanche) on 2024-01-21 10:12:09 in reply to 32 [link] [source]

I haven't studied your analysis in detail, but in response to this …

There is a difference between % and / and the use of div and ldiv. The operators % and / depend on the implementation, and div or ldiv do not depend on the implementation.

… I might remark that, as far as I know, the results of m%n and m/n are not implementation dependent if m is nonnegative and n is positive. And it has been my impression that the code in SQLite is taking some pains to ensure that. Am I wrong?

(34) By Harald Hanche-Olsen (hanche) on 2024-01-21 14:13:02 in reply to 25 [link] [source]

Now that we have ISO weeks in the output, it would be great also to have SQLite recognise a date string like '2021W52-6' (with or without an added timestamp) as an alias for '2022-01-01'.

Granted, ISO 8601 specifies too many ways to format a date, and I can understand the desire to support only one of them. But I think the notation using week and day numbers is sufficiently common to also merit inclusion.

(I think you meant to say that %G and %V where added recently. %u has been there for a while.)

(35) By anonymous on 2024-01-21 14:28:28 in reply to 33 [link] [source]

What is meant with depending on the implementation?

Suppose you want only the integer part of the division; for example: int(47 / 17). What is the expected result? Will the result be 3 (the implementation rounds it off) or is it 2 (implementation throws the remainder away).

I referred to the C89 standard because of the possibility that legacy code is still being used by some library. It will not be the first time I got such a nasty surprise.

If the code is compliant with the C99 standard, then there is no difference between the result obtained by / or div of %.

But using div is telling the compiler explicitly to use the machine code div which is a very fast integer division that throws the remainder away.

I think that you are correct, but I will experiment a bit to test our assumption with SQLite fiddle.

I am also working on a faster and more efficient algorithm to compute the ISO date. The faster the better is my motto.

(36) By Harald Hanche-Olsen (hanche) on 2024-01-21 15:59:28 in reply to 35 [link] [source]

I don't have access to either the C89 standard or the C99 standard, as I believe either costs a veritable fortune. What I do have is the second edition of the Kernighan and Ritchie book, with its Appendix A, which is claimed to be based on the submitted proposal for the C89 standard. It says:

The binary / operator yields the quotient, and the % operator the remainder, of the division of the first operand by the second; if the second operand is 0, the result is undefined. Otherwise, it is always true that (a/b)*b + a%b is equal to a. If both operands are non-negative, then the remainder is non-negative and smaller than the divisor; if not, it is guaranteed only that the absolute value of the remainder is smaller than the absolute value of the divisor.

(A7.6; my emphasis.) That (and nothing more) is what I meant by not depending on the implementation.

(37.1) By Nuno Cruces (ncruces) on 2024-01-22 10:57:58 edited from 37.0 in reply to 31 [link] [source]

Since you're unconvinced, the test now covers all days between the years 1500 and 2500, and compares against the Go standard library.

Instead of arguing that there are exceptions and implying my patch (or drh's) might be wrong, it'd be more helpful you could find a counter example where our patches actually fail? Then we'd have something to fix, which would be great.

PS: I ran the same tests (all days between the years 1500 and 2500, against the Go standard library) on drh's implementation, and it similarly passes the test (so his implementation agrees with both the C and the Go standard libraries). I hope that settles the matter.

(38) By anonymous on 2024-01-22 10:01:01 in reply to 29 [link] [source]

As promised, I wrote a simplified calculation to determine the number of ISO weeks in a year.

Here is the SQL-statement:

with dates as (
	select date(strftime('%Y-%m-%d','1892-02-29'), '-2 months', '+3 days') as d, 2351%7 as p0, 2351%7 as p_0, 366 as days_in_year, 1892 as y1
	union all
	select
		date(d,'+4 years'),
		(
		strftime('%Y', date(d, '+4 years'))
		+ cast( strftime('%Y', date(d, '+4 years')) / 4 as int)
		- cast( strftime('%Y', date(d, '+4 years')) / 100 as int)
		+ cast( strftime('%Y', date(d, '+4 years')) / 400 as int)
		)%7,
		(
		strftime('%Y', date(d, '+3 years'))
		+ cast( strftime('%Y', date(d, '+3 years')) / 4 as int)
		- cast( strftime('%Y', date(d, '+3 years')) / 100 as int)
		+ cast( strftime('%Y', date(d, '+3 years')) / 400 as int)
		)%7,
		cast( (julianday(date(d, '+1 year')) - julianday(d)) as int),
		y1+4
	from dates
	where cast(strftime('%Y',d) as int) between 1889 and 2026
)
select
	d,
	p0,
	p_0,
	days_in_year,
	(y1 + cast( y1/4 as int) - cast( y1/100 as int) + cast(y1/400 as int))%7 as p1,
	(
	case when (p0 == 4) or p_0 == 3
		then 53
		else 52
		end
		) as "weeks in. year",
	strftime('%Y-%m-%d',
	strftime('%V', d) as "SQLite ISO weeks in year",
	strftime('%W', d) as "SQLite weeks in year"
from dates
;

Here is the result produced by fiffle:
d           p0  p_0  days_in_year  p1  weeks in. year  Date used   SQLite ISO weeks in year  SQLite weeks in year
----------  --  ---  ------------  --  --------------  ----------  ------------------------  --------------------
1892-01-01  6   6    366           6   52              1892-01-01  53                        00                  
1896-01-01  4   2    366           4   53              1896-01-01  01                        00                  
1900-01-01  1   0    366           1   52              1900-01-01  01                        01                  
1904-01-01  6   4    365           6   52              1904-01-01  53                        00                  
1908-01-01  4   2    366           4   53              1908-01-01  01                        00                  
1912-01-01  2   0    366           2   52              1912-01-01  01                        01                  
1916-01-01  0   5    366           0   52              1916-01-01  52                        00                  
1920-01-01  5   3    366           5   53              1920-01-01  01                        00                  
1924-01-01  3   1    366           3   52              1924-01-01  01                        00                  
1928-01-01  1   6    366           1   52              1928-01-01  52                        00                  
1932-01-01  6   4    366           6   52              1932-01-01  53                        00                  
1936-01-01  4   2    366           4   53              1936-01-01  01                        00                  
1940-01-01  2   0    366           2   52              1940-01-01  01                        01                  
1944-01-01  0   5    366           0   52              1944-01-01  52                        00                  
1948-01-01  5   3    366           5   53              1948-01-01  01                        00                  
1952-01-01  3   1    366           3   52              1952-01-01  01                        00                  
1956-01-01  1   6    366           1   52              1956-01-01  52                        00                  
1960-01-01  6   4    366           6   52              1960-01-01  53                        00                  
1964-01-01  4   2    366           4   53              1964-01-01  01                        00                  
1968-01-01  2   0    366           2   52              1968-01-01  01                        01                  
1972-01-01  0   5    366           0   52              1972-01-01  52                        00                  
1976-01-01  5   3    366           5   53              1976-01-01  01                        00                  
1980-01-01  3   1    366           3   52              1980-01-01  01                        00                  
1984-01-01  1   6    366           1   52              1984-01-01  52                        00                  
1988-01-01  6   4    366           6   52              1988-01-01  53                        00                  
1992-01-01  4   2    366           4   53              1992-01-01  01                        00                  
1996-01-01  2   0    366           2   52              1996-01-01  01                        01                  
2000-01-01  0   5    366           0   52              2000-01-01  52                        00                  
2004-01-01  5   3    366           5   53              2004-01-01  01                        00                  
2008-01-01  3   1    366           3   52              2008-01-01  01                        00                  
2012-01-01  1   6    366           1   52              2012-01-01  52                        00                  
2016-01-01  6   4    366           6   52              2016-01-01  53                        00                  
2020-01-01  4   2    366           4   53              2020-01-01  01                        00                  
2024-01-01  2   0    366           2   52              2024-01-01  01                        01                  
2028-01-01  0   5    366           0   52              2028-01-01  52                        00                  
 

I've checked my algorithm and it seems to be correct.

(39) By anonymous on 2024-01-22 11:33:10 in reply to 37.1 [link] [source]

My post implied that the code written to implement the support for the ISO calendar might cause a problem. I was venting a concern that the remainder of the fractions could come in to play and might cause an erroneous result. That was the reason, I proposed an approach to avoid type casting.

Today I posted a small test (it was executed on fiddle) to compare the number of weeks in the ISO year based on the Gregorian year and this result was compared with SQLite's %V and %W.

I also posted something I stumbled upon with the leap year. Initially, what happened to 1904, but the issue extends to all leap years.

As for the test, that is a well-written test and I have no comment!

A good article on calendar algorithms:
Calendrical Calculations 
NACHUM DERSHOWITZ AND EDWARD M. REINGOLD 
Department of Computer Science, University of Illinois at Urbana- Champaign, 
1304 W. Springfield Avenue, Urbana, II, 618014987, U.S.A.

(40) By anonymous on 2024-01-22 11:35:21 in reply to 23 [link] [source]

I think libc is wrong or at least suspect.

(41.1) By Spindrift (spindrift) on 2024-01-27 07:21:45 edited from 41.0 in reply to 40 [link] [source]

I will be very interested to hear their response to this assertion. Please do feed back here.

(42) By Daniel Dumitriu (danield) on 2024-01-24 09:22:25 in reply to 36 [link] [source]

The relevant section in freely accessible resources: an as close as possible to a C89 draft and an officially sanctioned (HTML version of a) C99 draft.

Generally, this SO question is a good resource.

(43) By Nuno Cruces (ncruces) on 2024-01-27 00:18:56 in reply to 38 [link] [source]

I can't really understand what you want to show with this.

I'll just note that the column you've labeled "SQLite ISO weeks in year" is not the number of "ISO weeks" (sic) in (e.g.) the year 1904.

Rather, 53 is the "ISO week number" for the date 1904-01-01. You get 53 becase the "ISO week date" for 1904-01-01 is 1903-W53-5: it's the Friday (5) from the 53rd week, belonging to the 1903 "week-based year".

(44) By Harald Hanche-Olsen (hanche) on 2024-01-27 07:52:54 in reply to 43 [link] [source]

I can't really understand what you want to show with this.

I don°t understand it either, but it°s really very easy.

52 weeks is 364 days. A week is assigned to a given year if and only if at least four days of that week fall within the year. Do the math: For a year to have 53 weeks assigned to it, weeks 1 and 53 need to have at least four days each within the year, totalling eight days. Add to this the remaining 51 weeks, and you have 365 days.

This means that a non-leap year has 53 weeks assigned to it if, and only if, January 1 of that year is a Thursday. In leap years, you have an extra day of slack, so the requirement is for January 1 to be either a Wednesday or a Thursday.

The SQL expression strftime('%u',year||'-01-01') will get the (numerical) weekday of January 1 of a given year.

Simple, no?