Issues with sqlite3IsNaN() and HAVE_ISNAN

(1) By anonymous on 2021-09-05 20:40:13 [link] [source]

Hello,

I'm running SQLite under the somewhat esoteric RISC OS and I'm having an issue where sqlite3IsNaN() sometimes returns the wrong result. I believe that the issue is that double and u64 (unsigned long long int) have different word orders on this platform, and therefore the call to memcpy() gives us a u64 with the words the wrong way around.

The easiest solution is to use the isnan() function provided by the system. I've changed sqlite3IsNaN() accordingly and have confirmed that it works, but I'd prefer not to make changes to core files if it all possible.

According to Compile-time Options, HAVE_ISNAN will make SQLite use the system isnan() function. However, this doesn't work. From what I can tell by looking at the 3.36.0 amalgamation, HAVE_ISNAN doesn't do anything, and isnan() doesn't seem to be referenced anywhere.

Have I missed something? Is there a way that I can get SQLite to use the system isnan() function without modifying the amalgamation?

Aside from that, everything seems to be working correctly. Thank you for SQLite! :)

Chris

(2) By ddevienne on 2021-09-06 07:02:05 in reply to 1 [link] [source]

Not sure, but it's possible it is one of those compile-time options that works only on the canonical sources, not the amalgamation.

There are a few that way, that need to be defined at configure time, not compile time.
Although that's a total guess on my part, from vague recollection of past messages on this ML.

See also how to compile which has details, and lists pre-amalgamation options,
but that list being 13 years old; that's perhaps the reason HAS_INAN is not in there?

Could also be a small bug of course. We'll know soon for sure I suspect.

(3) By kai zhu (kaizhu256) on 2021-09-06 11:35:17 in reply to 2 [link] [source]

a quick search for HAVE_ISNAN on sqlite's github-mirror shows 3 occurences in source-code -- that don't do anything except define/undef itself.

https://github.com/sqlite/sqlite/search?q=have_isnan

(4) By Richard Hipp (drh) on 2021-09-06 11:48:10 in reply to 1 [source]

double and u64 (unsigned long long int) have different word orders on this platform

If that is true, then probably the database files generated by SQLite do not follow the spec. The sqlite3IsNan() might be the least of your worries.

Please try this experiment:

Generate a database that contains floating point values (with fractional parts - example: 4.5 not 4.0) on RISC OS.
Move the database file to another platform (Windows, Linux, or Mac).
Check to see if the floating point values are preserved.

Let me know the result of the experiment.

Meanwhile, the use of system isnan() has been restored by check-in b3cfe23bec0b95ca. But remember: Don't just blindly take this patch and decide "it works!" because you may well have deeper and subtler problems.

(5) By J.M. Aranda (JMAranda) on 2021-09-06 12:00:22 in reply to 4 [link] [source]

Admired Doctor, the Real number 4.0 has a fractional part with value 0. The integer number 4 does not have a fractional part. Excuse the pedantic comment, but one has seen a lot of mathematics.

(6) By Ryan Smith (cuz) on 2021-09-06 12:15:42 in reply to 5 [link] [source]

One may have seen a lot of mathematics, but one has seen a very limited amount of SQLite documentation apparently. One can assert that because the documentation shows that SQLite will store 4.0 as 4 (Integer)[1] which does not help the demonstration of possible floating point byte-order problem in the stored data.

[1] From SQLite Data Types: "As an internal optimization, small floating point values with no fractional component and stored in columns with REAL affinity are written to disk as integers in order to take up less space and are automatically converted back into floating point as the value is read out. This optimization is completely invisible at the SQL level and can only be detected by examining the raw bits of the database file".

(7) By anonymous on 2021-09-06 20:25:15 in reply to 4 [link] [source]

Thank you. You're correct, and the problem is so obvious in hindsight.

It appears that if I keep the database file "RISC OS-exclusive" then it returns the correct data, but naturally that's not ideal. In any case I need to run some more tests to see whether there are still some subtle bugs lurking elsewhere (I actually found this current one while running SQL Logic Test).

Apparently I'm not the only person to make this mistake. There's an old port of 3.3.6 that I've just tested, and unfortunately it exhibits the same issue.

I can at least confirm that the new check-in fixes the isnan() problem. Thank you for the fast turnaround.

Chris

(8) By doug (doug9forester) on 2021-09-07 00:14:34 in reply to 7 [link] [source]

Just for interest, let's say I have a database on RISC OS where the bytes are swapped for the U64 (or the other one) fields. How would I go about creating a "correct" database file that works on systems other than the RISC?

(9) By Larry Brasfield (larrybr) on 2021-09-07 00:23:10 in reply to 8 [link] [source]

Use the CLI shell to .dump your DB while running on the RISC platform. Using the .dump output, you can recreate DB schema and reload the DB tables using a version of the shell that honors the portable DB format.

(10) By anonymous on 2021-09-07 23:42:51 in reply to 7 [link] [source]

Just a quick update: With this isnan() change implemented, all of SQL Logic Test now passes :)

Now to figure out how hard it is to get the on-disc data into the right format... but I don't expect any sort of official support for that!

Chris

(11) By doug (doug9forester) on 2021-09-08 06:18:37 in reply to 10 [link] [source]

Chris, see Larry's suggestion above.

(12) By anonymous on 2021-09-08 07:56:31 in reply to 11 [link] [source]

If I understand correctly, Larry's suggestion will export the data to text so I can then import it on a, say, Unix machine and get a 'proper' database file. It's a suitable workaround in a pinch.

However, if practical (and that's a big if) I'd like to fix the RISC OS version so that its binary files conform to the spec in the first place. That's what I'm wondering about the practicality of.

(13) By Richard Hipp (drh) on 2021-09-08 10:49:08 in reply to 7 [link] [source]

Please run this test for me on your RISC OS system:

make sqlite3 dbtotxt
rm -f byteorder.db
./sqlite3 byteorder.db 'CREATE TABLE t1 AS SELECT 1579060583422312000000.0 x;'
./dbtotxt byteorder.db

Post the output of "dbtotxt" command on this chat thread. I'm expecting to see this output:

| size 8192 pagesize 4096 filename byteorder.db
| page 1 offset 0
|      0: 53 51 4c 69 74 65 20 66 6f 72 6d 61 74 20 33 00   SQLite format 3.
|     16: 10 00 01 01 00 40 20 20 00 00 00 01 00 00 00 02   .....@  ........
|     32: 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 04   ................
|     48: 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00   ................
|     80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01   ................
|     96: 00 2e 57 48 0d 00 00 00 01 0f dc 00 0f dc 00 00   ..WH............
|   4048: 00 00 00 00 00 00 00 00 00 00 00 00 22 01 06 17   ................
|   4064: 11 11 01 31 74 61 62 6c 65 74 31 74 31 02 43 52   ...1tablet1t1.CR
|   4080: 45 41 54 45 20 54 41 42 4c 45 20 74 31 28 78 29   EATE TABLE t1(x)
| page 2 offset 4096
|      0: 0d 00 00 00 01 0f f4 00 0f f4 00 00 00 00 00 00   ................
|   4080: 00 00 00 00 0a 01 02 07 44 55 66 77 88 99 aa ff   ........DUfw....

I'm interested in the last 8 bytes of output, which should be "44 55 66 77 88 99 aa ff". What are those 8 bytes on your RISC OS system?

(14) By anonymous on 2021-09-09 05:12:25 in reply to 13 [link] [source]

The last two words are swapped, but the result is otherwise identical (apart from the version bytes at 96, since I'm on 3.36.0).

| size 8192 pagesize 4096 filename byteorder
| page 1 offset 0
|      0: 53 51 4c 69 74 65 20 66 6f 72 6d 61 74 20 33 00   SQLite format 3.
|     16: 10 00 01 01 00 40 20 20 00 00 00 01 00 00 00 02   .....@  ........
|     32: 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 04   ................
|     48: 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00   ................
|     80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01   ................
|     96: 00 2e 53 60 0d 00 00 00 01 0f dc 00 0f dc 00 00   ..S`............
|   4048: 00 00 00 00 00 00 00 00 00 00 00 00 22 01 06 17   ................
|   4064: 11 11 01 31 74 61 62 6c 65 74 31 74 31 02 43 52   ...1tablet1t1.CR
|   4080: 45 41 54 45 20 54 41 42 4c 45 20 74 31 28 78 29   EATE TABLE t1(x)
| page 2 offset 4096
|      0: 0d 00 00 00 01 0f f4 00 0f f4 00 00 00 00 00 00   ................
|   4080: 00 00 00 00 0a 01 02 07 88 99 aa ff 44 55 66 77   ............DUfw
| end byteorder

Chris

(15) By Richard Hipp (drh) on 2021-09-09 10:07:22 in reply to 14 [link] [source]

Thanks.

Please recompile SQLite using -DSQLITE_MIXED_ENDIAN_64BIT_FLOAT and rerun the experiment for me.

(16) By anonymous on 2021-09-09 19:07:20 in reply to 15 [link] [source]

The resulting database file and dbtotxt output are identical.

(17) By Richard Hipp (drh) on 2021-09-09 19:45:03 in reply to 16 [link] [source]

Double-check your results please. You should have gotten a correct database as in post (13), with the last four bytes in ascending order of value.

Another thing to try is to compile with -DSQLITE_DEBUG. With that option enabled, you should get an assertion fault if you try to read or write a floating point value using the wrong byte order.

Based on the experience of this thread, the next release might raise an error if you try to run sqlite3_open() using an SQLite that has been built with the wrong SQLITE_MIXED_ENDIAN_64BIT_FLOAT setting, regardless of the SQLITE_DEBUG setting.

(18) By anonymous on 2021-09-10 04:25:37 in reply to 17 [link] [source]

Sorry; apparently I accidentally ran the wrong executable. The output of dbtotxt does indeed match your post (13) with -DSQLITE_MIXED_ENDIAN_64BIT_FLOAT specified.

Thank you again for your assistance!

Chris