SQLite Forum

Is typeof() and hex() unique ?
Login

Is typeof() and hex() unique ?

(1.1) By Simon Slavin (slavin) on 2020-12-03 13:51:27 edited from 1.0 [link] [source]

I just stumbled across hex() which I hadn't considered before.

I have a need to know very precisely what's in a weird database someone else made up. They weren't a good programmer or perhaps they didn't know SQLite. Also I think they expected SQLite to enforce data types in column definitions, which it didn't, so there's some inconsistency. So I'm writing my own program to check all values in their data. Here's the question:

Is the combination of typeof() and hex() of a value unique ? In other words, if I dump both those things do I know exactly what was in the value, no matter what type it is ? Assuming I don't have any long strings or BLOBs in the database, can anyone think of any circumstance where this wouldn't work ?

(2) By Larry Brasfield (LarryBrasfield) on 2020-12-03 14:08:01 in reply to 1.1 [link] [source]

The hex() function converts whatever byte sequence represents the value into hexadecimal, so it must vary when the value varies. The typeof() function produces a text form of the value's type as stored in the DB. Hence, of course the combination is unique for any given value and type.

If there was a circumstance where that was not true, it could only happen due to a bug in SQLite.

(3) By Richard Hipp (drh) on 2020-12-03 14:14:55 in reply to 2 [link] [source]

Counterexample:

CREATE TABLE t1(x);
INSERT INTO t1 VALUES(3.1999999999999999),(3.2000000000000001);
SELECT typeof(x), hex(x) FROM t1; -- shows the same
SELECT ieee754(x) FROM t1;        -- two values are subtly different

(5) By Larry Brasfield (LarryBrasfield) on 2020-12-03 14:44:59 in reply to 3 [link] [source]

On v3.34.0, that last query returns ieee754(3602879701896397,-50) ieee754(3602879701896397,-50) , which are so subtly different that I can see no difference whatsoever.

If what you say is true, then what is the meaning of the claim made for hex() that it "interprets its argument as a BLOB and returns a string which is the upper-case hexadecimal rendering of the content of that blob"? Are you saying that there can be content which somehow escapes being rendered into hexadecimal by the hex() function? And, if that is true, I would claim that the quoted documentation is incorrect.

(6) By Richard Hipp (drh) on 2020-12-03 14:51:20 in reply to 5 [link] [source]

I am guessing that you are running Windows...

Here is an alternative script that shows the same problem and which should (I believe) work on Windows.

CREATE TABLE t1(x);
INSERT INTO t1 VALUES
  (ieee754(7205759403792793,-51)),
  (ieee754(3602879701896397,-50));
SELECT typeof(x), hex(x) FROM t1;  -- This make both rows appear the same
SELECT ieee754(x) FROM t1;         -- But this query shows they are different

(7) By Richard Hipp (drh) on 2020-12-03 14:58:13 in reply to 5 [link] [source]

The problem is that the two rows of table t1 contain similar, but slightly different values for x. The first row contains a value of 3.199999999999999733546474089962430298328399658203125 and the second row contains 3.20000000000000017763568394002504646778106689453125. When converted from IEEE-754 double format into text, both values render as just "3.2". The "hex()" function works by first converting the binary into text, then converting the text into hexadecimal. Thus, the first step in hex() where the floating point number is converted into text losses the least significant bits of the number, obscuring the fact that they two numbers are different.

(8) By Larry Brasfield (LarryBrasfield) on 2020-12-03 15:08:53 in reply to 7 [link] [source]

To me, as someone who read the hex() documentation, that intermediate text conversion comes as a real surprise. Would you not agree that this detail is worthy of some explication in that function's description?

Is there any simple way to achieve Simon's objective?

(9.1) By Keith Medcalf (kmedcalf) on 2020-12-03 15:31:13 edited from 9.0 in reply to 8 [link] [source]

The hex function documentation says:

The hex() function interprets its argument as a BLOB and returns a string which is the upper-case hexadecimal rendering of the content of that blob.

That is exactly what the hex() function does.

SQLite version 3.35.0 2020-12-03 14:42:09
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> create table x(x);
sqlite> insert into x values (6.4);
sqlite> select x, typeof(x), hex(x), cast(x as blob), typeof(cast(x as blob)), hex(cast(x as blob)) from x;
┌─────┬───────────┬────────┬─────────────────┬─────────────────────────┬──────────────────────┐
│  x  │ typeof(x) │ hex(x) │ cast(x as blob) │ typeof(cast(x as blob)) │ hex(cast(x as blob)) │
├─────┼───────────┼────────┼─────────────────┼─────────────────────────┼──────────────────────┤
│ 6.4 │ real      │ 362E34 │ 6.4             │ blob                    │ 362E34               │
└─────┴───────────┴────────┴─────────────────┴─────────────────────────┴──────────────────────┘

https://sqlite.org/lang_expr.html#castexpr

discusses exactly what it means to cast something as a blob (it is the first entry in the table). Note that "blob" and "none" are mostly the same, except that "blob" is a "datatype" (storage class) and "none" is an affinity except that "none" is not really a recognized storage class/datatype and the string "blob" is used where this usage is meant.

(10) By Richard Hipp (drh) on 2020-12-03 15:20:54 in reply to 8 [source]

I think Simon wants:

SELECT quote(x) FROM t1;

If you run that on my example, it does indeed show that the two entries are different:

sqlite> SELECT quote(x) FROM t1;
3.19999999999999973354e+00
3.2

(11) By Mark Lawrence (mark) on 2020-12-03 15:23:04 in reply to 7 [link] [source]

So why does CASTing into BLOBs first not work?

CREATE TABLE t1(x,y);
INSERT INTO t1 VALUES (
    ieee754(7205759403792793,-51),
    ieee754(3602879701896397,-50)
);
SELECT ieee754(x) = ieee754(y) AS ieee754_compare FROM t1;

ieee754_compare
---------------
0

SELECT CAST(x AS BLOB) = CAST(y AS BLOB) AS blob_compare FROM t1;

blob_compare
------------
1

(12.1) By Mark Lawrence (mark) on 2020-12-03 15:31:59 edited from 12.0 in reply to 11 [link] [source]

Deleted

(13) By Keith Medcalf (kmedcalf) on 2020-12-03 15:41:35 in reply to 11 [link] [source]

Because when someone converts a floating point value into a printable text string they usually do not care about accuracy (otherwise they would not do it). For this reason the printed value is dorked with so as not to offend the eyes of the huge number of persons who do not understand how computers work -- because those people tend to be more vocal in their misunderstanding, their wish for dorkification usually wins out over those who prefer precision and exactitude.

There is a tradeoff to be made between those who want exactitude and want to see every floating point number to 26 digits or more of precision and those others who want to see the value expressed as "or thereabouts more or less with a boulder of salt".

Those who wish precision are free to not convert precise exactitude IEEE-754 floating point number approximations to "representation for the masses" and use them directly as they see fit.

(15) By Mark Lawrence (mark) on 2020-12-03 16:00:03 in reply to 13 [link] [source]

I didn't know that CAST x AS BLOB had anything to do with strings, which apparently it does as detailed by Richard below, and as also indicated all over the documentation (which I hadn't actually read/remembered).

I find that rather strange behaviour.

(16) By Keith Medcalf (kmedcalf) on 2020-12-03 16:20:32 in reply to 15 [link] [source]

Not at all. It is rather expected behaviour and is universally understood and makes perfect sense.

What would you propose as an alternate?

(14) By Richard Hipp (drh) on 2020-12-03 15:48:37 in reply to 11 [link] [source]

Casting a numeric value (an integer or floating point number) into a BLOB means that the value is rendered as UTF8 text, and then that text is interpreted as a BLOB. Casting a numeric value into a BLOB does not show you the internal binary representation of the numeric value.

Your value for x is:

7205759403792793*pow(2,-51) → 3.199999999999999733546474089962430298328399658203125

And your value for y is:

7205759403792794*pow(2,-51) → 3.20000000000000017763568394002504646778106689453125

Those are different numbers. (Note the one-epsilon difference in the mantissas.) So

SELECT x=y FROM t1;

Will return false. But when these numbers are converted into text, the value is rounded and both values come out as just "3.2".

SELECT CAST(x AS text)=CAST(y AS text) FROM t1;

Should give an answer of true.

(4) By Keith Medcalf (kmedcalf) on 2020-12-03 14:23:32 in reply to 1.1 [link] [source]

The HEX function works on TEXT or BLOB values and returns the hex encoding of the bucket-o-bytes. The contents of the bucket-o-bytes for a text field depends on the "underlying database encoding". Values which are NOT NULL, TEXT, or BLOB are "printed" (converted to ASCII TEXT) before being hexified:

SQLite version 3.35.0 2020-12-02 03:04:06
Enter ".help" for usage hints.
sqlite> pragma encoding = 'UTF-16le';
sqlite> pragma encoding;
UTF-16le
sqlite> create table test(x);
sqlite> insert into test values (null), ('Text'), (x'012345'), (15), (3.14159);
sqlite> select x, typeof(x), hex(x) from test;
|null|
Text|text|5400650078007400
?|blob|012345
15|integer|3135
3.14159|real|332E3134313539

The default encoding is UTF-8:

SQLite version 3.35.0 2020-12-02 03:04:06
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> create table test(x);
sqlite> insert into test values (null), ('Text'), (x'012345'), (15), (3.14159);
sqlite> select x, typeof(x), hex(x) from test;
┌─────────┬───────────┬────────────────┐
│    x    │ typeof(x) │     hex(x)     │
├─────────┼───────────┼────────────────┤
│         │ null      │                │
│ Text    │ text      │ 54657874       │
│ ☺#E     │ blob      │ 012345         │
│ 15      │ integer   │ 3135           │
│ 3.14159 │ real      │ 332E3134313539 │
└─────────┴───────────┴────────────────┘