.mode quote surprise

(1) By anonymous on 2023-02-23 11:42:18 [link] [source]

sqlite> .version
SQLite 3.41.0 2023-02-21 18:09:37 05941c2a04037fc3ed2ffae11f5d2260706f89431f463518740f72ada350866d
zlib version 1.2.11
clang-12.0.0
sqlite> .mode quote
sqlite> .headers off
sqlite> select 13.0;
12.999999999999999999
sqlite>

This seems to affect .mode quote only. .mode table --quote and its relatives produce unsurprising output.

(2) By L Carl (lcarlp) on 2023-02-23 16:04:58 in reply to 1 [link] [source]

.mode insert has the same anomaly.

(3) By anonymous on 2023-02-23 16:05:07 in reply to 1 [link] [source]

Quote mode is a kind of a serialization format, so the data could be prep'ed in the textual form. Thus floats are output in the "maximum" precision.

"13.0" is indeed stored in IEEE 754 floating point format as equivalent to decimal "12.999999999999999999", while, for example "11.0" is stored as "11.000000000000000000", thus:

sqlite> .mode quote
sqlite> select 11.0;
11.0

Quoting from help:

.mode MODE ?TABLE?     Set output mode where MODE is one of:
...
                         quote    Escape answers as for SQL

(4) By anonymous on 2023-02-23 16:18:40 in reply to 3 [link] [source]

You can't blame the IEEE 754 64-bit binary format for this. Both 13.0 and 11.0 have exact representations in it.

(5) By jchd (jchd18) on 2023-02-23 17:07:24 in reply to 3 [link] [source]

"13.0" is indeed stored in IEEE 754 floating point format as equivalent to decimal "12.999999999999999999"

Wrong!

Input value = 13
FP (hex) = 0x402A000000000000
Sign = +
Exponent = 3
Scaling = 2^3 = 8
Mantissa = 1/2 + 1/8

Nearby exact double values (exact computation from FP bits)
PrePrevious = +12.999999999999996447286321199499070644378662109375
Previous = +12.9999999999999982236431605997495353221893310546875
Value = +13
Next = +13.0000000000000017763568394002504646778106689453125
NextNext = +13.000000000000003552713678800500929355621337890625

The ULPs (Unit of Least Precision) around 13 are -1.77635683940025e-15 and +1.77635683940025e-15

(6) By anonymous on 2023-02-23 18:32:46 in reply to 5 [link] [source]

Thanks for the correction. 13 should be presentable exactly in FP double.

To refocus onto the original question, there must be some manipulation involved in .mode quote which in this case somehow drops the resulting value to the previous float.

(7) By anonymous on 2023-02-23 19:40:28 in reply to 3 [link] [source]

As a followup, the issue here apparently is not tied to .mode quote, it may have to do with how the input values are being converted into floats.

Testing this

sqlite> .mode
current output mode: list

sqlite> create table d(v double);
sqlite> insert into d(v) values(11.0),(13.0),(18.0);

sqlite> select printf("%.20f", v) from d;
11.00000000000000000000
12.99999999999999000000
17.99999999999999000000

(8) By Keith Medcalf (kmedcalf) on 2023-02-23 20:01:02 in reply to 6 [link] [source]

That would mean that the output algorithms used by SQLite3 are incorrect:

sqlite> select format('%!.26f', 13.0);
┌───────────────────────────────┐
│    format('%!.26f', 13.0)     │
├───────────────────────────────┤
│ '12.999999999999999999566319' │
└───────────────────────────────┘

(9) By Keith Medcalf (kmedcalf) on 2023-02-23 20:04:26 in reply to 7 [link] [source]

It is the output (float -> text) conversion that is propagating errors somewhere.

(10.1) By Keith Medcalf (kmedcalf) on 2023-02-24 01:33:49 edited from 10.0 in reply to 9 [source]

Found the bug. In order for the divide by 10 / multiply by 10 cancellation to work properly, the following change needs to be made.

Index: src/printf.c
==================================================================
--- src/printf.c
+++ src/printf.c
@@ -135,11 +135,11 @@
   if( (*cnt)<=0 ) return '0';
   (*cnt)--;
   digit = (int)*val;
   d = digit;
   digit += '0';
-  *val = (*val - d)*10.0;
+  *val = *val * 10.0 - d * 10.0;
   return (char)digit;
 }
 #endif /* SQLITE_OMIT_FLOATING_POINT */

 /*

Skip it -- this only fixes cancellations up to 100.

(11) By anonymous on 2023-02-24 02:43:08 in reply to 1 [link] [source]

What bugs me is the number of different strategies used for essentially the same task.

sqlite3_value_text (which .mode table --quote as well as cast(value as text) at the SQL level boil down to):
format with "%!.15g"
.mode quote:
format with "%!.20g"
.mode insert:
if the value can be losslessly cast to sqlite3_int64, format with "%lld.0"
otherwise, format with "%!.20g"
the quote SQL function:
if the result of formatting with "%!.15g" scans back to the same value, use that
otherwise, format with "%!.20e"

(12) By Larry Brasfield (larrybr) on 2023-02-24 03:23:11 in reply to 11 [link] [source]

What bugs me ...

It does not bug me to the slightest degree that some of the SQLite library functionality is exposed by the CLI. (That's largely what the CLI is for.)

The different output modes have different purposes, as ".help mode" will show. So I am also not bugged by some overlap among output modes.

Do you suggest that any of the output modes are useless? If so, which ones? If not, can you find solace in their utility to those who use them?

(13) By anonymous on 2023-02-24 04:23:18 in reply to 12 [link] [source]

Do you suggest that any of the output modes are useless?

I really don't see how what I wrote could be interpreted that way. I was talking about lack of consistency: there are multiple transformations available that sometimes produce different results, and they are all called "quoting".

(14) By anonymous on 2023-02-24 04:39:45 in reply to 1 [link] [source]

In defense of the devs, it needs to be noted that floating point base conversion is an annoyingly hard problem. If you need the best possible result, the worst cases require arbitrary-precision arithmetic.

(15) By Donal Fellows (dkfellows) on 2023-02-24 09:46:34 in reply to 14 [link] [source]

But there are multiple languages that get this right already. It's just that the SQLite shell isn't built using one of them (it depends on just how good the implementation of the C library is, and those are... variable).

(16) By Richard Hipp (drh) on 2023-02-24 11:56:46 in reply to 10.1 [link] [source]

A very clever and elegant solution. However, this patch sometimes gives an imprecise answer on machines that lack 80-bit floating point number support (ARM, and Windows on x86/x64). Example:

SELECT select round(80917.65,1);

The expected result is 80917.7, but on ARM or Windows, this patch yields 80917.6.

There is an alternative solution now checked into trunk.

(17) By Keith Medcalf (kmedcalf) on 2023-02-24 14:52:29 in reply to 16 [link] [source]

Very nice and passes the "trippable" test!

with testvalues(value) as materialized
     (
         select randomv() * pow(10, 308 - randomv(616)) as value
           from generate_series(0,1000000)
      union all
         select trunc(randomv() * pow(10, 308 - randomv(616))) as value
           from generate_series(0,1000000)
     )
select format('%!.26g', value) as ExactValue,
       format('%!.26g', newvalue) as ExactNewValue
  from (
        select value,
               cast(format('%!.26g', value) as real) as newvalue
          from testvalues
       )
 where value != newvalue
;

where RandomV([buckets]) is defined thusly to return a either a random real or the bucket number in which the random real would fall:

SQLITE_PRIVATE void _randomValue(sqlite3_context *context, int argc, sqlite3_value **argv)
{
    sqlite_uint64 r;
    register long double value;

    if ((argc > 1) || ((argc == 1) && (sqlite3_value_type(argv[0]) != SQLITE_INTEGER)))
    {
        return;
    }
    sqlite3_randomness(sizeof(r), &r);
    value = (long double)r / 18446744073709551615.0L;
    if (argc == 1)
    {
        sqlite3_result_int64(context, floor(sqlite3_value_int64(argv[0]) * value));
    }
    else
    {
        sqlite3_result_double(context, value);
    }
}

(18) By Keith Medcalf (kmedcalf) on 2023-02-24 15:21:58 in reply to 17 [link] [source]

Note that when long double is not available (such as MSVC on x64 Windows -- it is just fine if you use a proper compiler that is not Microsoft) pathological cases are only "trippable" to 10 significant digits. That is, the following is the minimum case that produces no output rows:

with testvalues(value) as materialized
     (
         select randomv() * pow(10, 308 - randomv(616)) as value
           from generate_series(0,1000000)
      union all
         select trunc(randomv() * pow(10, 308 - randomv(616))) as value
           from generate_series(0,1000000)
     )
select format('%!.26g', value) as ExactValue,
       format('%!.26g', newvalue) as ExactNewValue
  from (
        select value,
               cast(format('%!.26g', value) as real) as newvalue
          from testvalues
       )
 where sigdigits(value,10) != sigdigits(newvalue,10)
;

where sigdigits performs half-even rounding to the specified number of significant digits, thusly:

SQLITE_PRIVATE void _rounddigitsFunc(sqlite3_context *context, int argc, sqlite3_value **argv)
{
    static int maxdigits = 15;
    register long double x;
    register int digits = maxdigits;

    if ((argc==0) || (argc>2) || (sqlite3_value_type(argv[0]) == SQLITE_NULL))
    {
        return;
    }

    x = sqlite3_value_double(argv[0]);
    if (x == 0.0)
    {
        sqlite3_result_double(context, x);
        return;
    }
    if (argc == 2)
    {
        digits = sqlite3_value_int(argv[1]);
    }
    digits = (digits > 0 ? (digits <= maxdigits ? digits : maxdigits) : 1) - ceil(log10(fabs(x)));
    sqlite3_result_double(context, x - remainderl(x, powl(10.0, -digits)));
    return;
}

Compilers supporting long double as 80-bits (or longer) do not require significant digit half-even rounding and are accurately trippable to the full 16.95 digits of precision available to an IEEE-754 binary64 floating point number.

(19) By Keith Medcalf (kmedcalf) on 2023-02-24 17:28:41 in reply to 18 [link] [source]

The lack of long double support in MSVC x64 compilers also affects the input (text->real) routines. If you compensate for errors introduced there by applying "computed exactly rounded half-even to the number of significant digits" and apply the same "computed exactly and rounded half-even to the number of significant digits" to the source values, then doubles up to 14 significant digits are properly trippable.

create virtual table temp.trippable using statement((
with tests(tests) as
     (
         values (cast(coalesce(:tests, 1000000) as integer))
     ),
     digits(digits) as
     (
         values (cast(coalesce(:digits, 18) as integer))
     ),
     randomreal(value) as materialized
     (
         select sigdigits(
                          randomv() * pow(10, 308 - randomv(616)),
                          (select digits from digits)
                         ) as value
           from generate_series(
                                1,
                                (select tests from tests)
                               )
      union all
         select sigdigits(
                          trunc(randomv() * pow(10, 308 - randomv(616))),
                          (select digits from digits)
                         ) as value
           from generate_series(
                                1,
                                (select tests from tests)
                               )
     )
select format('%!.26g', value) as ExactValue,
       format('%!.26g', newvalue) as ExactNewValue,
       ulps(value, newvalue) as Distance
  from (
        select value as value,
               sigdigits(
                         cast(format('%!.26g', value) as real),
                         (
                          select digits
                            from digits
                         )
                        ) as newvalue
          from randomreal
       )
 where value != newvalue
 limit 10
));

Compilers producing code that has "long double" being at least 80-bit extended precision, and that carries out all intermediate computations in extended precision, can trip the full precision of an IEEE-754 double.