SQLite Forum

GCC bug?

GCC bug?

(1.1) By Richard Hipp (drh) on 2023-07-05 18:24:26 edited from 1.0 [source]

I'm getting inconsistent results for some floating-point calculations in GCC. Is this a GCC bug, or am I doing something wrong?

A repro case is shown at the bottom of this post. Compile on Linux x86 thusly:

gcc bug.c -m32 -O1 && ./a.out

The computed r[1] value is incorrect. The problem vanishes and the correct answer is computed under any one or more of the following conditions:

  • Omit the -m32 flag. (In other words, generate a 64-bit x64 binary).
  • Turn off optimization. Omit the -O option or make it -O0.
  • Use the -ffloat-store option.
  • Uncomment the two "printf()" statements that are commented out.

I'm seeing this issue in gcc-9.4.0 (Ubuntu 20.04.5 LTS) and in gcc-5.4.0 (Ubuntu 16.04.7 LTS).

This issue causes test case failures in SQLite. Does anybody have a suggested work-around?

#include <stdio.h>
#include <string.h>
void dekkerMul2(double *x, double y, double yy){
  double hx, tx, hy, ty, p, q, c, cc;
  unsigned long long int m;
  memcpy(&m, &x[0], 8);
  m &= 0xfffffffffc000000L;
  memcpy(&hx, &m, 8);              printf("hx   = %+.30e\n", hx);
  tx = x[0] - hx;                  printf("tx   = %+.30e\n", tx);
  memcpy(&m, &y, 8);
  m &= 0xfffffffffc000000L;
  memcpy(&hy, &m, 8);              printf("hy   = %+.30e\n", hy);
  ty = y - hy;                     printf("ty   = %+.30e\n", ty);
  p = hx*hy;                       printf("p    = %+.30e\n", p);
  q = hx*ty + tx*hy;               printf("q    = %+.30e\n", q);
  c = p + q;                   //  printf("c    = %+.30e\n", c);
  cc = p - c;                      printf("cc   = %+.30e\n", cc);
  cc += q + tx*ty;                 printf("cc   = %+.30e\n", cc);
  cc = x[0]*yy + x[1]*y + cc;      printf("cc   = %+.30e\n", cc);
  x[0] = c + cc;               //  printf("x[0] = %+.30e\n", x[0]);
  x[1] = c - x[0] + cc;            printf("x[1] = %+.30e\n", x[1]);
int main(void){
  double r[2];
  r[0] = 6.4029304042108156e+38;
  r[1] = 0.0;
  dekkerMul2(r, 1.0e-10, -3.6432197315497741579e-27);
  printf("r[0] = %+.30e\n", r[0]);
  printf("r[1] = %+.30e <-- suspect result\n", r[1]);
  printf("       %+.30e <-- correct answer\n",

(2) By Richard Hipp (drh) on 2023-07-05 19:02:22 in reply to 1.1 [link] [source]

Some dodgy #ifdef and #pragma logic at check-in 5d9e9364808793d6 seems to work around the problem. Please speak up if you see a better solution.

(3) By Richard Hipp (drh) on 2023-07-05 19:50:49 in reply to 1.1 [link] [source]

GCC devs report that this issue is a dup of a prior bug. They recommend using either -fexcess-precision=standard or -std=c99 to work around. Those both work for me when entered on the command-line. However, I could not get them to work with #pragma. And we do not want "sqlite3.c" to require special compiler options - it should "just work". Hence, I'll stay with the dodgy #ifdef and #pragma solution unless and until somebody can suggest a better solution.

(4) By Larry Brasfield (larrybr) on 2023-07-05 20:08:38 in reply to 3 [link] [source]

The history on that "bug", spanning over 23 years now, shows that it is not considered a bug. FWIW, I agree with the analysis and conclusion.

An interesting work-around appears in comment #87, where the use of the "volatile" storage class is used to achieve the same effect as that #pragma, except in a more targeted way.

As I understand the issue, there is little reason to expect that it will be unique to gcc. For that reason, I think judicious use of volatile would be a better fix because it would apply to present and future compilers other than gcc whose optimizers might induce the same behavior.

(6) By Richard Hipp (drh) on 2023-07-05 20:37:29 in reply to 4 [link] [source]

You try using "volatile". I couldn't get it to do anything. Let me know if you find a solution using volatile.

(7.1) By Larry Brasfield (larrybr) on 2023-07-05 21:41:27 edited from 7.0 in reply to 6 [link] [source]

Using your post #1 code as a baseline, I was able to get the "suspect" and "correct" results to match thusly (changes per diff):

< void dekkerMul2(double *x, double y, double yy){
<   double hx, tx, hy, ty, p, q, c, cc;
> void dekkerMul2(volatile double *x, double y, double yy){
>   volatile double hx, tx, hy, ty, p, q, c, cc;

I found that both the argument and local "volatile" qualifier were needed, and did not try anything more restricted.

This was using gcc 11.3 .

(8) By Richard Hipp (drh) on 2023-07-05 21:56:13 in reply to 7.1 [link] [source]

Very good. I didn't try adding volatile to the function parameters. I reproduce your results, except that I omitted volatile from hx and hy.

On the other hand, there is no performance difference. But the executable size is around a hundred bytes larger. So you can argue that "volatile" is "cleaner" and less likely to cause problems on other compilers. But I note:

  • The problem manifests as a very small error in binary-to-decimal conversion - literally an error in the 16-th significant digit.

  • The problem only comes up on i686. That's pretty much a legacy platform at this point, isn't it?

So I'm not sure yet that I want to make the change. Convince me.

(10) By Larry Brasfield (larrybr) on 2023-07-05 22:25:18 in reply to 8 [link] [source]

Convince me.

I thought the point regarding compiler independence, together with the (as-yet undemonstrated) notion that the same issue could arise with other compilers, was pretty convincing.

The fact is that the code is doing things where the extra few bits of precision kept in FP registers instead of double memory slots makes a difference. The "volatile" qualifier can be thought of as a way to tell any compiler "The significant part of this quantity is just what is stored in such-and-such memory slot; please use only that in further calculation."

It is much like "#pragma GCC optimize("float-store")", except written in C rather than gcc-speak.

Sorry if I'm just repeating myself.

(11) By ddevienne on 2023-07-06 09:43:53 in reply to 10 [link] [source]

The "volatile" qualifier can be thought of as a way to tell any compiler "The significant part of this quantity is just what is stored in such-and-such memory slot; please use only that in further calculation

I'm no expert, but that's NOT how I'd try to explain volatile...

The way I think about it instead, is to inform the compiler that a particular variable
(i.e. memory slot) may be modified in ways outside the program's control
(meaning by other means than the code the compiler see, and thus can reason about),
and thus the compiler should refrain from aggresive optimizations on that variable,
which typically means avoiding registers and prefer reloading it from memory instead.

But in practice, I know I don't know enough, and won't rely on volatile :)
I think it's useful/essential to multi-threading experts. Not regular devs. My $0.02.

(12) By Code Hz (codehz) on 2023-07-06 10:29:45 in reply to 11 [link] [source]

NEVER use volatile for multi-threading purpose, it does not provide any synchronisation, nor does it ensure the order of execution of operations. volatile was specifically intended to be used when interfacing with memory-mapped hardware, signal handlers, and the setjmp machine code instruction.

(13) By ddevienne on 2023-07-06 11:38:24 in reply to 12 [link] [source]

FWIW, I didn't mean volatile replaced atomics, but I've seen I believe usage of it in that MT context. In any case, OT, and since SQLite is C89-compatible and the memory model of C wasn't specific until C11, maybe not even portable.

(14.1) By Larry Brasfield (larrybr) on 2023-07-06 15:36:55 edited from 14.0 in reply to 11 [link] [source]

[Grammo fixed via edit.]

I was not attempting an explanation of volatile generally. It might have been clearer for me to write, "The use of the 'volatile' qualifier here can be thought of as ...", or maybe not. The context makes this evident, and needless words can detract from the point being made.

The effect that I claim is one which necessarily flows from the semantics of volatile. After the value is written to the (volatile) memory slot, the code generator is obliged to read it from that slot when the value is later reused via the slot's name. I saw no need to tutor Richard upon this fact. So my sub-point was just a reminder.

(15) By anonymous on 2023-07-10 22:18:48 in reply to 10 [link] [source]

Not every architecture/compiler/compile options need volatile here to give correct results, it would be nice to be able to disable use of volatile.

/* Use of voltile enabled by default,
user can define SQLITE_FP_USE_VOLATILE as 0 to disable it. */

#define FP_VOLATILE volatile

Index: src/func.c
--- src/func.c
+++ src/func.c
@@ -1685,15 +1685,15 @@
 ** Variables are marked "volatile" to defeat c89 x86 floating point
 ** optimizations can mess up this algorithm.
 static void kahanBabuskaNeumaierStep(
-  volatile SumCtx *pSum,
-  volatile double r
+  FP_VOLATILE SumCtx *pSum,
+  FP_VOLATILE double r
-  volatile double s = pSum->rSum;
-  volatile double t = s + r;
+  FP_VOLATILE double s = pSum->rSum;
+  FP_VOLATILE double t = s + r;
   if( fabs(s) > fabs(r) ){
     pSum->rErr += (s - t) + r;
     pSum->rErr += (r - t) + s;
@@ -1701,11 +1701,11 @@
 ** Add a (possibly large) integer to the running sum.
-static void kahanBabuskaNeumaierStepInt64(volatile SumCtx *pSum, i64 iVal){
+static void kahanBabuskaNeumaierStepInt64(FP_VOLATILE SumCtx *pSum, i64 iVal){
   if( iVal<=-4503599627370496LL || iVal>=+4503599627370496LL ){
     i64 iBig, iSm;
     iSm = iVal % 16384;
     iBig = iVal - iSm;
     kahanBabuskaNeumaierStep(pSum, iBig);
@@ -1717,11 +1717,11 @@
 ** Initialize the Kahan-Babaska-Neumaier sum from a 64-bit integer
 static void kahanBabuskaNeumaierInit(
-  volatile SumCtx *p,
+  FP_VOLATILE SumCtx *p,
   i64 iVal
   if( iVal<=-4503599627370496LL || iVal>=+4503599627370496LL ){
     i64 iSm = iVal % 16384;
     p->rSum = (double)(iVal - iSm);

Index: src/util.c
--- src/util.c
+++ src/util.c
@@ -390,20 +390,20 @@
 ** Reference:
 **   T. J. Dekker, "A Floating-Point Technique for Extending the
 **   Available Precision".  1971-07-26.
-static void dekkerMul2(volatile double *x, double y, double yy){
+static void dekkerMul2(FP_VOLATILE double *x, double y, double yy){
   ** The "volatile" keywords on parameter x[] and on local variables
   ** below are needed force intermediate results to be truncated to
   ** binary64 rather than be carried around in an extended-precision
   ** format.  The truncation is necessary for the Dekker algorithm to
   ** work.  Intel x86 floating point might omit the truncation without
   ** the use of volatile. 
-  volatile double tx, ty, p, q, c, cc;
+  FP_VOLATILE double tx, ty, p, q, c, cc;
   double hx, hy;
   u64 m;
   memcpy(&m, (void*)&x[0], 8);
   m &= 0xfffffffffc000000LL;
   memcpy(&hx, &m, 8);

(16) By Richard Hipp (drh) on 2023-07-10 23:10:39 in reply to 15 [link] [source]

Why would that be "nice"? What does it accomplish? My gut is that it is better to avoid the added #ifdef complexity. But I'm open to opposing ideas, if you want to try to convince me.

(17) By anonymous on 2023-07-11 09:53:50 in reply to 16 [link] [source]

As far s I know, that workaround is needed only for x86 and only if x87 fpu is used. If SSE is used for floating point calculations or code is compiled for any other architecture than x86 then the results are correct with or without the volatile.

The volatile prevents not only the incorrect optimizations but also the correct ones, unnecesarily degrading performance.

When compiling for 64-bit x86, compilers default to emit SSE instructions for floating point operations, and most compilers I've tested on godbolt.org even do it when compiling for 32-bit x86.

(18) By David Jones (vman59) on 2024-06-21 22:54:46 in reply to 15 [link] [source]

I just encountered a compiler that mishandles function parameters declared as volatile double, which affects one of the support routines for implementing the SUM/AVG/TOTAL aggregate functions (making them fail). Since the compiler vendor may view that usage of volatile as very rare, I don't know what priority they'll give to fixing their bug.

From what I can tell, the address of the variable in question never has its address taken so declaring it volatile is semantically meaningless in this situation.

Is there a unit test for checking the validity of the sumStep function and correct operation the Kahan-Babuska-Neumaier algorithm?

(19) By Richard Hipp (drh) on 2024-06-24 16:38:33 in reply to 18 [link] [source]

Here is a script you can feed into the CLI that tests sumStep using the Kahan-Babuska-Neumaier algorithm:

.echo on
.mode qbox
.testctrl uselongdouble off
INSERT INTO t1(x) VALUES(1e50),(17.0),(-1e50);
SELECT sum(x) FROM t1;

(20) By David Jones (vman59) on 2024-06-26 11:19:33 in reply to 19 [link] [source]

That's useful. For the time being, I added an auto extension to override the built sum, total, and avg functions. It's a re-implementation with some re-factoring which avoid using volatile. Single letter variable names (e.g. r, s, t) may be the convention for academic work, but they are terrible for making understandable code.

(5) By Spindrift (spindrift) on 2023-07-05 20:13:55 in reply to 3 [link] [source]

That bug is 23 years old.... Impressive stuff.

(9) By anonymous on 2023-07-05 22:02:03 in reply to 1.1 [link] [source]

Apparently when compiling for 64-bit, GCC by default emits SSE instrucions but for 32-bit it by default emits 387 instructions. 387 uses extended precision by default (it can be changed by setting the Precision Control Field in the x86 FPU Control Word). If -ffloat-store is enabled then results are always stored into memory and load back for subsequent calculations instead of just using value in the register. SSE uses single or double precision depending on machine instruction opcode. You can choose whether you want GCC to emit SSE or 387 instructions, either by adding -mfpmath=sse or -mfpmath=387 respectively to the command line, or by using pragma:

#pragma GCC push_options
#pragma GCC target("fpmath=sse")

 /* your functions here */

#pragma GCC pop_options