SQLite Forum

Access Violation running FTS5 rank due to compiler over-optimization

(1) By Ralf on 2020-11-27 11:58:44

Embarcadero's C++ Builder may optimize out `rc==SQLITE_OK` in this for loop: <https://www.sqlite.org/src/info?name=afe8c2394cf6de2a&ln=676>
  
This leads to an AV in the same line if `pData` is NULL. This test then crashes: <https://www.sqlite.org/src/info?name=4a15fb03b6c7eac6&ln=90-92>
  
The cause may be an over-optimization or compiler bug. It seems related to the fact that the rc variable is not used within the for loop. This similar loop a few lines above uses rc, and `rc==SQLITE_OK` is compiled in: <https://www.sqlite.org/src/info?name=afe8c2394cf6de2a&ln=659>
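To make the failure mode concrete, here is a minimal sketch of the loop shape involved. It is not the code from `fts5_aux.c`: `DemoData`, `demoScore`, `DEMO_OK` and `DEMO_ERROR` are hypothetical stand-ins. In standard C, `&&` evaluates its left operand first and skips the right one when the left is false, so `pData->nPhrase` must not be read when `rc` holds an error. If a compiler drops the `rc` test, the same call dereferences a NULL `pData`.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_OK    0   /* hypothetical stand-in for SQLITE_OK */
#define DEMO_ERROR 1   /* hypothetical stand-in for any error code */

/* Hypothetical, simplified stand-in for the per-query BM25 data. */
typedef struct DemoData {
  int nPhrase;         /* number of phrases in the query */
  const double *aIDF;  /* one IDF value per phrase */
} DemoData;

/* Sum the IDF values, but only if rc reports success.
** pData may be NULL when rc is an error code. */
double demoScore(int rc, const DemoData *pData){
  double score = 0.0;
  int i;
  /* Invariant assumed by the loop: success implies valid data. */
  assert( rc!=DEMO_OK || pData!=NULL );
  /* The left operand of && is evaluated first. When it is false, the
  ** right operand (and its pData dereference) must not be evaluated. */
  for(i=0; rc==DEMO_OK && i<pData->nPhrase; i++){
    score += pData->aIDF[i];
  }
  return score;
}
```

With a conforming compiler, `demoScore(DEMO_ERROR, NULL)` returns 0.0 without touching `pData`. The crash described above corresponds to the loop being compiled as if the `rc==DEMO_OK` operand were absent.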

It helps to move the [for loop](https://www.sqlite.org/src/info?name=afe8c2394cf6de2a&ln=675-681) into the [`if( rc==SQLITE_OK )`](https://www.sqlite.org/src/info?name=afe8c2394cf6de2a&ln=668-673) block above. This also removes one `rc==SQLITE_OK` test. Another such test may be avoided if the [test below](https://www.sqlite.org/src/info?name=afe8c2394cf6de2a&ln=683-689) is also moved there.

The resulting code looks like this and tests fine:

```C
  /* Figure out the total size of the current row in tokens. */
  if( rc==SQLITE_OK ){
    int nTok;
    rc = pApi->xColumnSize(pFts, -1, &nTok);
    D = (double)nTok;

    /* Determine the BM25 score for the current row. */
    for(i=0; i<pData->nPhrase; i++){
      score += pData->aIDF[i] * (
        ( aFreq[i] * (k1 + 1.0) ) /
        ( aFreq[i] + k1 * (1 - b + b * D / pData->avgdl) )
      );
    }

    /* If no error has occurred, return the calculated score. Otherwise,
    ** throw an SQL exception.  */
    sqlite3_result_double(pCtx, -1.0 * score);
  }else{
    sqlite3_result_error_code(pCtx, rc);
  }
```

(2) By Dan Kennedy (dan) on 2020-11-27 16:20:44 in reply to 1

That's an interesting one. I made some changes to the way that loop is laid out here:

<https://sqlite.org/src/info/d85f4f27f58adcc7>
<https://sqlite.org/src/info/d85f4f27f58adcc7>

Then I also added a cast that might be missing:

<https://sqlite.org/src/info/6ff9673847c0b417>

Do these patches fix things?

And, if you're in a mood to experiment, does applying the last patch alone (the one with the cast) fix things?

Thanks,

Dan.

(3) By Ralf on 2020-11-27 18:34:24 in reply to 2

No, the [cast patch](https://sqlite.org/src/info/6ff9673847c0b417) alone does not alter the way Embarcadero's C++ Builder compiles the [for loop](https://www.sqlite.org/src/info?name=afe8c2394cf6de2a&ln=676).

However, the other two patches do fix the problem. The cast is not needed, but it does not hurt, either. To C++ Builder it makes no difference.

There is just an (unrelated) "Warning W8004 fts5_aux.c 643: 'rc' is assigned a value that is never used in function fts5Bm25Function".

Finally, I noticed that your fix takes into account a check of an `rc` return value that I had overlooked. Thanks for that, and for the fast answer!
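For readers following along, the overlooked check is presumably the return value of `xColumnSize`, which the code in post (1) assigns to `rc` but never tests before using `nTok`. Below is a minimal sketch of a layout that tests it. This is not the committed fix: `SketchData`, `SketchColumnSize`, `sketchBm25`, the two callbacks and the `SKETCH_*` codes are hypothetical stand-ins, and the values of `k1` and `b` are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_OK    0   /* hypothetical stand-in for SQLITE_OK */
#define SKETCH_ERROR 1   /* hypothetical stand-in for any error code */

/* Hypothetical, simplified stand-in for the per-query BM25 data. */
typedef struct SketchData {
  int nPhrase;          /* number of phrases in the query */
  double avgdl;         /* average row length in tokens */
  const double *aIDF;   /* one IDF value per phrase */
} SketchData;

/* Hypothetical callback type standing in for pApi->xColumnSize. */
typedef int (*SketchColumnSize)(void *pCtx, int *pnTok);

/* Compute the (negated) BM25 score into *pScore. Every step that can
** fail is followed by an explicit rc test before its output is used. */
int sketchBm25(int rc, const SketchData *pData, const double *aFreq,
               SketchColumnSize xSize, void *pCtx, double *pScore){
  const double k1 = 1.2;   /* assumed tuning constant */
  const double b = 0.75;   /* assumed tuning constant */
  assert( pScore!=NULL );
  if( rc==SKETCH_OK ){
    int nTok = 0;
    rc = xSize(pCtx, &nTok);
    if( rc==SKETCH_OK ){   /* the check missing from the first proposal */
      double D = (double)nTok;
      double score = 0.0;
      int i;
      for(i=0; i<pData->nPhrase; i++){
        score += pData->aIDF[i] * (
          ( aFreq[i] * (k1 + 1.0) ) /
          ( aFreq[i] + k1 * (1 - b + b * D / pData->avgdl) )
        );
      }
      *pScore = -1.0 * score;
    }
  }
  return rc;   /* the caller reports the error when rc!=SKETCH_OK */
}

/* Callback that reports the token count stored behind pCtx. */
int sketchSizeOk(void *pCtx, int *pnTok){
  *pnTok = *(int*)pCtx;
  return SKETCH_OK;
}

/* Callback that always fails and leaves *pnTok untouched. */
int sketchSizeFail(void *pCtx, int *pnTok){
  (void)pCtx;
  (void)pnTok;
  return SKETCH_ERROR;
}
```

Because the loop sits inside both `rc` checks, its condition no longer needs an `rc` operand, so there is no test in the loop header for the compiler to drop.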

(4) By Dan Kennedy (dan) on 2020-11-27 19:47:44 in reply to 3

Tough warning, really. 

<https://sqlite.org/src/info/8edb983bc87898ef>

> No, the cast patch alone does not alter the way Embarcadero's C++ Builder compiles the for loop.

Thanks for trying. I couldn't actually think of a reason it would help. Just that it seems a strange bug for a widely-distributed compiler to have, so I thought it might be some strange strict aliasing optimization.

Dan.