Access Violation running FTS5 rank due to compiler over-optimization
(1) By Ralf on 2020-11-27 11:58:44
Embarcadero's C++ Builder may optimize out `rc==SQLITE_OK` in this for loop:

<https://www.sqlite.org/src/info?name=afe8c2394cf6de2a&ln=676>

This leads to an AV in the same line if `pData` is NULL. This test then crashes:

<https://www.sqlite.org/src/info?name=4a15fb03b6c7eac6&ln=90-92>

The cause may be an over-optimization or compiler bug. It seems related to the fact that the `rc` variable is not used within the for loop. This similar loop a few lines above uses `rc`, and `rc==SQLITE_OK` is compiled in:

<https://www.sqlite.org/src/info?name=afe8c2394cf6de2a&ln=659>

It helps to move the [for loop](https://www.sqlite.org/src/info?name=afe8c2394cf6de2a&ln=675-681) into the [`if( rc==SQLITE_OK )`](https://www.sqlite.org/src/info?name=afe8c2394cf6de2a&ln=668-673) block above. This also removes one `rc==SQLITE_OK` test. Another such test may be avoided if the [test below](https://www.sqlite.org/src/info?name=afe8c2394cf6de2a&ln=683-689) is also moved there.

The resulting code looks like this and tests fine:

```C
  /* Figure out the total size of the current row in tokens. */
  if( rc==SQLITE_OK ){
    int nTok;
    rc = pApi->xColumnSize(pFts, -1, &nTok);
    D = (double)nTok;

    /* Determine the BM25 score for the current row. */
    for(i=0; i<pData->nPhrase; i++){
      score += pData->aIDF[i] * (
          ( aFreq[i] * (k1 + 1.0) ) /
          ( aFreq[i] + k1 * (1 - b + b * D / pData->avgdl) )
      );
    }

    /* If no error has occurred, return the calculated score. Otherwise,
    ** throw an SQL exception. */
    sqlite3_result_double(pCtx, -1.0 * score);
  }else{
    sqlite3_result_error_code(pCtx, rc);
  }
```
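For readers without the source at hand, the following is a minimal sketch of the pattern being described, not the actual `fts5_aux.c` code. It assumes the loop condition has the shape `rc==OK && i<pData->nPhrase`; `DemoData`, `demoGetData`, `demoScore` and the `DEMO_*` codes are hypothetical stand-ins invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_OK    0   /* hypothetical stand-in for SQLITE_OK */
#define DEMO_ERROR 1   /* hypothetical stand-in for any error code */

/* Hypothetical stand-in for the per-query data; not the real struct. */
typedef struct DemoData {
  int nPhrase;         /* number of phrases in the query */
  const double *aIDF;  /* one weight per phrase */
} DemoData;

/* Hypothetical lookup step. On failure it returns an error code and
** leaves *ppData set to NULL, the situation described in the report. */
static int demoGetData(int fail, DemoData *pStore, DemoData **ppData){
  *ppData = fail ? NULL : pStore;
  return fail ? DEMO_ERROR : DEMO_OK;
}

/* Sum aIDF[i]*aFreq[i] over all phrases, or return an error code. */
static int demoScore(
  int fail, DemoData *pStore, const double *aFreq, double *pScore
){
  DemoData *pData = NULL;
  double score = 0.0;
  int i;
  int rc = demoGetData(fail, pStore, &pData);

  assert( aFreq!=NULL && pScore!=NULL );

  /* Because && short-circuits, pData->nPhrase is only evaluated while
  ** rc==DEMO_OK. If a compiler dropped the rc test, the error path
  ** would dereference the NULL pData right here. */
  for(i=0; rc==DEMO_OK && i<pData->nPhrase; i++){
    score += pData->aIDF[i] * aFreq[i];
  }

  *pScore = score;
  return rc;
}
```

With a conforming compiler, calling `demoScore()` with `fail` set to 1 returns `DEMO_ERROR` without ever touching `pData`; the crash reported above corresponds to the `rc` test being missing from the compiled loop condition.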
(2) By Dan Kennedy (dan) on 2020-11-27 16:20:44 in reply to 1
That's an interesting one. I made some changes to the way that loop is laid out here:

<https://sqlite.org/src/info/d85f4f27f58adcc7>

<https://sqlite.org/src/info/d85f4f27f58adcc7>

Then I also added a cast that might be missing:

<https://sqlite.org/src/info/6ff9673847c0b417>

Do these patches fix things? And, if you're in a mood to experiment, does applying the last patch alone (the one with the cast) fix things?

Thanks,

Dan.
(3) By Ralf on 2020-11-27 18:34:24 in reply to 2
No, the [cast patch](https://sqlite.org/src/info/6ff9673847c0b417) alone does not alter the way Embarcadero's C++ Builder compiles the [for loop](https://www.sqlite.org/src/info?name=afe8c2394cf6de2a&ln=676).

However, the other two patches do fix the problem. The cast is not needed, but it does not hurt, either. To C++ Builder it makes no difference.

There is just an (unrelated) "Warning W8004 fts5_aux.c 643: 'rc' is assigned a value that is never used in function fts5Bm25Function".

Finally, I noticed that your fix takes into account a test of an `rc` return value which I overlooked. Thanks for that, and for the fast answer!
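The sketch below illustrates the kind of layout this exchange converges on. It is not the code from the check-ins linked above. It assumes that the overlooked `rc` test is the one on the return value of the row-size lookup; `SketchData`, `sketchRowSize`, `sketchScore` and the `SKETCH_*` codes are hypothetical names, and the scoring formula is simplified.

```c
#include <stddef.h>

#define SKETCH_OK    0   /* hypothetical stand-in for SQLITE_OK */
#define SKETCH_ERROR 1   /* hypothetical stand-in for any error code */

/* Hypothetical stand-in for the per-query data; not the real struct. */
typedef struct SketchData {
  int nPhrase;         /* number of phrases in the query */
  const double *aIDF;  /* one weight per phrase */
} SketchData;

/* Hypothetical stand-in for a row-size lookup that can fail. */
static int sketchRowSize(int fail, int *pnTok){
  if( fail ) return SKETCH_ERROR;
  *pnTok = 10;
  return SKETCH_OK;
}

/* rc is the status of earlier steps; pData may be NULL if rc is an error. */
static int sketchScore(
  const SketchData *pData, int rc, int failSize,
  const double *aFreq, double *pScore
){
  double score = 0.0;
  double D = 0.0;
  int i;

  /* Step 1: look up the row size, but only if nothing failed so far. */
  if( rc==SKETCH_OK ){
    int nTok = 0;
    rc = sketchRowSize(failSize, &nTok);
    D = (double)nTok;
  }

  /* Step 2: test rc again, so a failed size lookup is also caught.
  ** The loop sits inside the guarded block, so the rc test is no
  ** longer part of the loop condition. */
  if( rc==SKETCH_OK ){
    for(i=0; i<pData->nPhrase; i++){
      score += pData->aIDF[i] * aFreq[i] / D;
    }
    *pScore = score;
  }
  return rc;
}
```

Compared with folding everything into one `if` block, the second test costs one extra comparison but keeps a failed size lookup from producing a score.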
(4) By Dan Kennedy (dan) on 2020-11-27 19:47:44 in reply to 3
Tough warning, really.

<https://sqlite.org/src/info/8edb983bc87898ef>

> No, the cast patch alone does not alter the way Embarcadero's C++ Builder compiles the for loop.

Thanks for trying. I couldn't actually think of a reason it would help. Just that it seems a strange bug for a widely-distributed compiler to have, so I thought it might be some strange strict aliasing optimization.

Dan.