Access Violation running FTS5 rank due to compiler over-optimization
(1) By Ralf on 2020-11-27 11:58:44 [link] [source]
Embarcadero's C++ Builder may optimize out rc==SQLITE_OK
in this for loop: https://www.sqlite.org/src/info?name=afe8c2394cf6de2a&ln=676
This leads to an AV in the same line if pData
is NULL. This test then crashes: https://www.sqlite.org/src/info?name=4a15fb03b6c7eac6&ln=90-92
The cause may be an over-optimization or compiler bug. It seems related to the fact that the rc variable is not used within the for loop. This similar loop a few lines above uses rc, and rc==SQLITE_OK
is compiled in: https://www.sqlite.org/src/info?name=afe8c2394cf6de2a&ln=659
It helps to move the for loop into the if( rc==SQLITE_OK )
block above. This also removes one rc==SQLITE_OK
test. Another such test may be avoided if the test below is also moved there.
The resulting code looks like this and tests fine:
/* Figure out the total size of the current row in tokens. */
if( rc==SQLITE_OK ){
int nTok;
rc = pApi->xColumnSize(pFts, -1, &nTok);
D = (double)nTok;
/* Determine the BM25 score for the current row. */
for(i=0; i<pData->nPhrase; i++){
score += pData->aIDF[i] * (
( aFreq[i] * (k1 + 1.0) ) /
( aFreq[i] + k1 * (1 - b + b * D / pData->avgdl) )
);
}
/* If no error has occurred, return the calculated score. Otherwise,
** throw an SQL exception. */
sqlite3_result_double(pCtx, -1.0 * score);
}else{
sqlite3_result_error_code(pCtx, rc);
}
(2) By Dan Kennedy (dan) on 2020-11-27 16:20:44 in reply to 1 [link] [source]
That's an interesting one. I made some changes to the way that loop is laid out here:
https://sqlite.org/src/info/d85f4f27f58adcc7 https://sqlite.org/src/info/d85f4f27f58adcc7
Then I also added a cast that might be missing:
https://sqlite.org/src/info/6ff9673847c0b417
Do these patches fix things?
And, if you're in a mood to experiment, does applying the last patch alone (the one with the cast) fix things?
Thanks,
Dan.
(3) By Ralf on 2020-11-27 18:34:24 in reply to 2 [link] [source]
No, the cast patch alone does not alter the way Embarcadero's C++ Builder compiles the for loop.
However, the other two patches do fix the problem. The cast is not needed, but it does not hurt, either. To C++ Builder it makes no difference.
There is just an (unrelated) "Warning W8004 fts5_aux.c 643: 'rc' is assigned a value that is never used in function fts5Bm25Function".
At last, I noticed your fix takes into account a test for a rc
return which I overlooked. Thanks for that, and for the fast answer!
(4) By Dan Kennedy (dan) on 2020-11-27 19:47:44 in reply to 3 [source]
Tough warning, really.
https://sqlite.org/src/info/8edb983bc87898ef
No, the cast patch alone does not alter the way Embarcadero's C++ Builder compiles the for loop.
Thanks for trying. I couldn't actually think of a reason it would help. Just that it seems a strange bug for a widely-distributed compiler to have, so I thought it might be some strange strict aliasing optimization.
Dan.