UBSAN error with function types
(1.2) By Nate A (nackerma) on 2023-11-30 15:34:21 edited from 1.1 [source]
LLVM 17 reports function pointer misuse within sqlite3:
../../../../distro/sqlite3.c:138858:5: runtime error: call to function agginfoFree through pointer to incorrect function type 'void (*)(struct sqlite3 *, void *)'
sqlite3ParserAddCleanup(pParse,
(void(*)(sqlite3*,void*))agginfoFree, pAggInfo);
The second argument of agginfoFree is cast as a void* to be able to register many functions of different signatures. This is technically undefined behavior. Bringing this to your attention in case you are interested in fixing these types of things.
https://reviews.llvm.org/D148827#4422631
(2.1) By Stephan Beal (stephan) on 2023-11-30 15:37:48 edited from 2.0 in reply to 1.2 [link] [source]
This is technically undefined behavior.
Technically it is, but we're unaware of any extant environments where it's not well-behaved.
Edit: if such a platform were to suddenly appear it would have issues many pieces of software which use dlsym() to fish out function pointers.
(3) By Donal Fellows (dkfellows) on 2023-12-01 16:39:28 in reply to 2.1 [link] [source]
Specifically, POSIX's definition of dlsym()
essentially requires ordinary function and data pointers to be the same size and come from the same address space. Windows does something similar (but with a different basic operation name). The combination covers the enormous majority of systems deployed out there, except for many embedded devices (though a lot of those are too small to run either a conventional OS or SQLite).
(4) By anonymous on 2023-12-06 16:22:43 in reply to 2.1 [link] [source]
It will become a problem if sqlite is built with CFI. See https://maskray.me/blog/2022-12-18-control-flow-integrity#fsanitizefunction.
(5) By Bo Lindbergh (_blgl_) on 2023-12-06 17:42:47 in reply to 2.1 [link] [source]
Mentioning dlsym
confuses things, because casting a function pointer to a data pointer is a different type of undefined behaviour. The error Nate A reports can be fixed like this:
Index: src/select.c
==================================================================
--- src/select.c
+++ src/select.c
@@ -7035,11 +7035,12 @@
}
/*
** Deallocate a single AggInfo object
*/
-static void agginfoFree(sqlite3 *db, AggInfo *p){
+static void agginfoFree(sqlite3 *db, void *vp){
+ AggInfo *p = vp;
sqlite3DbFree(db, p->aCol);
sqlite3DbFree(db, p->aFunc);
sqlite3DbFreeNN(db, p);
}
@@ -8015,11 +8016,11 @@
** SELECT statement.
*/
pAggInfo = sqlite3DbMallocZero(db, sizeof(*pAggInfo) );
if( pAggInfo ){
sqlite3ParserAddCleanup(pParse,
- (void(*)(sqlite3*,void*))agginfoFree, pAggInfo);
+ agginfoFree, pAggInfo);
testcase( pParse->earlyCleanup );
}
if( db->mallocFailed ){
goto select_end;
}
(Plus similar changes for every other function used with sqlite3ParserAddCleanup
.)
In short, a (well-defined) data pointer type conversion inside the function lets you remove the (ill-defined (and ugly!)) function pointer type cast when installing it as a callback.
(6) By Stephan Beal (stephan) on 2023-12-07 11:35:38 in reply to 5 [link] [source]
Mentioning dlsym confuses things, because casting a function pointer to a data pointer is a different type of undefined behaviour.
Indeed, i misunderstood it to be that other type of UB.
Richard has since patched these casts so that they're well-defined.