Odd buffer overflow
I'm trying to develop a new application using SQLite 3.32.2 on Windows. During startup, the code is creating a database with 7 tables and indexes. When one of those indexes is being created, I'm seeing a buffer overrun on some memory allocated by SQLite.
It's happening in a buffer allocated to build the "INSERT INTO 'main'.sqlite_master..." statement. Looking at the call stack, sqlite3Malloc() is being called to allocate 0x91 bytes, that gets rounded up to 0x98 bytes before malloc() is called, but then 0x99 bytes are written to that buffer (the 0x99'th byte is the NULL terminator). When the buffer is eventually freed, the memory manager is detecting the overrun.
I'm a bit stumped on how to correct this. I've probably done something wrong somewhere but I'm only creating tables and indexes so far. I have found that if I just shorten or lengthen the name of the index, the overrun doesn't occur. And it's only happening on one of the indexes.
(2) By doug (doug9forester) on 2020-06-09 21:31:46 in reply to 1 [link] [source]
Please post all of your SQL so we can try it.
(3) By Larry Brasfield (LarryBrasfield) on 2020-06-09 21:35:01 in reply to 1 [link] [source]
If the issue is as simple as you have made it out to be, you have uncovered a bug in SQLite. It would be a great help if you can show the SQL that produces the overrun. A sequence of inputs to the sqlite3.exe shell that reproduces the problem would be best. If it does not reproduce that way, the effort may lead to your own discovery of what you might be giving to sqlite3_prepare...() that is unexpected and mishandled by the parser/analyzer process. It still represents a bug (most likely), but may narrow the repro quite a bit.
I'm not using sqlite.exe but compiling the amalgamation version of sqlite3.c into my code. The overrun is being detected by a checked heap.
I've reduced the number of SQL statements to just two:
CREATE TABLE aaaaaa (bbb_id TEXT,ccccccc_id TEXT,ddd TEXT); CREATE INDEX aaaaaa_ccccccc_id ON aaaaaa (bbb_id ASC,ccccccc_id ASC);
If it helps, at some point sqlite3StrAccumEnlarge() gets called with N=8. The string at this point is 0x44 bytes long
"INSERT INTO 'main'.sqlite_master VALUES('index','aaaaaa_ccccccc_id',". sqlite3DbRealloc() then gets called with p->nAlloc = 0x91.
It's that allocation that gets rounded up to 0x98 and then overrun by 1 byte. I haven't been able to follow what's happening to see why yet.
OK I think I might have found the reason. After allocating memory, SQLite is ultimately calling _msize() to get the actual amount of allocated memory, but the size returned includes the guards added by the checked heap, and it's those that later get overwritten.
Sorry for wasting your time.
(6.1) By Larry Brasfield (LarryBrasfield) on 2020-06-10 14:47:36 edited from 6.0 in reply to 5 [link] [source]
It is not clear to me that this has been a waste of time.
If _msize() is part of the regular allocation API, then either its contract is not being met or SQLite has no business writing outside the limit its return imposes. Yet a name like '_msize' with its leading underscore is unlikely to be part of the same published API as malloc(), realloc() and free(), and reliance upon it by SQLite is arguably incorrect.
I believe this issue deserves the attention of the SQLite developers, more so now that you have shown a simple repro scenario. Perhaps you could state what allocator debug tool/package you use to help induce this problem.
If you were to edit the subject, to contain the word 'BUG', it would be more likely that one of the SQLite developers will reexamine the use of _msize(). It may be a bug in the allocator debug aid, but I doubt it. More likely is that the return from _msize() should not be interpreted as it has been to give the go-ahead to overwriting that guard zone.
Should one of the developers look into this, this _msize_dbg() doc should be consulted. It specifically states the very behavior which produced the OP's issue, saying "but _msize_dbg adds two debugging features: It includes the buffers on either side of the user portion of the memory block in the returned size". To me, this clearly indicates that the "user portion" bounds cannot be inferred from the _msize() return, at least not without some extra work to account for the guard bytes.
_msize() interface is in the standard library (on some platforms).
It is suppose to work like
malloc_usable_size(). If X is a pointer
obtained from malloc(), then
_msize(X) is suppose to return the number
of usable bytes in the memory allocation.
It appears that the
_msize() implementation on the OP's system is
_msize() is not under the control of SQLite, so this
sounds like a system issue to me.
You can work-around the problem by compiling with
In that case the
_msize() routine will not be used. (SQLite will
substitute its own implementation, which is less efficient, but it guaranteed
(8) By Larry Brasfield (LarryBrasfield) on 2020-06-10 14:49:31 in reply to 7 [link] [source]
Please see my edit to what was post #6 replying to #5. I believe your supposition is incorrect, as documented in that edit.
But SQLite calls
Except the documentation you pointed to says...
When the application is linked with a debug version of the C run-time libraries, _msize resolves to _msize__dbg
So if the OP was linking to a debug version they would have got _msize_dbg
(11) By Larry Brasfield (LarryBrasfield) on 2020-06-10 15:13:02 in reply to 9 [link] [source]
For the MSCRT, _msize() is documented to act as the OP has reported. See the 2nd paragraph under "Remarks" where how it resolves to _msize_dbg() is stated. (Sorry for not detailing this earlier.)
(12) By Richard Damon (RichardDamon) on 2020-06-11 00:33:05 in reply to 11 [source]
This basically means that it doesn't conform the the 'standard' that SQLite is assuming here (or the standard that everyone else uses for this extension).
(13.1) By Larry Brasfield (LarryBrasfield) on 2020-06-11 02:22:26 edited from 13.0 in reply to 9 [link] [source]
Expanding on what _msize_dbg() is documented or supposed to do:
The difference between what _msize_dbg(...) returns (as documented) and what _msize(...) returns is (sizeof(_CrtMemBlockHeader) + nNoMansLandSize). This can be seen at CRT Debug Heap Details, remarks under "Find buffer overruns with debug heap" and in the next code block.
With that adjustment for debug builds (using the MSCRT debug heap), _msize() could be made to return what one might suppose it should under the assumption that debug instrumentation should not change the API function. A suitably placed re-#define would do it:
#undef _msize #define _msize(p) (_msize_dbg(p) - sizeof(_CrtMemBlockHeader) - nNoMansLandSize)
Ugly, yes. Worse than avoiding code that uses _msize(), arguably not.
Once the contract putatively enforced by those guard bytes is considered, it as arguably required to use realloc() to do what is done with the combination of an _msize() call and then writing beyond the earlier requested allocation. After all, realloc is supposed to be smart enough to leave the block in place if there is room at its end.
For my test I was building a unit test around some of my code that is using SQLite using CppUTest, so it's a DEBUG WIN32 console application compiled using Visual Studio.
The checked heap is CppUTest's one. I believe it replaces malloc() and free() with its own wrappers to add instrumentation and guards, but it isn't replacing msize() as well, so I assume SQLite is seeing the size of the underlying allocation including the CppUTest guards.
I'm now using
-DDSQLITE_WITHOUT_MSIZE. This also makes things closer to how it'll behave on the embedded platform I'll eventually be using.