SQLite Forum

Segfault in memjrnlWrite()
I'm seeing a segfault in memjrnlWrite() that happens intermittently during a specific SQL sequence when running my application built with sqlite-amalgamation-3350500.  As far as I can tell from my testing, it does *not* happen with sqlite-amalgamation-3320300 (I'm trying to prove a negative here, hence some caution).

If I build & run with asan / ubsan, the segfault is eventually picked up as an attempted NULL pointer access, but the application is otherwise clean and stable.  I believe I've seen this on both a 32-bit ARMv7 build of my application and a 64-bit x86_64 build.


```
Program terminated with signal SIGSEGV, Segmentation fault.
#0  __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:342
342             movl    %ecx, -4(%rdi,%rdx)
(gdb) ba
#0  __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:342
#1  0x00007f84e6ec6e5d in memjrnlWrite (pJfd=0x14cb568, zBuf=0x7ffcc7f221bc, iAmt=4, iOfst=6252500) at sqlite3.c:98184
#2  0x00007f84e6e7fd25 in sqlite3OsWrite (id=0x14cb568, pBuf=0x7ffcc7f221bc, amt=4, offset=6252500) at sqlite3.c:23339
#3  0x00007f84e6e904fa in write32bits (fd=0x14cb568, offset=6252500, val=73533) at sqlite3.c:53268
#4  0x00007f84e6e93e9a in subjournalPage (pPg=0x17bb690) at sqlite3.c:56634
#5  0x00007f84e6e93f28 in subjournalPageIfRequired (pPg=0x17bb690) at sqlite3.c:56649
#6  0x00007f84e6e95ca9 in sqlite3PagerWrite (pPg=0x17bb690) at sqlite3.c:58297
#7  0x00007f84e6ea3bfa in insertCell (pPage=0x17bb6d8, i=122, pCell=0x14fc40c "\t\003\003\003\003\242\023\020%3\001\001\001\001\b\t\b\b", sz=10, pTemp=0x0, iChild=0, pRC=0x7ffcc7f22394) at sqlite3.c:71755
#8  0x00007f84e6ea7adc in sqlite3BtreeInsert (pCur=0x1481ec0, pX=0x7ffcc7f223f0, flags=0, seekResult=0) at sqlite3.c:73862
#9  0x00007f84e6ebe993 in sqlite3VdbeExec (p=0x14e35b8) at sqlite3.c:91803
#10 0x00007f84e6eb5a68 in sqlite3Step (p=0x14e35b8) at sqlite3.c:84331
#11 0x00007f84e6eb5cbe in sqlite3_step (pStmt=0x14e35b8) at sqlite3.c:84388
#12 0x00007f84e6ef387a in sqlite3_exec (db=0x147a548, zSql=0x7ffcc7f22e60 "INSERT INTO envncellref(cellid, channelid, code) SELECT  58663, newchan.id, oldncr.code FROM  envncellref AS oldncr, envplmnchannel AS oldchan, envplmnchannel AS newchan, envplmn AS oldplmn, envplmn A"..., xCallback=0x0, pArg=0x0, pzErrMsg=0x0) at sqlite3.c:125293
...
#20 0x0000000000415c77 in main (argc=1, argv=0x7ffcc7f24618) at main.c:171
(gdb) frame 1
#1  0x00007f84e6ec6e5d in memjrnlWrite (pJfd=0x14cb568, zBuf=0x7ffcc7f221bc, iAmt=4, iOfst=6252500) at sqlite3.c:98184
98184           memcpy((u8*)p->endpoint.pChunk->zChunk + iChunkOffset, zWrite, iSpace);
(gdb) print p
$1 = (MemJournal *) 0x14cb568
(gdb) print *p
$2 = {pMethod = 0x7f84e6f58940 <MemJournalMethods>, nChunkSize = 1016, nSpill = -1, pFirst = 0x1d712f8, endpoint = {iOffset = 6252500, pChunk = 0x0}, readpoint = {iOffset = 0, pChunk = 0x0}, flags = 8222, pVfs = 0x7f84e6f5aea0 <aVfs.76>, zJournal = 0x0}
(gdb) print p->endpoint.pChunk 
$3 = (FileChunk *) 0x0
```
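
A quick arithmetic check on the numbers in the dump, as a sketch only. It assumes that `iChunkOffset` in frame 1 is `endpoint.iOffset` modulo `nChunkSize`; that is a guess from the variable names and has not been checked against the source.

```python
# Values copied from the gdb dump of *p above.
n_chunk_size = 1016
endpoint_offset = 6252500

# Assumption: the offset inside the current chunk is the journal
# offset modulo the chunk size.
chunk_index, chunk_offset = divmod(endpoint_offset, n_chunk_size)
print(chunk_index, chunk_offset)  # 6154 36
```

Under that assumption the 4-byte write lands 36 bytes into a chunk rather than at a chunk boundary, so it would be targeting a chunk that should already exist, yet `endpoint.pChunk` is NULL.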


The INSERT is actually nested inside a SELECT from a temporary table, and there are some other INSERTs also happening as rows are copied and modified according to business logic.  All the access is happening on the same connection and within a savepoint.  There are no threads or other database connections within the segfaulting process, though there may be other _processes_ attempting to concurrently access the same database.

(Edit: Further testing shows the crash also happens when only one process is accessing the database and the others are stopped.)
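
The access pattern described above can be sketched roughly as follows. This is an illustration using Python's built-in `sqlite3` module, not the application's code: the tables `pending` and `ncellref` and the savepoint name `copy_rows` are hypothetical, and the sketch is not known to reproduce the crash.

```python
import sqlite3

# isolation_level=None puts the module in autocommit mode, so the
# SAVEPOINT below is the only transaction control in play.
conn = sqlite3.connect(":memory:", isolation_level=None)

# Hypothetical schema: a temporary source table and an ordinary target table.
conn.execute("CREATE TEMP TABLE pending(cellid INTEGER)")
conn.execute("CREATE TABLE ncellref(cellid INTEGER, code INTEGER)")
conn.executemany("INSERT INTO pending VALUES (?)", [(1,), (2,), (3,)])

conn.execute("SAVEPOINT copy_rows")
outer = conn.execute("SELECT cellid FROM pending ORDER BY cellid")
for (cellid,) in outer:
    # Nested INSERT on the same connection while the outer SELECT is still open.
    conn.execute(
        "INSERT INTO ncellref(cellid, code) VALUES (?, ?)",
        (cellid + 100, cellid),
    )
conn.execute("RELEASE copy_rows")

inserted = conn.execute("SELECT count(*) FROM ncellref").fetchone()[0]
print(inserted)
```

A reproduction attempt would presumably need much larger row counts and the real schema.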

My config is like this:

```
3.35.5 2021-04-19 18:32:05 1b256d97b553a9611efca188a3d995a2fff712759044ba480f9a0c9e98fae886
COMPILER=gcc-10.3.1 20210422 (Red Hat 10.3.1-1)
DEFAULT_FOREIGN_KEYS
DEFAULT_WAL_SYNCHRONOUS=1
ENABLE_API_ARMOR
HAVE_ISNAN
LIKE_DOESNT_MATCH_BLOBS
MAX_EXPR_DEPTH=0
OMIT_AUTHORIZATION
OMIT_DECLTYPE
OMIT_DEPRECATED
OMIT_LOAD_EXTENSION
OMIT_PROGRESS_CALLBACK
OMIT_SHARED_CACHE
OMIT_UTF16
REVERSE_UNORDERED_SELECTS
THREADSAFE=1
```
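
The list above has the same shape as the output of `sqlite_version()`, `sqlite_source_id()` and `PRAGMA compile_options`; how it was actually produced is not stated. For anyone comparing builds, a minimal sketch that prints the same information, assuming Python's `sqlite3` module (which reports the library Python is linked against, not the amalgamation built into the application):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Version string and source id of the SQLite library in use.
version = conn.execute("SELECT sqlite_version()").fetchone()[0]
source_id = conn.execute("SELECT sqlite_source_id()").fetchone()[0]

# One row per compile-time option, without the SQLITE_ prefix.
options = [row[0] for row in conn.execute("PRAGMA compile_options")]

print(version, source_id)
print("\n".join(options))
```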

I'm very willing (hopeful even) to consider that this is a bug in my application or API usage, but I'm struggling to see what could have gone wrong to cause this.

Any suggestions for things to try would be gratefully received, though I've not been able to make a simple reproduction of this case yet.