SQLite

View Ticket
2013-05-06
18:49 Fixed ticket [5eaa61ea]: sigbus on disk full in WAL mode plus 3 other changes (artifact: 77a9876b user: drh)
2012-12-15
14:33 Ticket [5eaa61ea]: 5 changes (artifact: e816a1c2 user: drh)
11:09 Open ticket [5eaa61ea]. (artifact: 48a7ec3e user: anonymous)
2012-11-13
12:14 Ticket [5eaa61ea]: 1 change (artifact: b463520a user: anonymous)
11:16 Fixed ticket [5eaa61ea]. (artifact: 9f45881f user: drh)
11:16
Strive to use posix_fallocate() rather than ftruncate() when posix_fallocate() is available. Ticket [5eaa61ea18]. (check-in: 29980b08 user: drh tags: trunk)
11:08
Use preprocessor macros to automatically detect whether or not posix_fallocate() is available. (It is generally available on Linux but not on Mac.) Ticket [5eaa61ea1881040b17449ca043b6f8fd9ca55dc3] (Closed-Leaf check-in: 597333f1 user: drh tags: tkt-5eaa61ea18)
10:54
When available, use posix_fallocate() rather than ftruncate() to allocate space for mmap()ed -shm files, since posix_fallocate() gives an error if no disk space is available whereas ftruncate() is silent and leaves the system vulnerable to a SIGBUS upon first write to the mmap()ed region. Ticket [5eaa61ea1881040b17449ca043b6f8fd9ca55dc3] (check-in: 35625961 user: drh tags: tkt-5eaa61ea18)
10:34 Ticket [5eaa61ea] sigbus on disk full in WAL mode status still Open with 1 other change (artifact: f3ef76a5 user: drh)
09:28 Ticket [5eaa61ea]: 1 change (artifact: 428b056f user: drh)
09:02 Ticket [5eaa61ea]: 1 change (artifact: ddb32911 user: anonymous)
01:28 Ticket [5eaa61ea]: 3 changes (artifact: d0be4458 user: drh)
01:24 Ticket [5eaa61ea]: 3 changes (artifact: ba077322 user: drh)
2012-11-12
14:45 New ticket [5eaa61ea]. (artifact: a1fe498c user: anonymous)

Ticket Hash: 5eaa61ea1881040b17449ca043b6f8fd9ca55dc3
Title: sigbus on disk full in WAL mode
Status: Fixed Type: Portability
Severity: Minor Priority: Low
Subsystem: Unknown Resolution: Fixed
Last Modified: 2013-05-06 18:49:42
Version Found In: 2993ca20207f8dac02f58d01e31d68c84328356a
Description:
Attempts to prepare a query on a WAL database when disk space is critically low cause the process to be killed with SIGBUS. The crash happens in walIndexWriteHdr, invoked from walIndexRecover:
 static void walIndexWriteHdr(Wal *pWal){
   [...]
   walChecksumBytes(1, (u8*)&pWal->hdr, nCksum, 0, pWal->hdr.aCksum);
   memcpy((void *)&aHdr[1], (void *)&pWal->hdr, sizeof(WalIndexHdr)); /* << sigbus here */

Since I don't know if I'll be able to attach a PoC after submitting this ticket, I'm adding it here in base64.

base64 -d > mountme.bz2 << EOF
QlpoOTFBWSZTWdl2FE4AAEd////////v7/5e/////L9v/+xv38ZSbAdOQV4FRCTR3/3s0ARNY3on
aVG5ydncNEgTQE09IyngieU8gh6TINNPUAGmjQ2o0HpDQNNHqHqAAADTQ9RoeU0GR5Q8kBoZNqeo
NEJpkRojCniamZNJo02oAzUNNGmyg0DIAAAAABoAAAAAANAaAAAxFE0Jij0J6hp5NR6Rk0ANAZBo
GgAAAADQAAAAAAAAABoDQAgNDINMmmgYTQGhpk0aYEZGTQZDJkGRpkaNMg0aAGCNNBiGgAZNDJgI
0GjQJFJRNNNDIGjTIbU0HqABoaaZDINAAAADQBiAGIGjQyNNGgAZMI0ZNDJoU3MeFxkkc3Z41hbW
8jDkSjtOhgCTZvZDyAuQe9MlnyUJRsnBSKCqYknJpalrzfoZNLCdV70TdiIXxMRuU2IRkQJqG8QZ
OQb+iYlGMFPlV0OmpxTALbroIwQMEAHxal5pbiPEEKHH0kN2QtWKoGkgJsW5O0ayrdAnCOCFYLWw
lLfOMYqAyp0kAifKYYjJCJmGkTTtwu6OsOY0RUoOiYDGtSQctLi3zirl93O1IcPEMJd2yfKOo5VC
tWwG8FR62hLdnPEziKCWNRAajF4aK+BaiAUlAZ2KQdmyNlQkNBaRNnI7pJPgEa24whyQSEgCKdFC
/aaQiZaWUYc9gZLodGIx76jQRmC3DBQYgKCBI5zBjgkBPQAMNSLg5jVDAwp0lCg6hCZ3cBnTAKCr
cB3YAkh1CQjQRAjhACk7EMgIYG/qJECI59jLeOZbEgNUsCyBI1VUxJGKqGBEkyVYgcIAm0kNpA2k
Ek0adiIMAxDQmwQlHkcPEajEqCMs0KPm/J9/zgpwGR8bJkqz2cyCKk0efK1+mDyI2XAXOKDBngnh
ufJHQ/UIOjK+VQ2Co9F59shqwue3JMARlIR+wwrBKYWf7aqVbNpnE2YHnSYYO78LCCQlXsSBlmIB
gq+Mhwc5o4DY3koqRgDooKXZs6xZl9Pu99c/JOtbNB8swjv0DLMzBv1CaRMihk7i65cnPsysYte/
g2qofXNK7kUHLlkQDswL2sD2Qxg18u1zVCzMcRKmYKwk+dyYVqIGgU9AWSAGFQncaikWEUsdh1cy
DMTB1MUJJmoPKklNQgJliXgmCc48b5DvnmlACYCBG40dGAieggokSkAKv+aEPAwFF0hGpNk0JRQM
hQpUWBdCAbFtYhjq4Qnk2IF16/kvnNfAx5nFlgNSp3At7mGEVQwmTWM2Pah6bczg7MK84eAyvC0S
qZiDAVigjMRQUGiL+XmKkF1Sm6e/kUVALAMBBOMBKbkBCQEbxxkumARAAyffwCvK4lyRZC1cImG/
O6CB4G0G2dmooW+OATX2TIF8lYiCl4hSdrvCgNAxQgtgsr8iCzICwdQCJPjc8hDYS67AulWQxxmw
1WMNwqrviGcCT9CYWMaONSepUu5bLC9KfhepOKSBVBmQ8AZBSyB1ntos5HzRZMwLjGu6tPVUQKOd
ozKDukqH+aYanaMEJAE+T6suVBNc/fNGBlP1+ZqyEaJCdpvPRVi/VJMHdoKy2QHNOxo3k8kUx7rg
w8E49w6q6osXLYPmkQ2hELLO5TWp7D8RhybGAo6WXzt66MIRIkZ1kgghKRKYMAgB7YcebZbV7ada
B/Asgf4nIUJ8iKZ1qbCYDBkg7H8sTf6oDqczF/X79LiIqxb1boNG662tp/Vz7JfwSCL6w4+2fqk3
ym9/ynQJUzxwAZf7EwajzMXWBxQgiIjYJJ0YJATYEIr+KoYlEwFWSnnjkI3piIP1ycNM6496w1SD
V/Xf8tkMAB4Ffbf/i7kinChIbLsKJwA=
EOF

Then:
bunzip2 mountme.bz2
mount -o loop mountme /mnt

Finally:
# sqlite3 /mnt/db.sqlite3 
SQLite version 3.7.14.1 2012-10-04 19:37:12
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .tables
Bus error

drh added on 2012-11-13 01:28:57:
Unable to recreate the malfunction here. In fact, gdb says that the walIndexWriteHdr() function never gets called when following the steps outlined in the original bug report (which is exactly what you would expect if you are only doing a query.)


anonymous added on 2012-11-13 09:02:21:
Are you running sqlite3/gdb as root?


drh added on 2012-11-13 09:28:13:
Yes, the sqlite3 shell was run as root. I also repeated the experiment multiple times with valgrind, for what it is worth. No issues observed.

Furthermore, I tried creating write-ahead log files that needed to be recovered (since the original problem statement said that the segfault occurred during recovery) and recovering them, as root, with zero space left on the device. Still no problems.

The segfault occurs on a line that is attempting to write into a newly allocated mmap-ed file (the db.sqlite-shm file, specifically). It appears that Linux will allow disk space for mmap-ed files to be overcommitted. That is to say, based on my experiments, you can mmap more space than you have on disk and the mmap() call still works. Perhaps your system is configured in some way different from mine (perhaps it is also under memory pressure) so that the mmap-ed region is becoming unmapped somehow?
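
For illustration, the approach described in the check-in notes in the timeline (reserve the blocks with posix_fallocate() before storing through the mapping) might be sketched roughly as follows. This is not the SQLite code: reserve_file_space() and map_and_touch() are hypothetical names, and the /tmp scratch path and the retry on EINTR are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not SQLite code: reserve real disk blocks for the
** first `size` bytes of fd.  Returns 0 or an errno value such as ENOSPC.
** posix_fallocate() reports failure through its return value, not errno. */
static int reserve_file_space(int fd, off_t size){
  int rc;
  assert( size>0 );
  do{
    rc = posix_fallocate(fd, 0, size);
  }while( rc==EINTR );
  return rc;
}

/* Create a scratch file, reserve `size` bytes, map it, and store into the
** mapping.  Returns the resulting file size, or -1 on any failure.
** If the file had only been extended with ftruncate() on a full disk,
** the memset() below is the kind of store that could raise SIGBUS. */
static long map_and_touch(long size){
  char path[] = "/tmp/shm-sketch-XXXXXX";  /* assumed scratch location */
  struct stat st;
  void *p;
  long result = -1;
  int fd = mkstemp(path);
  if( fd<0 ) return -1;
  unlink(path);                       /* file goes away when fd is closed */
  if( reserve_file_space(fd, (off_t)size)==0 ){
    p = mmap(0, (size_t)size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
    if( p!=MAP_FAILED ){
      memset(p, 0, (size_t)size);     /* blocks are already allocated */
      munmap(p, (size_t)size);
      if( fstat(fd, &st)==0 ) result = (long)st.st_size;
    }
  }
  close(fd);
  return result;
}
```

The point of the sketch is that a full disk surfaces as an ordinary error code from reserve_file_space(), which the caller can handle, instead of as a signal on the first store into the mapped region.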


drh added on 2012-11-13 10:34:17:
Able to reproduce the problem now....


anonymous added on 2012-11-13 12:14:38:
Confirmed resolved. Thanks a lot for the very quick fix!

User Comments:
anonymous added on 2012-12-15 11:09:50:
I'm really sorry, but I have to reopen the ticket.

The following commit, 597333f1024092b94bcd8772541e19a0f707bd40 (http://www.sqlite.org/src/info/597333f102), breaks compilation on systems where posix_fallocate is not available (e.g. systems using uClibc as their libc implementation, which covers a lot of embedded systems).

Inverting the result of the configure test is incorrect, i.e. the value of HAVE_POSIX_FALLOCATE should be left as is.

I didn't test it myself, but from reading the code I would say the problem reported in this ticket is only solved for systems with posix_fallocate available. posix_fallocate is optional (see http://pubs.opengroup.org/onlinepubs/009695399/functions/posix_fallocate.html), therefore the ticket is only partially solved. Thus reopened.
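
To make the concern concrete, here is a rough sketch of one way to cover both cases: honour HAVE_POSIX_FALLOCATE as set by configure, and otherwise force allocation with ordinary writes, which fail with an error code rather than a signal. This is only an illustration, not the code in the SQLite tree. grow_file(), touch_byte(), grow_scratch_file() and SKETCH_BLOCK are hypothetical names, and the 4096-byte block size and /tmp scratch path are assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_BLOCK 4096  /* assumed allocation unit; hypothetical name */

/* Write one zero byte at `off`; return 0 or an errno value. */
static int touch_byte(int fd, off_t off){
  ssize_t n = pwrite(fd, "", 1, off);
  if( n==1 ) return 0;
  return n<0 ? errno : EIO;
}

/* Hypothetical helper, not SQLite code: extend fd to `size` bytes with
** every new block backed by real storage.  Returns 0 or an errno value.
** HAVE_POSIX_FALLOCATE is assumed to be defined to 1 by configure when
** the function exists, and left undefined (or 0) otherwise. */
static int grow_file(int fd, off_t size){
  assert( size>0 );
#if defined(HAVE_POSIX_FALLOCATE) && HAVE_POSIX_FALLOCATE
  return posix_fallocate(fd, 0, size);
#else
  struct stat st;
  off_t off;
  int rc;
  if( fstat(fd, &st)!=0 ) return errno;
  if( st.st_size>=size ) return 0;
  /* Touch the last byte of each block at or past the old end of file.
  ** Every offset written is >= st.st_size, so existing data is intact. */
  off = (st.st_size/SKETCH_BLOCK)*SKETCH_BLOCK + SKETCH_BLOCK - 1;
  for(; off<size; off += SKETCH_BLOCK){
    if( (rc = touch_byte(fd, off))!=0 ) return rc;
  }
  return touch_byte(fd, size-1);      /* set the exact final length */
#endif
}

/* Grow a fresh scratch file to `size` bytes; return its size or -1. */
static long grow_scratch_file(long size){
  char path[] = "/tmp/grow-sketch-XXXXXX";  /* assumed scratch location */
  struct stat st;
  long result = -1;
  int fd = mkstemp(path);
  if( fd<0 ) return -1;
  unlink(path);
  if( grow_file(fd, (off_t)size)==0 && fstat(fd, &st)==0 ){
    result = (long)st.st_size;
  }
  close(fd);
  return result;
}
```

Built without HAVE_POSIX_FALLOCATE, the fallback branch is the one compiled, so the same source would still build on a libc that lacks the function. Whether a write into a hole reports ENOSPC promptly depends on the filesystem, so the fallback should be read as a best-effort assumption rather than a guarantee.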

Regards,
Gene