DATA RACE: Found in sqlite3.c
(1) By Zu-Ming Jiang (jiang446079653) on 2020-04-29 02:14:31
Dear SQLite developers:

I used my fuzz-testing tool, connzer, to detect data races in SQLite. Here is a data race found by connzer. I would appreciate your help checking whether it is a real race. Thanks! The following is the race report.

## Race report ##

Race object: `pInfo->nBackfill`

**Thread 1:**

**Access:** `pInfo->nBackfill==pWal->hdr.mxFrame`

**Line number:** `sqlite3.c, 61048`

**Call stack:**

1. `walTryBeginRead()`
2. `sqlite3WalBeginReadTransaction()`
3. `pagerBeginReadTransaction()`
4. `sqlite3PagerSharedLock()`
5. `lockBtree()`
6. `sqlite3BtreeBeginTrans()`
7. `sqlite3VdbeExec()`
8. `sqlite3Step()`
9. `sqlite3_step()`
10. `execsql_i64_x()`
11. `walthread3_thread()`
12. `launch_thread_main()`

**Lock:**

* None

**Thread 2:**

**Access:** `pInfo->nBackfill = 0;`

**Line number:** `sqlite3.c, 60269`

**Call stack:**

1. `walRestartHdr()`
2. `walRestartLog()`
3. `sqlite3WalFrames()`
4. `pagerWalFrames()`
5. `sqlite3PagerCommitPhaseOne()`
6. `sqlite3BtreeCommitPhaseOne()`
7. `vdbeCommit()`
8. `sqlite3VdbeHalt()`
9. `sqlite3VdbeExec()`
10. `sqlite3_step->sqlite3Step()`
11. `execsql_i64_x()`
12. `walthread3_thread()`
13. `launch_thread_main()`

**Lock:**

* None

My fuzzer finds that these two accesses can be executed concurrently and that they are not protected by any lock, so it reports this race.
(2) By Zu-Ming Jiang (jiang446079653) on 2020-05-02 07:28:58 in reply to 1
What do you think about this data race?
(3) By Richard Hipp (drh) on 2020-05-02 11:33:14 in reply to 2
I believe your analysis is incorrect. `pInfo->nBackfill` is part of a shared-memory segment that is visible to multiple processes. Access to that shared-memory segment is serialized using file locks, not mutexes. Perhaps your tool only looks at mutexes and fails to properly account for the effect of file locks.
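To illustrate the general idea of serializing access with file locks rather than mutexes, here is a minimal sketch using POSIX advisory byte-range locks via `fcntl()`. It is not SQLite's actual WAL locking code: the helpers `lock_byte()` and `guarded_store()`, the lock byte offset, and the shared variable are all hypothetical names chosen for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not part of SQLite): apply an advisory POSIX
** byte-range lock of the given type (F_RDLCK, F_WRLCK or F_UNLCK) to a
** single byte of fd. Non-blocking: returns 0 on success, or -1 if a
** conflicting lock is held by another process or on any other error. */
static int lock_byte(int fd, off_t offset, short type){
  struct flock fl;
  assert( type==F_RDLCK || type==F_WRLCK || type==F_UNLCK );
  memset(&fl, 0, sizeof(fl));
  fl.l_type = type;
  fl.l_whence = SEEK_SET;
  fl.l_start = offset;
  fl.l_len = 1;
  return fcntl(fd, F_SETLK, &fl);
}

/* Hypothetical helper: store v into a shared variable only while the
** exclusive lock on byte lockByte is held. Returns 0 on success, or -1
** if the lock could not be taken (the store is then skipped). */
static int guarded_store(int fd, off_t lockByte,
                         volatile unsigned *pShared, unsigned v){
  if( lock_byte(fd, lockByte, F_WRLCK)!=0 ) return -1;
  *pShared = v;                      /* protected by the file lock */
  return lock_byte(fd, lockByte, F_UNLCK);
}
```

A tool that tracks only mutex acquire and release calls would see no lock around the store in `guarded_store()`, even though the `fcntl()` lock excludes other processes. Note that POSIX record locks are owned by the process, so this sketch only illustrates exclusion between processes.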
(4) By Zu-Ming Jiang (jiang446079653) on 2020-05-03 01:59:01 in reply to 3
Yes, you are right. File locks are not considered. Thanks for your response.
(5) By Zu-Ming Jiang (jiang446079653) on 2020-05-13 10:42:45 in reply to 3
**I think there may be some problem in your file locks.** To check whether the race is real, I set breakpoints before these two accesses when they are running in the call stacks described above. I find that the breakpoints can be hit simultaneously, and the addresses of the raced variables are the same. I think these results prove that **the race actually happens**, so these accesses are not serialized successfully.
(8) By Wout Mertens (wmertens) on 2020-05-20 08:52:44 in reply to 5
Do you mean that you tested this with working file locks, and it triggers the breakpoints simultaneously? I think the reason you're not getting replies is that it requires very specific knowledge to verify your findings, and most of us don't have it. If you could post a step-by-step of how to reproduce, that would really help!
(6) By Zu-Ming Jiang (jiang446079653) on 2020-05-14 12:25:30 in reply to 3
Could you confirm this race? Because I have successfully reproduced this race by using breakpoints, I think there may be some problems in the file locks.
(7) By Zu-Ming Jiang (jiang446079653) on 2020-05-19 10:29:00 in reply to 3
I think it is a real race because I have reproduced it. Please help me confirm it.