SQLite Forum

DATA RACE: Found in sqlite3.c

(1) By Zu-Ming Jiang (jiang446079653) on 2020-04-29 02:14:31

Dear SQLite developers:

I used my fuzz-testing tool, connzer, to detect data races in SQLite. Here is one data race that connzer reported. I hope you can help me check whether it is a real race. Thanks!

The following is the race report.

Race report

Race object: pInfo->nBackfill

Thread 1:

Access: pInfo->nBackfill==pWal->hdr.mxFrame

Line number: sqlite3.c, 61048

Call stack:

  1. walTryBeginRead()
  2. sqlite3WalBeginReadTransaction()
  3. pagerBeginReadTransaction()
  4. sqlite3PagerSharedLock()
  5. lockBtree()
  6. sqlite3BtreeBeginTrans()
  7. sqlite3VdbeExec()
  8. sqlite3Step()
  9. sqlite3_step()
  10. execsql_i64_x()
  11. walthread3_thread()
  12. launch_thread_main()


  • None

Thread 2:

Access: pInfo->nBackfill = 0;

Line number: sqlite3.c, 60269

Call stack:

  1. walRestartHdr()
  2. walRestartLog()
  3. sqlite3WalFrames()
  4. pagerWalFrames()
  5. sqlite3PagerCommitPhaseOne()
  6. sqlite3BtreeCommitPhaseOne()
  7. vdbeCommit()
  8. sqlite3VdbeHalt()
  9. sqlite3VdbeExec()
  10. sqlite3_step->sqlite3Step()
  11. execsql_i64_x()
  12. walthread3_thread()
  13. launch_thread_main()


  • None

My fuzzer finds that these two accesses can execute concurrently and are not protected by any lock, so it reports this race.

(2) By Zu-Ming Jiang (jiang446079653) on 2020-05-02 07:28:58 in reply to 1

What do you think about this data race?

(3) By Richard Hipp (drh) on 2020-05-02 11:33:14 in reply to 2

I believe your analysis is incorrect.

pInfo->nBackfill is part of a shared-memory segment that is visible to multiple processes. Access to that shared-memory segment is serialized using file locks, not mutexes. Perhaps your tool only looks at mutexes and fails to properly account for the effect of file locks.
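To illustrate the general idea of serializing access with file locks rather than mutexes, here is a minimal sketch using POSIX fcntl() byte-range locks. It is not SQLite's code: the helper names lock_byte() and second_process_is_blocked(), the temporary file path, and the choice of byte offset 0 are all assumptions made for the example. It takes a write lock on one byte of a file, then forks a child process and checks that the child is refused the same lock. POSIX fcntl() locks are owned by the process, so the sketch uses a second process rather than a second thread; how threads inside one process are coordinated is not shown here.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: take or release a non-blocking advisory lock on a
** single byte of fd. type is F_RDLCK, F_WRLCK or F_UNLCK.
** Returns 0 on success, -1 if the lock is held elsewhere or on error. */
static int lock_byte(int fd, short type, off_t offset){
  struct flock f;
  assert( type==F_RDLCK || type==F_WRLCK || type==F_UNLCK );
  memset(&f, 0, sizeof(f));
  f.l_type = type;
  f.l_whence = SEEK_SET;
  f.l_start = offset;
  f.l_len = 1;
  return fcntl(fd, F_SETLK, &f);
}

/* Hypothetical demo: returns 1 if a second process is refused the lock
** while this process holds it, 0 if it was granted, -1 on setup failure. */
static int second_process_is_blocked(void){
  char path[] = "/tmp/filelock-demo-XXXXXX";  /* assumed scratch location */
  int status = 0;
  pid_t pid;
  int fd = mkstemp(path);
  if( fd<0 ) return -1;

  /* The parent takes an exclusive lock on byte 0 of the file. */
  if( lock_byte(fd, F_WRLCK, 0)!=0 ){
    close(fd);
    unlink(path);
    return -1;
  }

  pid = fork();
  if( pid<0 ){
    close(fd);
    unlink(path);
    return -1;
  }
  if( pid==0 ){
    /* Child: fcntl() locks are not inherited across fork(), so this
    ** request conflicts with the lock held by the parent. */
    int rc = lock_byte(fd, F_WRLCK, 0);
    _exit( rc==0 ? 0 : 1 );
  }

  waitpid(pid, &status, 0);
  lock_byte(fd, F_UNLCK, 0);
  close(fd);
  unlink(path);
  if( !WIFEXITED(status) ) return -1;
  return WEXITSTATUS(status);
}
```

A tool that tracks only mutex acquisition would see no lock around code guarded this way, because the exclusion happens in the operating system's file-locking layer.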

(4) By Zu-Ming Jiang (jiang446079653) on 2020-05-03 01:59:01 in reply to 3

Yes, you are right. File locks are not considered.

Thanks for your response.

(5) By Zu-Ming Jiang (jiang446079653) on 2020-05-13 10:42:45 in reply to 3

I think there may be a problem with your file locks.

To check whether the race is real, I set breakpoints just before these two accesses, conditioned on the call stacks described above. I find that both breakpoints can be hit at the same time, and that the addresses of the accessed variable are the same. I think these results show that the race actually happens, so these accesses are not successfully serialized.

(6) By Zu-Ming Jiang (jiang446079653) on 2020-05-14 12:25:30 in reply to 3

Could you confirm this race? Because I have successfully reproduced it using breakpoints, I think there may be a problem with the file locks.

(7) By Zu-Ming Jiang (jiang446079653) on 2020-05-19 10:29:00 in reply to 3

I think it is a real race because I have reproduced it. Please help me confirm it.

(8) By Wout Mertens (wmertens) on 2020-05-20 08:52:44 in reply to 5

Do you mean that you tested this with working file locks, and it triggers the breakpoints simultaneously?

I think the reason you're not getting replies is that it requires very specific knowledge to verify your findings, and most of us don't have it.

If you could post step-by-step instructions for reproducing this, that would really help!