SQLite Forum

DATA RACE: Found in sqlite3.c

(1) By Zu-Ming Jiang (jiang446079653) on 2020-04-29 02:14:31 [link]

Dear SQLite developers:

I used my fuzz-testing tool, connzer, to detect data races in SQLite. Here is a data race that connzer found. I hope you can help me check whether it is a real race. Thanks!

The following is the race report.

## Race report ##

Race object: `pInfo->nBackfill`

**Thread 1:**

**Access:** `pInfo->nBackfill==pWal->hdr.mxFrame`

**Line number:** `sqlite3.c, 61048`

**Call stack:**

1. `walTryBeginRead()`
2. `sqlite3WalBeginReadTransaction()`
3. `pagerBeginReadTransaction()`
4. `sqlite3PagerSharedLock()`
5. `lockBtree()`
6. `sqlite3BtreeBeginTrans()`
7. `sqlite3VdbeExec()`
8. `sqlite3Step()`
9. `sqlite3_step()`
10. `execsql_i64_x()`
11. `walthread3_thread()`
12. `launch_thread_main()`


* None

**Thread 2:**

**Access:** `pInfo->nBackfill = 0;`

**Line number:** `sqlite3.c, 60269`

**Call stack:**

1. `walRestartHdr()`
2. `walRestartLog()`
3. `sqlite3WalFrames()`
4. `pagerWalFrames()`
5. `sqlite3PagerCommitPhaseOne()`
6. `sqlite3BtreeCommitPhaseOne()`
7. `vdbeCommit()`
8. `sqlite3VdbeHalt()`
9. `sqlite3VdbeExec()`
10. `sqlite3_step->sqlite3Step()`
11. `execsql_i64_x()`
12. `walthread3_thread()`
13. `launch_thread_main()`

* None

My fuzzer finds that these two accesses can execute concurrently and are not protected by any lock, so it reports this race.

(2) By Zu-Ming Jiang (jiang446079653) on 2020-05-02 07:28:58 in reply to 1 [link]

What do you think about this data race?

(3) By Richard Hipp (drh) on 2020-05-02 11:33:14 in reply to 2 [link]

I believe your analysis is incorrect.

`pInfo->nBackfill` is part of a shared-memory segment that is visible
to multiple processes.  Access to that shared-memory segment is serialized
using file locks, not mutexes.  Perhaps your tool only looks at mutexes
and fails to properly account for the effect of file locks.
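
As a rough illustration of the idea above (not SQLite's actual code), the following sketch serializes updates to a counter in a memory-mapped file using a POSIX advisory file lock instead of a mutex. It is written in Python for brevity, and the helper names `locked_increment` and `run_workers` are hypothetical. POSIX advisory locks are owned by a process, so the sketch uses separate processes rather than threads.

```python
import fcntl
import mmap
import os
import struct
import tempfile


def locked_increment(path, times):
    """Increment a 4-byte counter in a shared mapping, one file lock per update."""
    fd = os.open(path, os.O_RDWR)
    try:
        shm = mmap.mmap(fd, 4)  # shared, read/write mapping by default on Unix
        try:
            for _ in range(times):
                # Exclusive advisory lock on byte 0; blocks until it is granted.
                fcntl.lockf(fd, fcntl.LOCK_EX, 1, 0)
                try:
                    (value,) = struct.unpack_from("<I", shm, 0)
                    struct.pack_into("<I", shm, 0, value + 1)
                finally:
                    fcntl.lockf(fd, fcntl.LOCK_UN, 1, 0)
        finally:
            shm.close()
    finally:
        os.close(fd)


def run_workers(workers=4, times=500):
    """Fork `workers` processes that update the counter; return the final value."""
    fd, path = tempfile.mkstemp()
    os.write(fd, struct.pack("<I", 0))  # counter starts at zero
    os.close(fd)
    try:
        pids = []
        for _ in range(workers):
            pid = os.fork()
            if pid == 0:  # child process
                status = 1
                try:
                    locked_increment(path, times)
                    status = 0
                finally:
                    os._exit(status)  # never fall through into parent code
            pids.append(pid)
        for pid in pids:
            os.waitpid(pid, 0)
        # Read the result back through a shared mapping in the parent.
        fd = os.open(path, os.O_RDWR)
        try:
            shm = mmap.mmap(fd, 4)
            try:
                return struct.unpack_from("<I", shm, 0)[0]
            finally:
                shm.close()
        finally:
            os.close(fd)
    finally:
        os.unlink(path)
```

No mutex appears anywhere in this sketch, yet the read-modify-write on the shared counter is serialized. A tool that tracks only mutex acquisitions would see the two accesses as unprotected.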

(4) By Zu-Ming Jiang (jiang446079653) on 2020-05-03 01:59:01 in reply to 3 [link]

Yes, you are right. File locks are not considered.

Thanks for your response.

(5) By Zu-Ming Jiang (jiang446079653) on 2020-05-13 10:42:45 in reply to 3 [link]

**I think there may be some problem in your file locks.**

To check whether the race is real, I set breakpoints before these two accesses when they are running in the call stacks described above. I found that the breakpoints can be hit simultaneously, and that the addresses of the racing variables are the same. I think these results prove that **the race actually happens**, so these accesses are not serialized successfully.

(6) By Zu-Ming Jiang (jiang446079653) on 2020-05-14 12:25:30 in reply to 3

Could you confirm this race? Because I have successfully reproduced this race by using breakpoints, I think there may be some problems in the file locks.

(7) By Zu-Ming Jiang (jiang446079653) on 2020-05-19 10:29:00 in reply to 3 [link]

I think it is a real race because I have reproduced it. Please help me confirm it.

(8) By Wout Mertens (wmertens) on 2020-05-20 08:52:44 in reply to 5 [link]

Do you mean that you tested this with working file locks, and it triggers the breakpoints simultaneously?

I think the reason you're not getting replies is that it requires very specific knowledge to verify your findings, and most of us don't have it.

If you could post a step-by-step of how to reproduce, that would really help!