/ Check-in [337a0b67]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Update and add further detail to README-server-edition.html.
Downloads: Tarball | ZIP archive | SQL archive
Timelines: family | ancestors | descendants | both | server-process-edition
Files: files | file ages | folders
SHA3-256:337a0b67e30f1030fdc59f712e5914f4801b0e9e4ae19a1e82c10b73eb3f4773
User & Date: dan 2018-03-30 20:42:25
Context
2018-03-31
18:43
Fix an error in README-server-edition.html. Leaf check-in: 754ad35c user: dan tags: server-process-edition
2018-03-30
20:42
Update and add further detail to README-server-edition.html. check-in: 337a0b67 user: dan tags: server-process-edition
2018-03-28
15:41
Update this branch with latest trunk changes. check-in: df52e89f user: dan tags: server-process-edition
Changes
Hide Diffs Unified Diffs Ignore Whitespace Patch

Changes to README-server-edition.html.

1
2
3
4
5
6
7
8
9
10
11
12
13
14


15
16




17
18
19
20

21
22





23
24
25
26

27
28
29

30
31


32
33
34
35
36
37
38
39
40
41
42
43
..
76
77
78
79
80
81
82





83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101























































































































































































102
103
104
105
106
107
108







109
110
111
112
113
114
115
116
117
118
119
120
121
122
123


124
125
126
127
128
129


130
131
132
133
134
135


136
137
138
139
140



141


















142
143
144
145
146
147
148
149
<html>

<center>
  <h1> The "server-process-edition" Branch</h1>
</center>

<p>
The "server-process-edition" branch contains two modifications to stock SQLite
that work together to provide concurrent read/write transactions using
page-level-locking provided that:

<ul>
  <li><p> All clients are in the same process, and
  <li><p> The application uses "PRAGMA synchronous=off".


</ul>





<p>
Up to 16 simultaneous read/write transactions controlled by page-level-locking
are possible. Additionally, there may be any number of read-only transactions
started using "BEGIN READONLY" commands. Read-only transactions do not block

read-write transactions, and read-write transactions do not block read-only
transactions.






<p>
The two features on this branch are:
<ol>

  <li><p> An alternative layout for the database free-page list. This is intended
       to reduce contention between writers when allocating new database pages,
       either from the free-list or by extending the database file.


  <li><p> The "server-mode" extension, which provides the read/write


       page-level-locking and read-only MVCC concurrency mentioned above.
</ol>


<h2> Alternative Free-List Format </h2>

<p>
The alternative free-list format is very similar to the current format. It
differs in the following respects:

<ul>
  <li><p>The "total number of free pages" field in the db header is not
................................................................................

<p>
At present, the free-list format may only be modified when the free-list is
completely empty. Which, as the implementation ensures that a free-list that
uses the alternative format is never completely emptied, effectively precludes
changing the format from 2 (alternative) to 1 (legacy).






<h2> Page level locking - "Server Mode" </h2>

<p>
A database client automatically enters "server mode" if (a) it is using a VFS
that takes a process-wide exclusive lock on the db file (like "unix-excl"
does), and (b) there exists a directory named "&lt;database&gt;-journal" in the
file system alongside the database file "&lt;database&gt;" There is currently no
provision for creating this directory, although it could be safely done for
a database in rollback mode using something like:

<pre>
  PRAGMA journal_mode = off;
  BEGIN EXCLUSIVE;
    &lt;create directory&gt;
  END;
</pre>

<p>
To check the status of these two conditions, a new file-control is added -























































































































































































SQLITE_FCNTL_SERVER_MODE. SQLite invokes this file-control as part of the
procedure for detecting a hot journal (after it has established that there is a
file-system entry named &lt;database&gt;-journal and that no other process
holds a RESERVED lock). If the VFS does support an exclusive process-wide lock
and if the directory is present, the VFS indicates that the client should enter
server mode. If the VFS does not indicate this, or if it returns
SQLITE_NOTFOUND, then SQLite proceeds with the hot-journal rollback.








<p>
There is also a new file-control named SQLITE_FCNTL_FILEID, which requests a
128-bit value that uniquely identifies an open file on disk from the VFS. This
is used to ensure that all connections to the same database from within a
process use the same shared state, even if they connect to the db using
different file-system paths.

<p>
The heap-memory data structure shared between all connections to the same
database is protected by a mutex. Clients take and release the mutex each
time a transaction is opened or closed, and each time a read or write lock
is taken on a specific database page. Ordinary read/write transactions lock
each page that they access - each page can support any number of concurrent
read locks or a single write lock. 



<p>
Write transactions use a journal file stored in the &lt;database&gt;-journal 
directory. Journal files are named "&lt;id&gt;-journal", where &lt;id&gt; is an
integer value betwen 0 and 15, inclusive. A client may use multiple different
journal files throughout its lifetime.



<p>Before database pages are overwritten in server-mode, entries are added to
an in-memory hash table containing the old page content. These entries are
used by read-only transactions to ensure that they access a consistent snapshot
of the database. Hash table entries are automatically removed when they are
no longer required.



<p>
It is not difficult to extend the kind of page level locking used by read/write
transactions to clients in multiple processes. It might be more difficult to
extend the read-only MVCC capability though.






















<h2> Performance Test </h2>

<p>
The test uses a single table with the following schema:

<pre>
  CREATE TABLE t1(a INTEGER PRIMARY KEY, b BLOB(16), c BLOB(16), d BLOB(400));
  CREATE INDEX i1 ON t1(b);







|
|
|


|
|
>
>


>
>
>
>


|
<
>
|
<
>
>
>
>
>




>
|
|
|
>

<
>
>
|



|







 







>
>
>
>
>
|


|
<
|
|
|
|








<
<
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
|
|
|
<
<
<
<
>
>
>
>
>
>
>

<
|
|
|
|
|

<
<
<
<
<
<
<
>
>

<
<
<
<
<
>
>

<
<
<
<
<
>
>

<
<
<
<
>
>
>

>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
|







1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25

26
27

28
29
30
31
32
33
34
35
36
37
38
39
40
41
42

43
44
45
46
47
48
49
50
51
52
53
54
55
56
..
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104

105
106
107
108
109
110
111
112
113
114
115
116


117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302




303
304
305
306
307
308
309
310

311
312
313
314
315
316







317
318
319





320
321
322





323
324
325




326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
<html>

<center>
  <h1> The "server-process-edition" Branch</h1>
</center>

<p>
The "server-process-edition" branch contains two modifications to stock 
SQLite that work together to provide concurrent read/write transactions 
using pessimistic page-level-locking. The system runs in two modes:

<ul>
  <li><p> Single-process mode - where all clients must be within the
          same address space, and 
  <li><p> Multi-process mode - where clients may be distributed between
          multiple OS processes.
</ul>

<p> The system is designed to be most efficient when used with 
<a href="https://www.sqlite.org/pragma.html#pragma_synchronous">
  "PRAGMA synchronous=OFF"</a>, although it does not require this.

<p>
Up to 16 simultaneous read/write transactions controlled by page-level-locking
are possible. Additionally, in single-process mode there may be any number of

read-only transactions started using the "BEGIN READONLY" command. Read-only
transactions do not block read-write transactions, and read-write transactions

do not block read-only transactions. Read-only transactions access a consistent
snapshot of the database - writes committed by other clients after the
transaction has started are never visible to read-only transactions. In
multi-process mode, the "BEGIN READONLY" command is equivalent to a stock
"BEGIN".

<p>
The two features on this branch are:
<ol>
  <li><p> An 
    <a href=#freelist>alternative layout for the database free-page list</a>. 
    This is intended to reduce contention between writers when allocating
    new database pages, either from the free-list or by extending the
    database file.


  <li><p> The <a href=#servermode>"server-mode" extension</a>, which
    provides read/write page-level-locking concurrency and (in
    single-process mode) read-only MVCC concurrency mentioned above.
</ol>


<h2 id=freelist> 1.0 Alternative Free-List Format </h2>

<p>
The alternative free-list format is very similar to the current format. It
differs in the following respects:

<ul>
  <li><p>The "total number of free pages" field in the db header is not
................................................................................

<p>
At present, the free-list format may only be modified when the free-list is
completely empty. Which, as the implementation ensures that a free-list that
uses the alternative format is never completely emptied, effectively precludes
changing the format from 2 (alternative) to 1 (legacy).

<p>
For databases that use the "alternative" free-list format, the read and write
versions in the database header (byte offsets 18 and 19) are set to 3 for
rollback mode or 4 for wal mode (instead of 1 and 2 respectively).

<h2 id=servermode> 2.0 Page level locking - "Server Mode" </h2>

<p>
A database client automatically enters "server mode" if there exists a

<i>directory</i> named "&lt;database&gt;-journal" in the file system alongside
the database file "&lt;database&gt;" There is currently no provision for
creating this directory, although it could be safely done for a database in
rollback mode using something like:

<pre>
  PRAGMA journal_mode = off;
  BEGIN EXCLUSIVE;
    &lt;create directory&gt;
  END;
</pre>



<p> As well as signalling new clients that they should enter server-mode,
creating a directory named "&lt;database&gt;-journal" has the helpful 
side-effect of preventing legacy clients from accessing the database file at
all.

<p> If the VFS is one that takes an exclusive lock on the db file (to 
guarantee that no other process accesses the db file), then the system
automatically enters single-process mode. Otherwise, multi-process mode.

<p> In both single and multi-process modes, page-level-locking is managed
by allocating a fixed-size array of "locking slots". Each locking slot is
32-bits in size. By default, the array contains 262144 (2^18) slots. Pages are
assigned to locking slots using the formula (pgno % 262144) - so pages 1,
262145, 524289 etc. share a single locking slot.

<p> In single-process mode, the array of locking slots is allocated on 
the process heap and access is protected by a mutex. In multi-process mode, it
is created by memory-mapping a file on disk (similar to the *-shm file in
SQLite wal mode) and access is performed using 
<a href="https://en.wikipedia.org/wiki/Compare-and-swap">atomic CAS
  primitives</a> exclusively.

<p> Each time a read/write transaction is opened, the client assumes a client 
id between 0 and 15 for the duration of the transaction. Client ids are unique
at any point in time - concurrently executing transactions must use different
client ids. So there may exist a maximum of 16 concurrent read/write
transactions at any one time.

<p> Read/write transactions in server-mode are similar to regular SQLite
transactions in rollback mode. The most significant differences are that:

<ul>
  <li> <p>Instead of using journal file &lt;database&gt;-journal, server-mode
    clients use &lt;database&gt;-journal/&lt;client-id&gt;-journal. If 
    there are multiple concurrent transactions, each uses a separate 
    journal file.

    <li> <p>No database-wide lock is taken. Instead, individual read and write
    locks are taken on the pages accessed by the transaction.
</ul>

<p> Each locking slot is 32-bits in size. A locking slot may simultaneously
support a single write-lock, up to 16 read-locks from read/write clients, and
(in single process mode) up 1024 read-locks from "BEGIN READONLY" clients.
Locking slot bits are used as follows:

<ul>
  <li> <p> The least-significant 16-bits are used for read-locks taken by
    read/write clients. To take a read-lock, bit &lt;client-id&gt; of the
    locking slot is set.

  <li> <p> The next 5 bytes are used for the write-lock. If no write-lock
    is held on the slot, then this 5 byte integer is set to 0. Otherwise,
    it is set to (<i>C</i> + 1), where <i>C</i> is the &lt;client-id&gt; of
    the client holding the write-lock.

  <li> <p> The next 10 bits contain the total number of read-locks held by
    "BEGIN READONLY" clients on the locking slot. See the section below
    for a description of how these are used.
</ul>

<p> Currently, if a client requests a lock that cannot be granted due to
a conflicting lock, SQLITE_BUSY is returned to the caller and either the
entire transaction or statement transaction must be rolled back. See 
<a href=#problems>Problems and Issues</a> below for more details.

<h3> 2.1 Single-Process Mode </h3>

<p> Single process mode is simpler than multi-process mode because it does
not have to deal with runtime client failure - it is assumed that if one
client fails mid-transaction the entire process crashes. As a result the 
only time hot-journal rollback is required in single-process mode is as
part of startup. The first client to connect to a database in single-process
mode attempts to open and rollback all 16 potential hot journal files.

<p> But, in order to support non-blocking "BEGIN READONLY" transactions, it is
also in some ways more complicated than multi-process mode. "BEGIN READONLY"
support works as follows:

<ul>
  <li> <p>Clients executing "BEGIN READONLY" transactions are not assigned
    a &lt;client-id&gt;. Instead, they have a transaction-id that is unique
    within the lifetime of the process. Transaction-ids are assigned using a 
    monotonically increasing function.

  <li> <p>In single-process mode, writers never spill the cache mid-transaction.
    Data is only written to the database as part of committing a transaction.

  <li> <p>As well as writing the contents of overwritten pages out to the journal
    file, a writer in single-process mode also accumulates a list of buffers
    containing the original data for each page overwritten by the current
    transaction in main-memory.

  <li> <p>When a transaction is to be committed, a writer first obtains a
    transaction-id (in the same way as a BEGIN READONLY client) and then adds
    all of its "old data" buffers to a hash table accessible to all database
    clients. Associated with each hash table entry is the newly assigned
    transaction-id. It then waits (spin-locks) for all "BEGIN READONLY"
    read-locks to clear on all pages that will be written out by the
    transaction. Following this, it commits the transaction as normal (writes
    out the dirty pages and zeroes the journal file header).

  <li> <p>When a "BEGIN READONLY" transaction reads a page, it first checks
    the aforementioned hash table for a suitable entry. A suitable entry
    is one with the right page-number and a transaction-id greater than that
    of the "BEGIN READONLY" transaction (i.e. one added to the hash table 
    <i>after</i> the BEGIN READONLY transaction started). If such an entry
    can be found, the client uses the associated data instead of reading
    from the db file. Or, if no such entry is found, the client:
    <ol>
      <li> Increments the number of BEGIN READONLY read-locks on the page.
      <li> Reads the contents of the page from the database file.
      <li> Decrements the number of BEGIN READONLY read-locks on the page.
    </ol>
    <p> The mutex used to protect access to the array of locking slots and
    the shared hash table is relinquished for step 2 above.

  <li> <p>After each transaction is commited in single-process mode, the 
    client searches the hash table for entries that can be discarded. An
    entry can be discarded if it has a transaction-id older than any still
    in use (either by BEGIN READONLY transactions or committers).
</ul>

<h3> 2.2 Multi-Process Mode </h3>

<p> Multi-process mode differs from single-process mode in two important ways:

<ul>
  <li> <p>Individual clients may fail mid-transaction and the system must recover
    from this.

  <li> <p>Partly as a consequence of the above, there are no convenient 
    primitives like mutexes or malloc() with which to build complicated data
    structures like the hash-table used in single-process mode. As a result,
    there is no support for "BEGIN READONLY" transactions in multi-process
    mode.  
</ul>

<p> Unlike single-process mode clients, which may be assigned a different
client-id for each transaction, clients in multi-process mode are assigned a
client-id when they connect to the database and do not relinquish it until
they disconnect. As such, a database in multi-process server-mode supports
at most 16 concurrent client connections.

<p> As well as the array of locking slots, the shared-memory mapping used
by clients in multi-process mode contains 16 "client slots". When a client
connects, it takes a posix WRITE lock on the client slot that corresponds
to its client id. This lock is not released until the client disconnects.
Additionally, whenever a client starts a transaction, it sets the value 
in its client locking slot to 1, and clears it again after the transaction
is concluded.

<p> This assists with handling client failure mid-transaction in two ways:

<ul>
  <li><p> If client A cannot obtain a lock due to a conflicting lock held by
    client B, it can check whether or not client B has failed by attempting a
    WRITE lock on its client locking slot. If successful, then client B must
    have failed and client A may: 
    <ul>
      <li> Roll back client B's journal, and
      <li> By iterating through the entire locking slot array, release all
        locks held by client B when it failed.
    </ul>

  <li><p> When a client first connects and locks its client locking slot, it
    can check whether or not the previous user of the client locking slot failed
    mid-transaction (since if it did, the locking slot value will still be
    non-zero). If it did, the new owner of the client locking slot can release
    any locks and roll back any hot-journal before proceeding.
</ul>

<h3> 2.3 Required VFS Support </h3>

<p> The server-mode extension requires that the VFS support various special
file-control commands. Currently support is limited to the "unix" VFS.

<dl>
  <dt> SQLITE_FCNTL_SERVER_MODE
  <dd><p> This is used by SQLite to query the VFS as to whether the
  connection should use single-process server-mode, multi-process server-mode,
  or continue in legacy mode.

  <p>SQLite invokes this file-control as part of the procedure for detecting a
  hot journal (after it has established that there is a file-system entry named
  &lt;database&gt;-journal and that no other process holds a RESERVED lock).




  If the &lt;database&gt;-journal directory is present in the file-system and
  the current VFS takes an exclusive lock on the database file (i.e. is
  "unix-excl"), then this file-control indicates that the connection should use
  single-process server-mode. Or, if the directory exists but the VFS does not
  take an exclusive lock on the database file, that the connection should use
  multi-proces server-mode. Or, if there is no directory of the required name,
  that the connection should use legacy mode.


  <dt> SQLITE_FCNTL_FILEID
  <dd><p> Return a 128-bit value that uniquely identifies an open file on disk
  from the VFS. This is used to ensure that all connections to the same
  database from within a process use the same shared state, even if they
  connect to the db using different file-system paths.








  <dt> SQLITE_FCNTL_SHMOPEN
  <dd>






  <dt> SQLITE_FCNTL_SHMOPEN2
  <dd>






  <dt> SQLITE_FCNTL_SHMLOCK
  <dd>





  <dt> SQLITE_FCNTL_SHMCLOSE
  <dd>
</dl>


<h2 id=problems> 3.0 Problems and Issues </h2>

<ul>

  <li> <p>Writer starvation might be the biggest issue. How can it be
       prevented?

  <li> <p>Blocking locks of some sort would likely improve things. The issue
       here is deadlock detection.

  <li> <p>The limit of 16 concurrent clients in multi-process mode could be 
       raised to 27 (since the locking-slot bits used for BEGIN READONLY
       locks in single-process mode can be reassigned to support more 
       read/write client read-locks).

</ul>

<h2> 4.0 Performance Test </h2>

<p>
The test uses a single table with the following schema:

<pre>
  CREATE TABLE t1(a INTEGER PRIMARY KEY, b BLOB(16), c BLOB(16), d BLOB(400));
  CREATE INDEX i1 ON t1(b);

Changes to src/server.c.

136
137
138
139
140
141
142



143
144
145
146
147
148
149
...
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
...
220
221
222
223
224
225
226






227
228
229
230
231
232


233
234
235
236
237
238
239
  int iCommitId;                  /* Current commit id (or 0) */
  int nAlloc;                     /* Allocated size of aLock[] array */
  int nLock;                      /* Number of entries in aLock[] */
  u32 *aLock;                     /* Array of held locks */
  Server *pNext;                  /* Next in pCommit or pReader list */
};




struct ServerGlobal {
  ServerDb *pDb;                  /* Linked list of all ServerDb objects */
};
static struct ServerGlobal g_server;


struct ServerFcntlArg {
................................................................................
/*
** Possible values for Server.eTrans.
*/
#define SERVER_TRANS_NONE      0
#define SERVER_TRANS_READONLY  1
#define SERVER_TRANS_READWRITE 2

#define SERVER_WRITE_LOCK 3
#define SERVER_READ_LOCK  2
#define SERVER_NO_LOCK    1

/*
** Global mutex functions used by code in this file.
*/
static void serverEnterMutex(void){
  sqlite3_mutex_enter(sqlite3MutexAlloc(SQLITE_MUTEX_STATIC_APP1));
}
static void serverLeaveMutex(void){
................................................................................
    p->nClient++;
  }
  pNew->pDb = p;
  serverLeaveMutex();
  return rc;
}







static int serverClientRollback(Server *p, int iClient){
  ServerDb *pDb = p->pDb;
  ServerJournal *pJ = &pDb->aJrnl[iClient];
  int bExist = 1;
  int rc = SQLITE_OK;



  if( fdOpen(pJ->jfd)==0 ){
    bExist = 0;
    rc = sqlite3OsAccess(pDb->pVfs, pJ->zJournal, SQLITE_ACCESS_EXISTS,&bExist);
    if( bExist && rc==SQLITE_OK ){
      int flags = SQLITE_OPEN_READWRITE|SQLITE_OPEN_MAIN_JOURNAL;
      rc = sqlite3OsOpen(pDb->pVfs, pJ->zJournal, pJ->jfd, flags, &flags);
    }







>
>
>







 







<
<
<
<







 







>
>
>
>
>
>






>
>







136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
...
160
161
162
163
164
165
166




167
168
169
170
171
172
173
...
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
  int iCommitId;                  /* Current commit id (or 0) */
  int nAlloc;                     /* Allocated size of aLock[] array */
  int nLock;                      /* Number of entries in aLock[] */
  u32 *aLock;                     /* Array of held locks */
  Server *pNext;                  /* Next in pCommit or pReader list */
};

/*
** Global variables used by this module.
*/
struct ServerGlobal {
  ServerDb *pDb;                  /* Linked list of all ServerDb objects */
};
static struct ServerGlobal g_server;


struct ServerFcntlArg {
................................................................................
/*
** Possible values for Server.eTrans.
*/
#define SERVER_TRANS_NONE      0
#define SERVER_TRANS_READONLY  1
#define SERVER_TRANS_READWRITE 2





/*
** Global mutex functions used by code in this file.
*/
static void serverEnterMutex(void){
  sqlite3_mutex_enter(sqlite3MutexAlloc(SQLITE_MUTEX_STATIC_APP1));
}
static void serverLeaveMutex(void){
................................................................................
    p->nClient++;
  }
  pNew->pDb = p;
  serverLeaveMutex();
  return rc;
}

/*
** Roll back journal iClient. This is a hot-journal rollback - the
** connection passed as the first argument does not currently have an
** open transaction that uses the journal (although it may have an
** open transaction that uses some other journal).
*/
static int serverClientRollback(Server *p, int iClient){
  ServerDb *pDb = p->pDb;
  ServerJournal *pJ = &pDb->aJrnl[iClient];
  int bExist = 1;
  int rc = SQLITE_OK;

  /* If it is not exists on disk but is not already open, open the
  ** journal file in question. */
  if( fdOpen(pJ->jfd)==0 ){
    bExist = 0;
    rc = sqlite3OsAccess(pDb->pVfs, pJ->zJournal, SQLITE_ACCESS_EXISTS,&bExist);
    if( bExist && rc==SQLITE_OK ){
      int flags = SQLITE_OPEN_READWRITE|SQLITE_OPEN_MAIN_JOURNAL;
      rc = sqlite3OsOpen(pDb->pVfs, pJ->zJournal, pJ->jfd, flags, &flags);
    }