Documentation Source Text

Changes On Branch rm-requirements-marks
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Changes In Branch rm-requirements-marks Excluding Merge-Ins

This is equivalent to a diff from c13d10c92b to 4f2e9c3ad9

2024-04-29
17:33
First attempt at a document explaining the advantages of using a bytecode representation for prepare statement versus a tree-of-objects representation. (check-in: cb56cf7dc4 user: drh tags: trunk)
2024-04-20
09:25
Remove all stray ^ characters from fileformat2.in, as reported in the forum. Let's defer doing this for the time being. We may start paying attention to the requirements marks again in the future. (Closed-Leaf check-in: 4f2e9c3ad9 user: stephan tags: rm-requirements-marks)
2024-04-16
17:31
Add documentation for the fts5-tokenize-blob feature. (check-in: c13d10c92b user: dan tags: trunk)
17:29
Add documentation for the fts5-tokenize-blob feature. (Closed-Leaf check-in: 98b0894622 user: dan tags: fts5-tokenize-blob)
16:29
Fix a few misspellings and grammos. (check-in: 4159d99277 user: larrybr tags: trunk)

Changes to pages/fileformat2.in.

30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
scenarios and so are uncommon, but they are part of the state of an SQLite
database and so cannot be ignored.  This document defines the format
of a rollback journal and the write-ahead log file, but the focus is
on the main database file.</p>

<h2>Pages</h2>

<p>The main database file consists of one or more pages.  ^The size of a
page is a power of two between 512 and 65536 inclusive.  All pages within
the same database are the same size.  ^The page size for a database file
is determined by the 2-byte integer located at an offset of
16 bytes from the beginning of the database file.</p>

<p>Pages are numbered beginning with 1.  The maximum page number is
4294967294 (2<sup><small>32</small></sup> - 2).  The minimum size
SQLite database is a single 512-byte page.
The maximum size database would be 4294967294 pages at 65536 bytes per







|

|







30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
scenarios and so are uncommon, but they are part of the state of an SQLite
database and so cannot be ignored.  This document defines the format
of a rollback journal and the write-ahead log file, but the focus is
on the main database file.</p>

<h2>Pages</h2>

<p>The main database file consists of one or more pages.  The size of a
page is a power of two between 512 and 65536 inclusive.  All pages within
the same database are the same size.  The page size for a database file
is determined by the 2-byte integer located at an offset of
16 bytes from the beginning of the database file.</p>

<p>Pages are numbered beginning with 1.  The maximum page number is
4294967294 (2<sup><small>32</small></sup> - 2).  The minimum size
SQLite database is a single 512-byte page.
The maximum size database would be 4294967294 pages at 65536 bytes per
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
<li>An index b-tree leaf page
</ul>
<li>A payload overflow page
<li>A pointer map page
</ul>
</p>

<p>^All reads from and writes to the main database file begin at a page
boundary and all writes are an integer number of pages in size.  ^Reads
are also usually an integer number of pages in size, with the one exception
that when the database is first opened, the first 100 bytes of the
database file (the database file header) are read as a sub-page size unit.</p>

<p>^Before any information-bearing page of the database is modified, 
the original unmodified content of that page is written into the 
rollback journal.  If a transaction is interrupted and needs to be 
rolled back, the rollback journal can then be used to restore the
database to its original state.  ^Freelist leaf pages bear no
information that would need to be restored on a rollback and so they
are not written to the journal prior to modification, in order to
reduce disk I/O.</p>

<tcl>hd_fragment database_header {database header}</tcl>
<h2>The Database Header</h2>








|
|




|



|







69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
<li>An index b-tree leaf page
</ul>
<li>A payload overflow page
<li>A pointer map page
</ul>
</p>

<p>All reads from and writes to the main database file begin at a page
boundary and all writes are an integer number of pages in size.  Reads
are also usually an integer number of pages in size, with the one exception
that when the database is first opened, the first 100 bytes of the
database file (the database file header) are read as a sub-page size unit.</p>

<p>Before any information-bearing page of the database is modified, 
the original unmodified content of that page is written into the 
rollback journal.  If a transaction is interrupted and needs to be 
rolled back, the rollback journal can then be used to restore the
database to its original state.  Freelist leaf pages bear no
information that would need to be restored on a rollback and so they
are not written to the journal prior to modification, in order to
reduce disk I/O.</p>

<tcl>hd_fragment database_header {database header}</tcl>
<h2>The Database Header</h2>

149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
The [version-valid-for number].
<tr><td valign=top align=center>96<td valign=top align=center>4<td align=left>
[SQLITE_VERSION_NUMBER]
</table></center>

<h3>Magic Header String</h3>

<p>^Every valid SQLite database file begins with the following 16 bytes 
(in hex): 53 51 4c 69 74 65 20 66 6f 72 6d 61 74 20 33 00.  This byte sequence
corresponds to the UTF-8 string "SQLite format 3" including the nul
terminator character at the end.</p>

<h3>Page Size</h3>

<p>The two-byte value beginning at offset 16 determines the page size of 







|







149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
The [version-valid-for number].
<tr><td valign=top align=center>96<td valign=top align=center>4<td align=left>
[SQLITE_VERSION_NUMBER]
</table></center>

<h3>Magic Header String</h3>

<p>Every valid SQLite database file begins with the following 16 bytes 
(in hex): 53 51 4c 69 74 65 20 66 6f 72 6d 61 74 20 33 00.  This byte sequence
corresponds to the UTF-8 string "SQLite format 3" including the nul
terminator character at the end.</p>

<h3>Page Size</h3>

<p>The two-byte value beginning at offset 16 determines the page size of 
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285

<tcl>hd_fragment resbyte {reserve bytes}</tcl>
<h3>Reserved bytes per page</h3>

<p>SQLite has the ability to set aside a small number of extra bytes at
the end of every page for use by extensions.  These extra bytes are
used, for example, by the SQLite Encryption Extension to store a nonce
and/or cryptographic checksum associated with each page.  ^The 
"reserved space" size in the 1-byte integer at offset 20 is the number
of bytes of space at the end of each page to reserve for extensions.
This value is usually 0.  The value can be odd.</p>

<tcl>hd_fragment usable_size {usable size}</tcl>
<p>The "usable size" of a database page is the page size specified by the
2-byte integer at offset 16 in the header less the "reserved" space size
recorded in the 1-byte integer at offset 20 in the header.  The usable
size of a page might be an odd number.  ^(However, the usable size is not
allowed to be less than 480.  In other words, if the page size is 512,
then the reserved space size cannot exceed 32.)^</p>

<h3>Payload fractions</h3>

<p>^The maximum and minimum embedded payload fractions and the leaf
payload fraction values must be 64, 32, and 32.  These values were
originally intended to be tunable parameters that could be used to
modify the storage format of the b-tree algorithm.  However, that
functionality is not supported and there are no current plans to add
support in the future.  Hence, these three bytes are fixed at the
values specified.</p>

<h3>File change counter</h3>

<tcl>hd_fragment chngctr {change counter}</tcl>
<p>^The file change counter is a 4-byte big-endian integer at
offset 24 that is incremented whenever the database file is unlocked
after having been modified.
When two or more processes are reading the same database file, each 
process can detect database changes from other processes by monitoring 
the change counter.
A process will normally want to flush its database page cache when
another process modified the database, since the cache has become stale.
The file change counter facilitates this.</p>

<p>In WAL mode, changes to the database are detected using the wal-index
and so the change counter is not needed.  Hence, the change counter might
not be incremented on each transaction in WAL mode.</p>

<h3>In-header database size</h3>

<tcl>hd_fragment filesize {in-header database size}</tcl>
<p>^The 4-byte big-endian integer at offset 28 into the header 
stores the size of the database file in pages.  ^If this in-header
datasize size is not valid (see the next paragraph), then the database 
size is computed by looking
at the actual size of the database file. Older versions of SQLite
ignored the in-header database size and used the actual file size
exclusively.  ^Newer versions of SQLite use the in-header database
size if it is available but fall back to the actual file size if
the in-header database size is not valid.</p>

<p>^The in-header database size is only considered to be valid if
it is non-zero and if the 4-byte [change counter] at offset 24
exactly matches the 4-byte [version-valid-for number] at offset 92.
^(The in-header database size is always valid 
when the database is only modified using recent versions of SQLite,
versions 3.7.0 ([dateof:3.7.0]) and later.)^
If a legacy version of SQLite writes to the database, it will not
know to update the in-header database size and so the in-header
database size could be incorrect.  But legacy versions of SQLite
will also leave the version-valid-for number at offset 92 unchanged
so it will not match the change-counter.  Hence, invalid in-header
database sizes can be detected (and ignored) by observing when
the change-counter does not match the version-valid-for number.</p>

<h3>Free page list</h3>

<p>Unused pages in the database file are stored on a freelist.  ^The
4-byte big-endian integer at offset 32 stores the page number of
the first page of the freelist, or zero if the freelist is empty.
^The 4-byte big-endian integer at offset 36 stores the total 
number of pages on the freelist.</p>

<h3>Schema cookie</h3>

<p>^The schema cookie is a 4-byte big-endian integer at offset 40
that is incremented whenever the database schema changes.  A 
prepared statement is compiled against a specific version of the
database schema.  When the database schema changes, the statement
must be reprepared.  ^When a prepared statement runs, it first checks
the schema cookie to ensure the value is the same as when the statement
was prepared and if the schema cookie has changed, the statement either
automatically reprepares and reruns or it aborts with an [SQLITE_SCHEMA] 
error.</p>

<tcl>hd_fragment {schemaformat} {schema format} {schema format number}</tcl>
<h3>Schema format number</h3>







|








|

|



|










|
















|
|




|



|


|

|










|


|




|



|







190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285

<tcl>hd_fragment resbyte {reserve bytes}</tcl>
<h3>Reserved bytes per page</h3>

<p>SQLite has the ability to set aside a small number of extra bytes at
the end of every page for use by extensions.  These extra bytes are
used, for example, by the SQLite Encryption Extension to store a nonce
and/or cryptographic checksum associated with each page.  The 
"reserved space" size in the 1-byte integer at offset 20 is the number
of bytes of space at the end of each page to reserve for extensions.
This value is usually 0.  The value can be odd.</p>

<tcl>hd_fragment usable_size {usable size}</tcl>
<p>The "usable size" of a database page is the page size specified by the
2-byte integer at offset 16 in the header less the "reserved" space size
recorded in the 1-byte integer at offset 20 in the header.  The usable
size of a page might be an odd number.  (However, the usable size is not
allowed to be less than 480.  In other words, if the page size is 512,
then the reserved space size cannot exceed 32.)</p>

<h3>Payload fractions</h3>

<p>The maximum and minimum embedded payload fractions and the leaf
payload fraction values must be 64, 32, and 32.  These values were
originally intended to be tunable parameters that could be used to
modify the storage format of the b-tree algorithm.  However, that
functionality is not supported and there are no current plans to add
support in the future.  Hence, these three bytes are fixed at the
values specified.</p>

<h3>File change counter</h3>

<tcl>hd_fragment chngctr {change counter}</tcl>
<p>The file change counter is a 4-byte big-endian integer at
offset 24 that is incremented whenever the database file is unlocked
after having been modified.
When two or more processes are reading the same database file, each 
process can detect database changes from other processes by monitoring 
the change counter.
A process will normally want to flush its database page cache when
another process modified the database, since the cache has become stale.
The file change counter facilitates this.</p>

<p>In WAL mode, changes to the database are detected using the wal-index
and so the change counter is not needed.  Hence, the change counter might
not be incremented on each transaction in WAL mode.</p>

<h3>In-header database size</h3>

<tcl>hd_fragment filesize {in-header database size}</tcl>
<p>The 4-byte big-endian integer at offset 28 into the header 
stores the size of the database file in pages.  If this in-header
datasize size is not valid (see the next paragraph), then the database 
size is computed by looking
at the actual size of the database file. Older versions of SQLite
ignored the in-header database size and used the actual file size
exclusively.  Newer versions of SQLite use the in-header database
size if it is available but fall back to the actual file size if
the in-header database size is not valid.</p>

<p>The in-header database size is only considered to be valid if
it is non-zero and if the 4-byte [change counter] at offset 24
exactly matches the 4-byte [version-valid-for number] at offset 92.
(The in-header database size is always valid 
when the database is only modified using recent versions of SQLite,
versions 3.7.0 ([dateof:3.7.0]) and later.)
If a legacy version of SQLite writes to the database, it will not
know to update the in-header database size and so the in-header
database size could be incorrect.  But legacy versions of SQLite
will also leave the version-valid-for number at offset 92 unchanged
so it will not match the change-counter.  Hence, invalid in-header
database sizes can be detected (and ignored) by observing when
the change-counter does not match the version-valid-for number.</p>

<h3>Free page list</h3>

<p>Unused pages in the database file are stored on a freelist.  The
4-byte big-endian integer at offset 32 stores the page number of
the first page of the freelist, or zero if the freelist is empty.
The 4-byte big-endian integer at offset 36 stores the total 
number of pages on the freelist.</p>

<h3>Schema cookie</h3>

<p>The schema cookie is a 4-byte big-endian integer at offset 40
that is incremented whenever the database schema changes.  A 
prepared statement is compiled against a specific version of the
database schema.  When the database schema changes, the statement
must be reprepared.  When a prepared statement runs, it first checks
the schema cookie to ensure the value is the same as when the statement
was prepared and if the schema cookie has changed, the statement either
automatically reprepares and reruns or it aborts with an [SQLITE_SCHEMA] 
error.</p>

<tcl>hd_fragment {schemaformat} {schema format} {schema format number}</tcl>
<h3>Schema format number</h3>
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
[ALTER TABLE | ALTER TABLE ... ADD COLUMN] functionality.  Support for
reading and writing format 2 was added in SQLite 
[version 3.1.3] on [dateof:3.1.3].</li>
<li value=3>Format 3 adds the ability of extra columns added by
[ALTER TABLE | ALTER TABLE ... ADD COLUMN] to have non-NULL default
values.  This capability was added in SQLite [version 3.1.4]
on [dateof:3.1.4].</li>
<li value=4>^Format 4 causes SQLite to respect the
[descending indexes | DESC keyword] on
index declarations.  (^The DESC keyword is ignored in indexes for 
formats 1, 2, and 3.)
^Format 4 also adds two new boolean record type values ([serial types]
8 and 9).  Support for format 4 was added in SQLite 3.3.0 on
2006-01-10.</li>
</ol>

<p>^New database files created by SQLite use format 4 by default.
^The [legacy_file_format pragma] can be used to cause SQLite
to create new database files using format 1.
The format version number can be made to default to 1 instead of 4 by
setting [SQLITE_DEFAULT_FILE_FORMAT]=1 at compile-time.
</p>

<h3>Suggested cache size</h3>

<p>The 4-byte big-endian signed integer at offset 48 is the suggested
cache size in pages for the database file.  The value is a suggestion
only and SQLite is under no obligation to honor it.  The absolute value
of the integer is used as the suggested size.  The suggested cache size
can be set using the [default_cache_size pragma].</p>

<h3>Incremental vacuum settings</h3>

<p>The two 4-byte big-endian integers at offsets 52 and 64 are used
to manage the [auto_vacuum] and [incremental_vacuum] modes.  ^If
the integer at offset 52 is zero then pointer-map (ptrmap) pages are
omitted from the database file and neither auto_vacuum nor
incremental_vacuum are supported.  ^If the integer at offset 52 is
non-zero then it is the page number of the largest root page in the
database file, the database file will contain ptrmap pages, and the
mode must be either auto_vacuum or incremental_vacuum.  ^In this latter
case, the integer at offset 64 is true for incremental_vacuum and
false for auto_vacuum.  ^If the integer at offset 52 is zero then
the integer at offset 64 must also be zero.</p>

<tcl>hd_fragment enc {text encoding}</tcl>
<h3>Text encoding</h3>

<p>^The 4-byte big-endian integer at offset 56 determines the encoding
used for all text strings stored in the database.  
^A value of 1 means UTF-8.
^A value of 2 means UTF-16le.
^A value of 3 means UTF-16be.
No other values are allowed.
^(The sqlite3.h header file defines C-preprocessor macros SQLITE_UTF8 as 1,
SQLITE_UTF16LE as 2, and SQLITE_UTF16BE as 3, to use in place of
the numeric codes for the text encoding.)^</p>

<h3>User version number</h3>

<p>^The 4-byte big-endian integer at offset 60 is the user version which
is set and queried by the [user_version pragma].  The user version is
not used by SQLite.</p>

<tcl>hd_fragment appid {Application ID}</tcl>
<h3>Application ID</h3>

<p>^The 4-byte big-endian integer at offset 68 is an "Application ID" that
can be set by the [PRAGMA application_id] command in order to identify the
database as belonging to or associated with a particular application.
The application ID is intended for database files used as an
[application file-format].  The application ID can be used by utilities 
such as [http://www.darwinsys.com/file/ | file(1)] to determine the specific
file type rather than just reporting "SQLite3 Database".  A list of
assigned application IDs can be seen by consulting the
[https://www.sqlite.org/src/artifact?ci=trunk&filename=magic.txt|magic.txt]
file in the SQLite source repository.</p>

<tcl>hd_fragment validfor {version-valid-for number}</tcl>
<h3>Write library version number and version-valid-for number</h3>

<p>^The 4-byte big-endian integer at offset 96 stores the 
[SQLITE_VERSION_NUMBER] value for the SQLite library that most
recently modified the database file.  ^The 4-byte big-endian integer at
offset 92 is the value of the [change counter] when the version number
was stored.  The integer at offset 92 indicates which transaction
the version number is valid for and is sometimes called the
"version-valid-for number".

<h3>Header space reserved for expansion</h3>








|

|

|




|
|
















|


|


|

|





|

|
|
|

|

|



|






|













|

|







298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
[ALTER TABLE | ALTER TABLE ... ADD COLUMN] functionality.  Support for
reading and writing format 2 was added in SQLite 
[version 3.1.3] on [dateof:3.1.3].</li>
<li value=3>Format 3 adds the ability of extra columns added by
[ALTER TABLE | ALTER TABLE ... ADD COLUMN] to have non-NULL default
values.  This capability was added in SQLite [version 3.1.4]
on [dateof:3.1.4].</li>
<li value=4>Format 4 causes SQLite to respect the
[descending indexes | DESC keyword] on
index declarations.  (The DESC keyword is ignored in indexes for 
formats 1, 2, and 3.)
Format 4 also adds two new boolean record type values ([serial types]
8 and 9).  Support for format 4 was added in SQLite 3.3.0 on
2006-01-10.</li>
</ol>

<p>New database files created by SQLite use format 4 by default.
The [legacy_file_format pragma] can be used to cause SQLite
to create new database files using format 1.
The format version number can be made to default to 1 instead of 4 by
setting [SQLITE_DEFAULT_FILE_FORMAT]=1 at compile-time.
</p>

<h3>Suggested cache size</h3>

<p>The 4-byte big-endian signed integer at offset 48 is the suggested
cache size in pages for the database file.  The value is a suggestion
only and SQLite is under no obligation to honor it.  The absolute value
of the integer is used as the suggested size.  The suggested cache size
can be set using the [default_cache_size pragma].</p>

<h3>Incremental vacuum settings</h3>

<p>The two 4-byte big-endian integers at offsets 52 and 64 are used
to manage the [auto_vacuum] and [incremental_vacuum] modes.  If
the integer at offset 52 is zero then pointer-map (ptrmap) pages are
omitted from the database file and neither auto_vacuum nor
incremental_vacuum are supported.  If the integer at offset 52 is
non-zero then it is the page number of the largest root page in the
database file, the database file will contain ptrmap pages, and the
mode must be either auto_vacuum or incremental_vacuum.  In this latter
case, the integer at offset 64 is true for incremental_vacuum and
false for auto_vacuum.  If the integer at offset 52 is zero then
the integer at offset 64 must also be zero.</p>

<tcl>hd_fragment enc {text encoding}</tcl>
<h3>Text encoding</h3>

<p>The 4-byte big-endian integer at offset 56 determines the encoding
used for all text strings stored in the database.  
A value of 1 means UTF-8.
A value of 2 means UTF-16le.
A value of 3 means UTF-16be.
No other values are allowed.
(The sqlite3.h header file defines C-preprocessor macros SQLITE_UTF8 as 1,
SQLITE_UTF16LE as 2, and SQLITE_UTF16BE as 3, to use in place of
the numeric codes for the text encoding.)</p>

<h3>User version number</h3>

<p>The 4-byte big-endian integer at offset 60 is the user version which
is set and queried by the [user_version pragma].  The user version is
not used by SQLite.</p>

<tcl>hd_fragment appid {Application ID}</tcl>
<h3>Application ID</h3>

<p>The 4-byte big-endian integer at offset 68 is an "Application ID" that
can be set by the [PRAGMA application_id] command in order to identify the
database as belonging to or associated with a particular application.
The application ID is intended for database files used as an
[application file-format].  The application ID can be used by utilities 
such as [http://www.darwinsys.com/file/ | file(1)] to determine the specific
file type rather than just reporting "SQLite3 Database".  A list of
assigned application IDs can be seen by consulting the
[https://www.sqlite.org/src/artifact?ci=trunk&filename=magic.txt|magic.txt]
file in the SQLite source repository.</p>

<tcl>hd_fragment validfor {version-valid-for number}</tcl>
<h3>Write library version number and version-valid-for number</h3>

<p>The 4-byte big-endian integer at offset 96 stores the 
[SQLITE_VERSION_NUMBER] value for the SQLite library that most
recently modified the database file.  The 4-byte big-endian integer at
offset 92 is the value of the [change counter] when the version number
was stored.  The integer at offset 92 indicates which transaction
the version number is valid for and is sometimes called the
"version-valid-for number".

<h3>Header space reserved for expansion</h3>

397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
inclusive.  A database file that is less than or equal to 1073741824 bytes 
in size contains no lock-byte page.  A database file larger than
1073741824 contains exactly one lock-byte page.
</p>

<p>The lock-byte page is set aside for use by the operating-system specific
[VFS] implementation in implementing the database file locking primitives.
^SQLite does not use the lock-byte page.  ^The SQLite core 
will never read or write the lock-byte page,
though operating-system specific [VFS] 
implementations may choose to read or write bytes on the lock-byte 
page according to the 
needs and proclivities of the underlying system.  The unix and win32
[VFS] implementations that come built into SQLite do not write to the
lock-byte page, but third-party VFS implementations for







|







397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
inclusive.  A database file that is less than or equal to 1073741824 bytes 
in size contains no lock-byte page.  A database file larger than
1073741824 contains exactly one lock-byte page.
</p>

<p>The lock-byte page is set aside for use by the operating-system specific
[VFS] implementation in implementing the database file locking primitives.
SQLite does not use the lock-byte page.  The SQLite core 
will never read or write the lock-byte page,
though operating-system specific [VFS] 
implementations may choose to read or write bytes on the lock-byte 
page according to the 
needs and proclivities of the underlying system.  The unix and win32
[VFS] implementations that come built into SQLite do not write to the
lock-byte page, but third-party VFS implementations for
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
<p>The freelist is organized as a linked list of freelist trunk pages
with each trunk page containing page numbers for zero or more freelist
leaf pages.</p>

<p>A freelist trunk page consists of an array of 4-byte big-endian integers.
The size of the array is as many integers as will fit in the usable space
of a page.  The minimum usable space is 480 bytes so the array will always
be at least 120 entries in length.  ^The first integer on a freelist trunk
page is the page number of the next freelist trunk page in the list or zero 
if this is the last freelist trunk page.  ^The second integer on a freelist
trunk page is the number of leaf page pointers to follow.  
^(Call the second integer on a freelist trunk page L.
If L is greater than zero then integers with array indexes between 2 and
L+1 inclusive contain page numbers for freelist leaf pages.)^</p>

<p>Freelist leaf pages contain no information.  ^SQLite avoids reading or
writing freelist leaf pages in order to reduce disk I/O.</p>

<p>A bug in SQLite versions prior to 3.6.0 ([dateof:3.6.0])
caused the database to be
reported as corrupt if any of the last 6 entries in the freelist trunk page 
array contained non-zero values.  Newer versions of SQLite do not have
this problem.  ^However, newer versions of SQLite still avoid using the 
last six entries in the freelist trunk page array in order that database
files created by newer versions of SQLite can be read by older versions
of SQLite.</p>

<p>^The number of freelist pages is stored as a 4-byte big-endian integer
in the database header at an offset of 36 from the beginning of the file.
^The database header also stores the page number of the first freelist trunk
page as a 4-byte big-endian integer at an offset of 32 from the beginning
of the file.</p>

<tcl>hd_fragment btree {B*-Trees} {B-tree}</tcl>
<h2>B-tree Pages</h2>

<p>The b-tree algorithm provides key/data storage with unique and







|

|

|

|

|






|




|

|







428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
<p>The freelist is organized as a linked list of freelist trunk pages
with each trunk page containing page numbers for zero or more freelist
leaf pages.</p>

<p>A freelist trunk page consists of an array of 4-byte big-endian integers.
The size of the array is as many integers as will fit in the usable space
of a page.  The minimum usable space is 480 bytes so the array will always
be at least 120 entries in length.  The first integer on a freelist trunk
page is the page number of the next freelist trunk page in the list or zero 
if this is the last freelist trunk page.  The second integer on a freelist
trunk page is the number of leaf page pointers to follow.  
(Call the second integer on a freelist trunk page L.
If L is greater than zero then integers with array indexes between 2 and
L+1 inclusive contain page numbers for freelist leaf pages.)</p>

<p>Freelist leaf pages contain no information.  SQLite avoids reading or
writing freelist leaf pages in order to reduce disk I/O.</p>

<p>A bug in SQLite versions prior to 3.6.0 ([dateof:3.6.0])
caused the database to be
reported as corrupt if any of the last 6 entries in the freelist trunk page 
array contained non-zero values.  Newer versions of SQLite do not have
this problem.  However, newer versions of SQLite still avoid using the 
last six entries in the freelist trunk page array in order that database
files created by newer versions of SQLite can be read by older versions
of SQLite.</p>

<p>The number of freelist pages is stored as a 4-byte big-endian integer
in the database header at an offset of 36 from the beginning of the file.
The database header also stores the page number of the first freelist trunk
page as a 4-byte big-endian integer at an offset of 32 from the beginning
of the file.</p>

<tcl>hd_fragment btree {B*-Trees} {B-tree}</tcl>
<h2>B-tree Pages</h2>

<p>The b-tree algorithm provides key/data storage with unique and
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
uses more than one fourth of the available storage space on the page
and hence every internal page is able to store at least 4 keys.
The integer keys of table b-trees are never large enough to
require overflow, so key overflow only occurs on index b-trees.</p>

<p>Define the depth
of a leaf b-tree to be 1 and the depth of any interior b-tree to be one
more than the maximum depth of any of its children.  ^In a well-formed
database, all children of an interior b-tree have the same depth.</p>

<p>In an interior b-tree page, the pointers and keys logically alternate 
with a pointer on both ends. (The previous sentence is to be understood
conceptually - the actual layout of the keys and
pointers within the page is more complicated and will be described in
the sequel.)  All keys within the same page are unique and are logically
organized in ascending order from left to right.  (Again, this ordering
is logical, not physical.  The actual location of keys within the page
is arbitrary.) ^For any key X, pointers to the left
of a X refer to b-tree pages on which all keys are less than or equal to X.
^Pointers to the right of X refer to pages where all keys are 
greater than X.</p>

<p>Within an interior b-tree page, each key and the pointer to its
immediate left are combined into a structure called a "cell".  The
right-most pointer is held separately.  A leaf b-tree page has no
pointers, but it still uses the cell structure to hold keys for
index b-trees or keys and content for table b-trees.  Data is also







|









|

|







489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
uses more than one fourth of the available storage space on the page
and hence every internal page is able to store at least 4 keys.
The integer keys of table b-trees are never large enough to
require overflow, so key overflow only occurs on index b-trees.</p>

<p>Define the depth
of a leaf b-tree to be 1 and the depth of any interior b-tree to be one
more than the maximum depth of any of its children.  In a well-formed
database, all children of an interior b-tree have the same depth.</p>

<p>In an interior b-tree page, the pointers and keys logically alternate 
with a pointer on both ends. (The previous sentence is to be understood
conceptually - the actual layout of the keys and
pointers within the page is more complicated and will be described in
the sequel.)  All keys within the same page are unique and are logically
organized in ascending order from left to right.  (Again, this ordering
is logical, not physical.  The actual location of keys within the page
is arbitrary.) For any key X, pointers to the left
of a X refer to b-tree pages on which all keys are less than or equal to X.
Pointers to the right of X refer to pages where all keys are 
greater than X.</p>

<p>Within an interior b-tree page, each key and the pointer to its
immediate left are combined into a structure called a "cell".  The
right-most pointer is held separately.  A leaf b-tree page has no
pointers, but it still uses the cell structure to hold keys for
index b-trees or keys and content for table b-trees.  Data is also
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672

<p>The 100-byte database file header is found only on page 1, which is
always a table b-tree page.  All other b-tree pages in the database file
omit this 100-byte header.</p>

<p>The reserved region is an area of unused space at the end of every
page (except the locking page) that extensions can use to hold per-page
information.  ^The size of the reserved region is determined by the one-byte
unsigned integer found at an offset of 20 into the database file header.
The size of the reserved region is usually zero.</p>

<p>The b-tree page header is 8 bytes in size for leaf pages and 12
bytes for interior pages.  All multibyte values in the page header
are big-endian.
The b-tree page header is composed of the following fields:</p>

<center>
<i>B-tree Page Header Format</i><br>
<table border=1 width="80%">
<tr><th>Offset<th>Size<th>Description
<tr><td align=center valign=top>0<td align=center valign=top>1<td align=left>
^The one-byte flag at offset 0 indicating the b-tree page type.<ul>
<li>^A value of 2 (0x02) means the page is an interior index b-tree page.
<li>^A value of 5 (0x05) means the page is an interior table b-tree page.
<li>^A value of 10 (0x0a) means the page is a leaf index b-tree page.
<li>^A value of 13 (0x0d) means the page is a leaf table b-tree page.</ul>
^Any other value for the b-tree page type is an error.
<tr><td align=center valign=top>1<td align=center valign=top>2<td align=left>
^The two-byte integer at offset 1 gives the start of the
first freeblock on the page, or is zero if there are no freeblocks.
<tr><td align=center valign=top>3<td align=center valign=top>2<td align=left>
^The two-byte integer at offset 3 gives the number of cells on the page.
<tr><td align=center valign=top>5<td align=center valign=top>2<td align=left>
^(The two-byte integer at offset 5 designates the start of the cell content
area.  A zero value for this integer is interpreted as 65536.)^
<tr><td align=center valign=top>7<td align=center valign=top>1<td align=left>
^The one-byte integer at offset 7 gives the number of fragmented free
bytes within the cell content area.
<tr><td align=center valign=top>8<td align=center valign=top>4<td align=left>
^(The four-byte page number at offset 8 is the right-most pointer.  This
value appears in the header of interior b-tree pages only and is omitted from
all other pages.)^
</table></center>

<p>^The cell pointer array of a b-tree page immediately follows the b-tree
page header.  Let K be the number of cells on the btree.  ^The cell pointer
array consists of K 2-byte integer offsets to the cell contents.  ^The
cell pointers are arranged in key order with left-most cell (the cell with the
smallest key) first and the right-most cell (the cell with the largest
key) last.</p>

<p>Cell content is stored in the cell content region of the b-tree page.
SQLite strives to place cells as far toward the end of the b-tree page as
it can, in order to leave space for future growth of the cell pointer array.
The area in between the last cell pointer array entry and the beginning of
the first cell is the unallocated region.
</p>

<p>^If a page contains no cells (which is only possible for a root page
of a table that contains no rows) then the offset to the
cell content area will equal the page size minus the bytes of reserved space.
^(If the database uses a 65536-byte page size and the reserved space is zero
(the usual value for reserved space) then the cell content offset of an
empty page wants to be 65536.  
However, that integer is too large to be stored in a
2-byte unsigned integer, so a value of 0 is used in its place.)^

<p>A freeblock is a structure used to identify unallocated space within
a b-tree page.  Freeblocks are organized as a chain.  ^The first 2 bytes of
a freeblock are a big-endian integer which is the offset in the b-tree page
of the next freeblock in the chain, or zero if the freeblock is the last on
the chain.  ^The third and fourth bytes of each freeblock form
a big-endian integer which is the size of the freeblock in bytes, including
the 4-byte header.  ^Freeblocks are always connected in order 
of increasing offset.  ^The second field of the b-tree page header is the
offset of the first freeblock, or zero if there are no freeblocks on the
page.  ^In a well-formed b-tree page, there will always be at least one cell
before the first freeblock.</p>

<p>A freeblock requires at least 4 bytes of space.  If there is an isolated
group of 1, 2, or 3 unused bytes within the cell content area, those bytes
comprise a fragment.  ^The total number of bytes in all fragments is stored
in the fifth field of the b-tree page header.  ^In a well-formed b-tree page,
the total number of bytes in fragments may not exceed 60.</p>

<p>The total amount of free space on a b-tree page consists of the size
of the unallocated region plus the total size of all freeblocks plus the
number of fragmented free bytes.  ^SQLite may from time to time reorganize
a b-tree page so that there are no freeblocks or fragment bytes, all
unused bytes are contained in the unallocated space region, and all
cells are packed tightly at the end of the page.  This is called 
"defragmenting" the b-tree page.</p>

<tcl>hd_fragment varint {variable-length integer} {varint}</tcl>








|













|
|
|
|
|
|

|


|

|
|

|


|

|


|
|
|











|


|



|


|


|

|
|

|




|
|




|







578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672

<p>The 100-byte database file header is found only on page 1, which is
always a table b-tree page.  All other b-tree pages in the database file
omit this 100-byte header.</p>

<p>The reserved region is an area of unused space at the end of every
page (except the locking page) that extensions can use to hold per-page
information.  The size of the reserved region is determined by the one-byte
unsigned integer found at an offset of 20 into the database file header.
The size of the reserved region is usually zero.</p>

<p>The b-tree page header is 8 bytes in size for leaf pages and 12
bytes for interior pages.  All multibyte values in the page header
are big-endian.
The b-tree page header is composed of the following fields:</p>

<center>
<i>B-tree Page Header Format</i><br>
<table border=1 width="80%">
<tr><th>Offset<th>Size<th>Description
<tr><td align=center valign=top>0<td align=center valign=top>1<td align=left>
The one-byte flag at offset 0 indicating the b-tree page type.<ul>
<li>A value of 2 (0x02) means the page is an interior index b-tree page.
<li>A value of 5 (0x05) means the page is an interior table b-tree page.
<li>A value of 10 (0x0a) means the page is a leaf index b-tree page.
<li>A value of 13 (0x0d) means the page is a leaf table b-tree page.</ul>
Any other value for the b-tree page type is an error.
<tr><td align=center valign=top>1<td align=center valign=top>2<td align=left>
The two-byte integer at offset 1 gives the start of the
first freeblock on the page, or is zero if there are no freeblocks.
<tr><td align=center valign=top>3<td align=center valign=top>2<td align=left>
The two-byte integer at offset 3 gives the number of cells on the page.
<tr><td align=center valign=top>5<td align=center valign=top>2<td align=left>
(The two-byte integer at offset 5 designates the start of the cell content
area.  A zero value for this integer is interpreted as 65536.)
<tr><td align=center valign=top>7<td align=center valign=top>1<td align=left>
The one-byte integer at offset 7 gives the number of fragmented free
bytes within the cell content area.
<tr><td align=center valign=top>8<td align=center valign=top>4<td align=left>
(The four-byte page number at offset 8 is the right-most pointer.  This
value appears in the header of interior b-tree pages only and is omitted from
all other pages.)
</table></center>

<p>The cell pointer array of a b-tree page immediately follows the b-tree
page header.  Let K be the number of cells on the btree.  The cell pointer
array consists of K 2-byte integer offsets to the cell contents.  The
cell pointers are arranged in key order with left-most cell (the cell with the
smallest key) first and the right-most cell (the cell with the largest
key) last.</p>

<p>Cell content is stored in the cell content region of the b-tree page.
SQLite strives to place cells as far toward the end of the b-tree page as
it can, in order to leave space for future growth of the cell pointer array.
The area in between the last cell pointer array entry and the beginning of
the first cell is the unallocated region.
</p>

<p>If a page contains no cells (which is only possible for a root page
of a table that contains no rows) then the offset to the
cell content area will equal the page size minus the bytes of reserved space.
(If the database uses a 65536-byte page size and the reserved space is zero
(the usual value for reserved space) then the cell content offset of an
empty page wants to be 65536.  
However, that integer is too large to be stored in a
2-byte unsigned integer, so a value of 0 is used in its place.)

<p>A freeblock is a structure used to identify unallocated space within
a b-tree page.  Freeblocks are organized as a chain.  The first 2 bytes of
a freeblock are a big-endian integer which is the offset in the b-tree page
of the next freeblock in the chain, or zero if the freeblock is the last on
the chain.  The third and fourth bytes of each freeblock form
a big-endian integer which is the size of the freeblock in bytes, including
the 4-byte header.  Freeblocks are always connected in order 
of increasing offset.  The second field of the b-tree page header is the
offset of the first freeblock, or zero if there are no freeblocks on the
page.  In a well-formed b-tree page, there will always be at least one cell
before the first freeblock.</p>

<p>A freeblock requires at least 4 bytes of space.  If there is an isolated
group of 1, 2, or 3 unused bytes within the cell content area, those bytes
comprise a fragment.  The total number of bytes in all fragments is stored
in the fifth field of the b-tree page header.  In a well-formed b-tree page,
the total number of bytes in fragments may not exceed 60.</p>

<p>The total amount of free space on a b-tree page consists of the size
of the unallocated region plus the total size of all freeblocks plus the
number of fragmented free bytes.  SQLite may from time to time reorganize
a b-tree page so that there are no freeblocks or fragment bytes, all
unused bytes are contained in the unallocated space region, and all
cells are packed tightly at the end of the page.  This is called 
"defragmenting" the b-tree page.</p>

<tcl>hd_fragment varint {variable-length integer} {varint}</tcl>

779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
on the b-tree page without spilling onto an overflow page and symbol M
represents the minimum amount of payload that must be stored on the btree
page before spilling is allowed.

<dl>
<dt><p>Table B-Tree Leaf Cell:</dt>
<dd><p>
^(Let X be U-35.  If the payload size P is less than or equal to X then
the entire payload is stored on the b-tree leaf page.)^
^(Let M be ((U-12)*32/255)-23 and let K be M+((P-M)%(U-4)).
If P is greater than X
then the number of bytes stored on the table b-tree leaf page is K
if K is less or equal to X or M otherwise.)^
^(The number of bytes stored on the leaf page is never less than M.)^
</p></dd>

<dt><p>Table B-Tree Interior Cell:</dt>
<dd><p>
Interior pages of table b-trees have no payload and so there is never
any payload to spill.
</p></dd>

<dt><p>Index B-Tree Leaf Or Interior Cell:</dt>
<dd><p>
^(Let X be ((U-12)*64/255)-23.  If the payload size P is less than
or equal to X then the entire payload is stored on the b-tree page.)^
^(Let M be ((U-12)*32/255)-23 and let K be M+((P-M)%(U-4)).
If P is greater than X then the number
of bytes stored on the index b-tree page is K if K is less than or
equal to X or M otherwise.)^
^(The number of bytes stored on the index page is never less than M.)^
</p></dd>
</dl>

<p>Here is an alternative description of the same computation:

<ul>
<li>X is U-35 for table btree leaf pages or
    ((U-12)*64/255)-23 for index pages.
<li>M is always ((U-12)*32/255)-23.
<li>Let K be M+((P-M)%(U-4)).
<li>^If P&lt;=X then all P bytes of payload are stored directly on the 
    btree page without overflow.
<li>^If P&gt;X and K&lt;=X then the first K bytes of P are stored on the 
    btree page and the remaining P-K bytes are stored on overflow pages.
<li>^If P&gt;X and K&gt;X then the first M bytes of P are stored on the
    btree page and the remaining P-M bytes are stored on overflow pages.
</ul>

<p>The overflow thresholds are designed to give a minimum fanout of
4 for index b-trees and to make sure enough of the payload
is on the b-tree page that the record header can usually be accessed
without consulting an overflow page.  In hindsight, the designer of
the SQLite b-tree logic realized that these thresholds could have been
made much simpler.  However, the computations cannot be changed
without resulting in an incompatible file format.  And the current computations
work well, even if they are a little complex.</p>

<tcl>hd_fragment ovflpgs {overflow page} {overflow pages}</tcl>
<h2>Cell Payload Overflow Pages</h2>

<p>^When the payload of a b-tree cell is too large for the b-tree page,
the surplus is spilled onto overflow pages.  ^Overflow pages form a linked
list.  ^The first four bytes of each overflow page are a big-endian
integer which is the page number of the next page in the chain, or zero
for the final page in the chain.  ^The fifth byte through the last usable
byte are used to hold overflow content.</p>

<h2>Pointer Map or Ptrmap Pages</h2>

<p>Pointer map or ptrmap pages are extra pages inserted into the database
to make the operation of [auto_vacuum] and [incremental_vacuum] modes
more efficient.  Other page types in the database typically have pointers
from parent to child.  For example, an interior b-tree page contains pointers
to its child b-tree pages and an overflow chain has a pointer
from earlier to later links in the chain.  A ptrmap page contains linkage
information going in the opposite direction, from child to parent.</p>

<p>^Ptrmap pages must exist in any database file which has a non-zero
largest root b-tree page value at offset 52 in the database header.
^If the largest root b-tree page value is zero, then the database must not
contain ptrmap pages.</p>

<p>^In a database with ptrmap pages, the first ptrmap page is page 2.
A ptrmap page consists of an array of 5-byte entries.  Let J be the
number of 5-byte entries that will fit in the usable space of a page.
(In other words, J=U/5.)  ^The first ptrmap page will contain back pointer
information for pages 3 through J+2, inclusive.  ^The second pointer map
page will be on page J+3 and that ptrmap page will provide back pointer
information for pages J+4 through 2*J+3 inclusive.  And so forth for
the entire database file.</p>

<p>^(In a database that uses ptrmap pages, all pages at locations identified
by the computation in the previous paragraph must be ptrmap page and no
other page may be a ptrmap page.  Except, if the byte-lock page happens to
fall on the same page number as a ptrmap page, then the ptrmap is moved
to the following page for that one case.)^</p>

<p>Each 5-byte entry on a ptrmap page provides back-link information about 
one of the pages that immediately follow the pointer map.  ^(If page B is a
ptrmap page then back-link information about page B+1 is provided by
the first entry on the pointer map.  Information about page B+2 is
provided by the second entry.  And so forth.)^</p>

<p>Each 5-byte ptrmap entry consists of one byte of "page type" information
followed by a 4-byte big-endian page number.  Five page types are recognized:
</p>

<ol>
<li>A b-tree root page.  The
page number should be zero.
<li>A freelist page.  The page number should be
zero.
<li>The first page of a
cell payload overflow chain.  The page number is the b-tree page that
contains the cell whose content has overflowed.
<li>A page in an overflow chain
other than the first page.  The page number is the prior page of the
overflow chain.
<li>A non-root b-tree page.  The
page number is the parent b-tree page.
</ol>

<p>^In any database file that contains ptrmap pages, all b-tree root pages 
must come before any non-root b-tree page, cell payload overflow page, or
freelist page.  This restriction ensures that a root page will never
be moved during an auto-vacuum or incremental-vacuum.  The auto-vacuum
logic does not know how to update the root_page field of the sqlite_schema
table and so it is necessary to prevent root pages from being moved
during an auto-vacuum in order to preserve the integrity of the
sqlite_schema table.  ^Root pages are moved to the beginning of the
database file by the CREATE TABLE, CREATE INDEX, DROP TABLE, and
DROP INDEX operations.</p>

<h1>Schema Layer</h1>

<p>The foregoing text describes low-level aspects of the SQLite file
format.  The b-tree mechanism provides a powerful and efficient means of







|
|
|


|
|










|
|
|


|
|










|

|

|















|
|
|

|












|

|


|


|
|




|



|


|


|




















|






|







779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
on the b-tree page without spilling onto an overflow page and symbol M
represents the minimum amount of payload that must be stored on the btree
page before spilling is allowed.

<dl>
<dt><p>Table B-Tree Leaf Cell:</dt>
<dd><p>
(Let X be U-35.  If the payload size P is less than or equal to X then
the entire payload is stored on the b-tree leaf page.)
(Let M be ((U-12)*32/255)-23 and let K be M+((P-M)%(U-4)).
If P is greater than X
then the number of bytes stored on the table b-tree leaf page is K
if K is less or equal to X or M otherwise.)
(The number of bytes stored on the leaf page is never less than M.)
</p></dd>

<dt><p>Table B-Tree Interior Cell:</dt>
<dd><p>
Interior pages of table b-trees have no payload and so there is never
any payload to spill.
</p></dd>

<dt><p>Index B-Tree Leaf Or Interior Cell:</dt>
<dd><p>
(Let X be ((U-12)*64/255)-23.  If the payload size P is less than
or equal to X then the entire payload is stored on the b-tree page.)
(Let M be ((U-12)*32/255)-23 and let K be M+((P-M)%(U-4)).
If P is greater than X then the number
of bytes stored on the index b-tree page is K if K is less than or
equal to X or M otherwise.)
(The number of bytes stored on the index page is never less than M.)
</p></dd>
</dl>

<p>Here is an alternative description of the same computation:

<ul>
<li>X is U-35 for table btree leaf pages or
    ((U-12)*64/255)-23 for index pages.
<li>M is always ((U-12)*32/255)-23.
<li>Let K be M+((P-M)%(U-4)).
<li>If P&lt;=X then all P bytes of payload are stored directly on the 
    btree page without overflow.
<li>If P&gt;X and K&lt;=X then the first K bytes of P are stored on the 
    btree page and the remaining P-K bytes are stored on overflow pages.
<li>If P&gt;X and K&gt;X then the first M bytes of P are stored on the
    btree page and the remaining P-M bytes are stored on overflow pages.
</ul>

<p>The overflow thresholds are designed to give a minimum fanout of
4 for index b-trees and to make sure enough of the payload
is on the b-tree page that the record header can usually be accessed
without consulting an overflow page.  In hindsight, the designer of
the SQLite b-tree logic realized that these thresholds could have been
made much simpler.  However, the computations cannot be changed
without resulting in an incompatible file format.  And the current computations
work well, even if they are a little complex.</p>

<tcl>hd_fragment ovflpgs {overflow page} {overflow pages}</tcl>
<h2>Cell Payload Overflow Pages</h2>

<p>When the payload of a b-tree cell is too large for the b-tree page,
the surplus is spilled onto overflow pages.  Overflow pages form a linked
list.  The first four bytes of each overflow page are a big-endian
integer which is the page number of the next page in the chain, or zero
for the final page in the chain.  The fifth byte through the last usable
byte are used to hold overflow content.</p>

<h2>Pointer Map or Ptrmap Pages</h2>

<p>Pointer map or ptrmap pages are extra pages inserted into the database
to make the operation of [auto_vacuum] and [incremental_vacuum] modes
more efficient.  Other page types in the database typically have pointers
from parent to child.  For example, an interior b-tree page contains pointers
to its child b-tree pages and an overflow chain has a pointer
from earlier to later links in the chain.  A ptrmap page contains linkage
information going in the opposite direction, from child to parent.</p>

<p>Ptrmap pages must exist in any database file which has a non-zero
largest root b-tree page value at offset 52 in the database header.
If the largest root b-tree page value is zero, then the database must not
contain ptrmap pages.</p>

<p>In a database with ptrmap pages, the first ptrmap page is page 2.
A ptrmap page consists of an array of 5-byte entries.  Let J be the
number of 5-byte entries that will fit in the usable space of a page.
(In other words, J=U/5.)  The first ptrmap page will contain back pointer
information for pages 3 through J+2, inclusive.  The second pointer map
page will be on page J+3 and that ptrmap page will provide back pointer
information for pages J+4 through 2*J+3 inclusive.  And so forth for
the entire database file.</p>

<p>(In a database that uses ptrmap pages, all pages at locations identified
by the computation in the previous paragraph must be ptrmap page and no
other page may be a ptrmap page.  Except, if the byte-lock page happens to
fall on the same page number as a ptrmap page, then the ptrmap is moved
to the following page for that one case.)</p>

<p>Each 5-byte entry on a ptrmap page provides back-link information about 
one of the pages that immediately follow the pointer map.  (If page B is a
ptrmap page then back-link information about page B+1 is provided by
the first entry on the pointer map.  Information about page B+2 is
provided by the second entry.  And so forth.)</p>

<p>Each 5-byte ptrmap entry consists of one byte of "page type" information
followed by a 4-byte big-endian page number.  Five page types are recognized:
</p>

<ol>
<li>A b-tree root page.  The
page number should be zero.
<li>A freelist page.  The page number should be
zero.
<li>The first page of a
cell payload overflow chain.  The page number is the b-tree page that
contains the cell whose content has overflowed.
<li>A page in an overflow chain
other than the first page.  The page number is the prior page of the
overflow chain.
<li>A non-root b-tree page.  The
page number is the parent b-tree page.
</ol>

<p>In any database file that contains ptrmap pages, all b-tree root pages 
must come before any non-root b-tree page, cell payload overflow page, or
freelist page.  This restriction ensures that a root page will never
be moved during an auto-vacuum or incremental-vacuum.  The auto-vacuum
logic does not know how to update the root_page field of the sqlite_schema
table and so it is necessary to prevent root pages from being moved
during an auto-vacuum in order to preserve the integrity of the
sqlite_schema table.  Root pages are moved to the beginning of the
database file by the CREATE TABLE, CREATE INDEX, DROP TABLE, and
DROP INDEX operations.</p>

<h1>Schema Layer</h1>

<p>The foregoing text describes low-level aspects of the SQLite file
format.  The b-tree mechanism provides a powerful and efficient means of
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
1164
1165
1166
1167
1168
1169
1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
1204
1205
1206
1207
1208
1209
1210
1211
1212
1213
1214
1215
1216
1217
1218
1219
1220
1221
1222
1223
1224
1225
1226
1227
1228
1229
1230
1231
1232
1233
1234
1235
1236
1237
1238
1239
1240
1241
1242
1243
1244
1245
1246
1247
1248
1249
1250
1251
1252
1253
1254
1255
1256
1257
1258
1259
1260
1261
1262
1263
1264
1265
1266
1267
1268
1269
1270
1271
1272
1273
1274
1275
1276
1277
1278
1279
1280
1281
1282
1283
1284
1285

<p>The record format makes extensive use of the 
[variable-length integer] or [varint]
representation of 64-bit signed integers defined above.</p>

<tcl>hd_fragment serialtype {serial type} {serial types}</tcl>
<p>A record contains a header and a body, in that order.  
^(The header begins with a single varint which determines the total number
of bytes in the header.  The varint value is the size of the header in
bytes including the size varint itself.)^  ^Following the size varint are
one or more additional varints, one per column.  These additional varints
are called "serial type" numbers and
determine the datatype of each column, according to the following chart:</p>

<center>
<i>Serial Type Codes Of The Record Format</i><br>
<table width="80%" border=1>
<tr><th>Serial Type<th>Content Size<th>Meaning
<tr><td valign=top align=center>0<td valign=top align=center>0<td align=left>
^Value is a NULL.
<tr><td valign=top align=center>1<td valign=top align=center>1<td align=left>
^Value is an 8-bit twos-complement integer.
<tr><td valign=top align=center>2<td valign=top align=center>2<td align=left>
^Value is a big-endian 16-bit twos-complement integer.
<tr><td valign=top align=center>3<td valign=top align=center>3<td align=left>
^Value is a big-endian 24-bit twos-complement integer.
<tr><td valign=top align=center>4<td valign=top align=center>4<td align=left>
^Value is a big-endian 32-bit twos-complement integer.
<tr><td valign=top align=center>5<td valign=top align=center>6<td align=left>
^Value is a big-endian 48-bit twos-complement integer.
<tr><td valign=top align=center>6<td valign=top align=center>8<td align=left>
^Value is a big-endian 64-bit twos-complement integer.
<tr><td valign=top align=center>7<td valign=top align=center>8<td align=left>
^Value is a big-endian IEEE 754-2008 64-bit floating point number.
<tr><td valign=top align=center>8<td valign=top align=center>0<td align=left>
^Value is the integer 0. (Only available for [schema format] 4 and higher.)
<tr><td valign=top align=center>9<td valign=top align=center>0<td align=left>
^Value is the integer 1. (Only available for [schema format] 4 and higher.)
<tr><td valign=top align=center>10,11
    <td valign=top align=center><i>variable</i><td align=left>
<i>Reserved for internal use.  These serial type codes will
   never appear in a well-formed database file, but they
   might be used in transient and temporary database files
   that SQLite sometimes generates for its own use.
   The meanings of these codes can shift from one release
   of SQLite to the next.</i>
<tr><td valign=top align=center>N&#x2265;12 and even
    <td valign=top align=center>(N-12)/2<td align=left>
^Value is a BLOB that is (N-12)/2 bytes in length.
<tr><td valign=top align=center>N&#x2265;13 and odd
    <td valign=top align=center>(N-13)/2<td align=left>
^Value is a string in the [text encoding] and (N-13)/2 bytes in length.
^The nul terminator is not stored.
</table></center>

<p>The header size varint
and serial type varints will usually consist of a single byte.  The
serial type varints for large strings and BLOBs might extend to two or three
byte varints, but that is the exception rather than the rule. 
The varint format is very efficient at coding the record header.</p>

<p>^The values for each column in the record immediately follow the header.
^(For serial types 0, 8, 9, 12, and 13, the value is zero bytes in
length.  If all columns are of these types then the body section of the
record is empty.)^</p>

<p>^A record might have fewer values than the number of columns in the
corresponding table.  This can happen, for example, after an
[ALTER TABLE|ALTER TABLE ... ADD COLUMN] SQL statement has increased
the number of columns in the table schema without modifying preexisting rows
in the table.
^Missing values at the end of the record are filled in using the
[default value] for the corresponding columns defined in the table schema.
</p>


<h2>Record Sort Order</h2>

<p>The order of keys in an index b-tree is determined by the sort order of
the records that the keys represent.  Record comparison progresses column
by column.  Columns of a record are examined from left to right.  The
first pair of columns that are not equal determines the relative order
of the two records.  The sort order of individual columns is as
follows:</p>

<ol>
<li>^(NULL values (serial type 0) sort first.)^
<li>^(Numeric values (serial types 1 through 9) sort after NULLs
      and in numeric order.)^
<li>^(Text values (odd serial types 13 and larger) sort after numeric
    values in the order determined by the columns [collating function].)^
<li>^(BLOB values (even serial types 12 and larger) sort last and in the order 
    determined by memcmp().)^
</ol>

<p>A [collating function] for each column is necessary in order to compute
the order of text fields.
^(SQLite defines three built-in collating functions:)^
</p>

<blockquote><table border=0 cellspacing=10>
<tr><td valign=top>BINARY
    <td> ^(The built-in BINARY collation compares strings byte by byte
        using the memcmp() function
        from the standard C library.)^
<tr><td valign=top>NOCASE
    <td> ^(The NOCASE collation is like BINARY except that uppercase
        ASCII characters ('A' through 'Z')
        are folded into their lowercase equivalents prior to running the
        comparison.  Only ASCII characters are case-folded.)^
        ^(NOCASE
        does not implement a general purpose unicode caseless comparison.)^
<tr><td valign=top>RTRIM
    <td> ^(RTRIM is like BINARY except that extra spaces at the end of either
         string do not change the result.  In other words, strings will
         compare equal to one another as long as they
         differ only in the number of spaces at the end.)^
</table></blockquote>

<p>^Additional application-specific collating functions can be added to
SQLite using the [sqlite3_create_collation()] interface.</p>

<p>^The default collating function for all strings is BINARY.
^Alternative collating functions for table columns can be specified in the
[CREATE TABLE] statement using the COLLATE clause on the [column definition].
^When a column is indexed, the same collating function specified in the
[CREATE TABLE] statement is used for the column in the index, by default,
though this can be overridden using a COLLATE clause in the 
[CREATE INDEX] statement.

<tcl>hd_fragment #sqltab {table data format}</tcl>
<h2>Representation Of SQL Tables</h2>

<p> ^Each ordinary SQL table in the database schema is represented on-disk
by a table b-tree.  ^Each entry in the table b-tree corresponds to a row
of the SQL table.  ^The [rowid] of the SQL table is the 64-bit signed
integer key for each entry in the table b-tree.</p>

<p> ^The content of each SQL table row is stored in the database file by
first combining the values in the various columns into a byte array
in the record format, then storing that byte array as the payload in
an entry in the table b-tree.  ^The order of values in the record is
the same as the order of columns in the SQL table definition.
^When an SQL table includes an
[INTEGER PRIMARY KEY] column (which aliases the [rowid]) then that
column appears in the record as a NULL value.  ^SQLite will always use
the table b-tree key rather than the NULL value when referencing the
[INTEGER PRIMARY KEY] column.</p>

<p> ^If the [affinity] of a column is REAL and that column contains a
value that can be converted to an integer without loss of information
(if the value contains no fractional part and is not too large to be
represented as an integer) then the column may be stored in the record
as an integer.  ^SQLite will convert the value back to floating
point when extracting it from the record.</p>

<h2>Representation of WITHOUT ROWID Tables</h2>

<p>^If an SQL table is created using the "WITHOUT ROWID" clause at the
end of its CREATE TABLE statement, then that table is a [WITHOUT ROWID]
table and uses a different on-disk representation.  ^A WITHOUT ROWID
table uses an index b-tree rather than a table b-tree for storage.
^The key for each entry in the WITHOUT ROWID b-tree is a record composed
of the columns of the PRIMARY KEY followed by all remaining columns of
the table.  ^The primary key columns appear in the order that they were
declared in the PRIMARY KEY clause and the remaining columns appear in
the order they occur in the CREATE TABLE statement.

<p>^Hence, the content encoding for a WITHOUT ROWID table is the same
as the content encoding for an ordinary rowid table, except that the
order of the columns is rearranged so that PRIMARY KEY columns come
first, and the content is used as the key in an index b-tree rather
than as the data in a table b-tree.
^The special encoding rules for columns with REAL affinity
apply to WITHOUT ROWID tables the same as they do with rowid tables.

<h3>Suppression of redundant columns in the PRIMARY KEY
    of WITHOUT ROWID tables</h3>

<p>^If the PRIMARY KEY of a WITHOUT ROWID tables uses the same columns
with the same collating sequence more than once, then the second and
subsequent occurrences of that column in the PRIMARY KEY definition are
ignored.  ^(For example, the following CREATE TABLE statements all specify
the same table, which will have the exact same representation on disk:

<blockquote><pre>
CREATE TABLE t1(a,b,c,d,PRIMARY KEY(a,c)) WITHOUT ROWID;
CREATE TABLE t1(a,b,c,d,PRIMARY KEY(a,c,a,c)) WITHOUT ROWID;
CREATE TABLE t1(a,b,c,d,PRIMARY KEY(a,A,a,C)) WITHOUT ROWID;
CREATE TABLE t1(a,b,c,d,PRIMARY KEY(a,a,a,a,c)) WITHOUT ROWID;
</pre></blockquote>)^

<p>The first example above is the preferred definition of the table,
of course.  ^All of the examples create a WITHOUT ROWID table with
two PRIMARY KEY columns, "a" and "c", in that order, followed by
two data columns "b" and "d", also in that order.

<h2>Representation Of SQL Indices</h2>

<p>^Each SQL index, whether explicitly declared via a [CREATE INDEX] statement
or implied by a UNIQUE or PRIMARY KEY constraint, corresponds to an 
index b-tree in the database file.
^Each entry in the index b-tree corresponds to a single row in the 
associated SQL table.
^The key to an index b-tree is
a record composed of the columns that are being indexed followed by the
key of the corresponding table row.  For ordinary tables, the row key is
the [rowid], and for [WITHOUT ROWID] tables the row key is the PRIMARY KEY.
Because every row in the table has a unique row key,
all keys in an index are unique.</p>

<p>^In a normal index, there is a one-to-one mapping between rows in a 
table and entries in each index associated with that table.
^However, in a [partial index], the index b-tree only contains entries
corresponding to table rows for which the WHERE clause expression on the
CREATE INDEX statement is true.
^Corresponding rows in the index and table b-trees share the same rowid
or primary key values and contain the same value for all indexed columns.</p>

<h3>Suppression of redundant columns in WITHOUT ROWID secondary indexes
</h3>

<p> ^In an index on a WITHOUT ROWID table, if a column of the PRIMARY KEY
is also a column in the index and has a matching collating sequence, then the
indexed column is not repeated in the table-key suffix on the
end of the index record.  ^(As an example, consider the following SQL:

<blockquote><pre>
CREATE TABLE ex25(a,b,c,d,e,PRIMARY KEY(d,c,a)) WITHOUT rowid;
CREATE INDEX ex25ce ON ex25(c,e);
CREATE INDEX ex25acde ON ex25(a,c,d,e);
CREATE INDEX ex25ae ON ex25(a COLLATE nocase,e);
</pre></blockquote>

<p>Each row in the ex25ce index is a record
with these columns: c, e, d, a.  The first two columns are
the columns being indexed, c and e.  The remaining columns are the primary
key of the corresponding table row.  Normally, the primary key would be
columns d, c, and a, but because column c already appears earlier in the
index, it is omitted from the key suffix.)^</p>

<p>^(In the extreme case where the columns being indexed cover all columns
of the PRIMARY KEY, the index will consist of only the columns being
indexed.  The ex25acde example above demonstrates this.)^  ^Each entry in
the ex25acde index consists of only the columns a, c, d, and e, in that
order.</p>

<p>Each row in ex25ae contains five columns: a, e, d, c, a.  The "a"
column is repeated since the first occurrence of "a" has a collating
function of "nocase" and the second has a collating sequence of "binary".
If the "a" column is not repeated and if the table contains two or more
entries with the same "e" value and where "a" differs only in case, then
all of those table entries would correspond to a single entry in the
index, which would break the one-to-one correspondence between the table
and the index.

<p> ^The suppression of redundant columns in the key suffix of an index
entry only occurs in WITHOUT ROWID tables.  ^In an ordinary rowid table,
the index entry always ends with the rowid even if the [INTEGER PRIMARY KEY]
column is one of the columns being indexed.</p>

<tcl>hd_fragment ffschema {schema storage}</tcl>
<h2>Storage Of The SQL Database Schema</h2>

<p>^Page 1 of a database file is the root page of a table b-tree that
holds a special table named "[sqlite_schema]".  This b-tree is known
as the "schema table" since it stores the complete
database schema.  ^(The structure of the sqlite_schema table is as
if it had been created using the following SQL:</p>

<blockquote><pre>
CREATE TABLE sqlite_schema(
  type text,
  name text,
  tbl_name text,
  rootpage integer,
  sql text
);
</pre></blockquote>)^

<p>^The sqlite_schema table contains one row for each table, index, view,
and trigger (collectively "objects") in the database schema, except there
is no entry for the sqlite_schema table itself.  ^The sqlite_schema table
contains entries for [internal schema objects] in addition to application-
and programmer-defined objects.


<p>^(The sqlite_schema.type column will be one
of the following text strings:  'table', 'index', 'view', or 'trigger'
according to the type of object defined.  The 'table' string is used
for both ordinary and [virtual tables].)^</p>

<p>^(The sqlite_schema.name column will hold the name of the object.)^
^([UNIQUE] and [PRIMARY KEY] constraints on tables cause SQLite to create
[internal indexes] with names of the form "sqlite_autoindex_TABLE_N"
where TABLE is replaced by the name of the table that contains the
constraint and N is an integer beginning with 1 and increasing by one
with each constraint seen in the table definition.)^
^(In a [WITHOUT ROWID] table, there is no sqlite_schema entry for the
PRIMARY KEY, but the "sqlite_autoindex_TABLE_N" name is set aside
for the PRIMARY KEY as if the sqlite_schema entry did exist.  This
will affect the numbering of subsequent UNIQUE constraints.)^
^The "sqlite_autoindex_TABLE_N" name is never allocated for an
[INTEGER PRIMARY KEY], either in rowid tables or WITHOUT ROWID tables.
</p>

<p>The sqlite_schema.tbl_name column holds the name of a table or view
that the object is associated with.  ^For a table or view, the
tbl_name column is a copy of the name column.  ^For an index, the tbl_name
is the name of the table that is indexed.  ^For a trigger, the tbl_name
column stores the name of the table or view that causes the trigger 
to fire.</p>

<p>^(The sqlite_schema.rootpage column stores the page number of the root
b-tree page for tables and indexes.)^  ^For rows that define views, triggers,
and virtual tables, the rootpage column is 0 or NULL.</p>

<p>^(The sqlite_schema.sql column stores SQL text that describes the
object.  This SQL text is a [CREATE TABLE], [CREATE VIRTUAL TABLE],
[CREATE INDEX],
[CREATE VIEW], or [CREATE TRIGGER] statement that if evaluated against
the database file when it is the main database of a [database connection]
would recreate the object.)^  The text is usually a copy of the original
statement used to create the object but with normalizations applied so
that the text conforms to the following rules:

<ul>
<li>^The CREATE, TABLE, VIEW, TRIGGER, and INDEX keywords at the beginning
of the statement are converted to all upper case letters.
<li>^The TEMP or TEMPORARY keyword is removed if it occurs after the 
initial CREATE keyword.
<li>^Any database name qualifier that occurs prior to the name of the
object being created is removed.
<li>^Leading spaces are removed.
<li>^All spaces following the first two keywords are converted into a single
space.
</ul>

<p>^(The text in the sqlite_schema.sql column is a copy of the original
CREATE statement text that created the object, except normalized as
described above and as modified by subsequent [ALTER TABLE] statements.)^
^(The sqlite_schema.sql is NULL for the [internal indexes] that are
automatically created by [UNIQUE] or [PRIMARY KEY] constraints.)^</p>

<h3>Alternative Names For The Schema Table</h3>

<p>The name "sqlite_schema" does not appear anywhere in the file format.
That name is just a convention used by the database implementation.
Due to historical and operational considerations, the
"sqlite_schema" table can also sometimes be called by one of the







|

|









|

|

|

|

|

|

|

|

|

|










|


|
|








|
|

|

|




|














|
|
|
|
|
|
|




|




|

|

|


|
|
|

|


|


|


|
|

|







|
|
|


|


|

|

|



|



|




|

|

|

|



|




|





|


|







|


|





|


|

|






|

|


|





|


|













|

|

|












|
|






|


|










|

|

|




|


|

|
|



|
|


|
|




|
|
|



|
|


|




|




|

|

|

|
|



|

|
|
|







936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
1164
1165
1166
1167
1168
1169
1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
1204
1205
1206
1207
1208
1209
1210
1211
1212
1213
1214
1215
1216
1217
1218
1219
1220
1221
1222
1223
1224
1225
1226
1227
1228
1229
1230
1231
1232
1233
1234
1235
1236
1237
1238
1239
1240
1241
1242
1243
1244
1245
1246
1247
1248
1249
1250
1251
1252
1253
1254
1255
1256
1257
1258
1259
1260
1261
1262
1263
1264
1265
1266
1267
1268
1269
1270
1271
1272
1273
1274
1275
1276
1277
1278
1279
1280
1281
1282
1283
1284
1285

<p>The record format makes extensive use of the 
[variable-length integer] or [varint]
representation of 64-bit signed integers defined above.</p>

<tcl>hd_fragment serialtype {serial type} {serial types}</tcl>
<p>A record contains a header and a body, in that order.  
(The header begins with a single varint which determines the total number
of bytes in the header.  The varint value is the size of the header in
bytes including the size varint itself.)  Following the size varint are
one or more additional varints, one per column.  These additional varints
are called "serial type" numbers and
determine the datatype of each column, according to the following chart:</p>

<center>
<i>Serial Type Codes Of The Record Format</i><br>
<table width="80%" border=1>
<tr><th>Serial Type<th>Content Size<th>Meaning
<tr><td valign=top align=center>0<td valign=top align=center>0<td align=left>
Value is a NULL.
<tr><td valign=top align=center>1<td valign=top align=center>1<td align=left>
Value is an 8-bit twos-complement integer.
<tr><td valign=top align=center>2<td valign=top align=center>2<td align=left>
Value is a big-endian 16-bit twos-complement integer.
<tr><td valign=top align=center>3<td valign=top align=center>3<td align=left>
Value is a big-endian 24-bit twos-complement integer.
<tr><td valign=top align=center>4<td valign=top align=center>4<td align=left>
Value is a big-endian 32-bit twos-complement integer.
<tr><td valign=top align=center>5<td valign=top align=center>6<td align=left>
Value is a big-endian 48-bit twos-complement integer.
<tr><td valign=top align=center>6<td valign=top align=center>8<td align=left>
Value is a big-endian 64-bit twos-complement integer.
<tr><td valign=top align=center>7<td valign=top align=center>8<td align=left>
Value is a big-endian IEEE 754-2008 64-bit floating point number.
<tr><td valign=top align=center>8<td valign=top align=center>0<td align=left>
Value is the integer 0. (Only available for [schema format] 4 and higher.)
<tr><td valign=top align=center>9<td valign=top align=center>0<td align=left>
Value is the integer 1. (Only available for [schema format] 4 and higher.)
<tr><td valign=top align=center>10,11
    <td valign=top align=center><i>variable</i><td align=left>
<i>Reserved for internal use.  These serial type codes will
   never appear in a well-formed database file, but they
   might be used in transient and temporary database files
   that SQLite sometimes generates for its own use.
   The meanings of these codes can shift from one release
   of SQLite to the next.</i>
<tr><td valign=top align=center>N&#x2265;12 and even
    <td valign=top align=center>(N-12)/2<td align=left>
Value is a BLOB that is (N-12)/2 bytes in length.
<tr><td valign=top align=center>N&#x2265;13 and odd
    <td valign=top align=center>(N-13)/2<td align=left>
Value is a string in the [text encoding] and (N-13)/2 bytes in length.
The nul terminator is not stored.
</table></center>

<p>The header size varint
and serial type varints will usually consist of a single byte.  The
serial type varints for large strings and BLOBs might extend to two or three
byte varints, but that is the exception rather than the rule. 
The varint format is very efficient at coding the record header.</p>

<p>The values for each column in the record immediately follow the header.
(For serial types 0, 8, 9, 12, and 13, the value is zero bytes in
length.  If all columns are of these types then the body section of the
record is empty.)</p>

<p>A record might have fewer values than the number of columns in the
corresponding table.  This can happen, for example, after an
[ALTER TABLE|ALTER TABLE ... ADD COLUMN] SQL statement has increased
the number of columns in the table schema without modifying preexisting rows
in the table.
Missing values at the end of the record are filled in using the
[default value] for the corresponding columns defined in the table schema.
</p>


<h2>Record Sort Order</h2>

<p>The order of keys in an index b-tree is determined by the sort order of
the records that the keys represent.  Record comparison progresses column
by column.  Columns of a record are examined from left to right.  The
first pair of columns that are not equal determines the relative order
of the two records.  The sort order of individual columns is as
follows:</p>

<ol>
<li>(NULL values (serial type 0) sort first.)
<li>(Numeric values (serial types 1 through 9) sort after NULLs
      and in numeric order.)
<li>(Text values (odd serial types 13 and larger) sort after numeric
    values in the order determined by the columns [collating function].)
<li>(BLOB values (even serial types 12 and larger) sort last and in the order 
    determined by memcmp().)
</ol>

<p>A [collating function] for each column is necessary in order to compute
the order of text fields.
(SQLite defines three built-in collating functions:)
</p>

<blockquote><table border=0 cellspacing=10>
<tr><td valign=top>BINARY
    <td> (The built-in BINARY collation compares strings byte by byte
        using the memcmp() function
        from the standard C library.)
<tr><td valign=top>NOCASE
    <td> (The NOCASE collation is like BINARY except that uppercase
        ASCII characters ('A' through 'Z')
        are folded into their lowercase equivalents prior to running the
        comparison.  Only ASCII characters are case-folded.)
        (NOCASE
        does not implement a general purpose unicode caseless comparison.)
<tr><td valign=top>RTRIM
    <td> (RTRIM is like BINARY except that extra spaces at the end of either
         string do not change the result.  In other words, strings will
         compare equal to one another as long as they
         differ only in the number of spaces at the end.)
</table></blockquote>

<p>Additional application-specific collating functions can be added to
SQLite using the [sqlite3_create_collation()] interface.</p>

<p>The default collating function for all strings is BINARY.
Alternative collating functions for table columns can be specified in the
[CREATE TABLE] statement using the COLLATE clause on the [column definition].
When a column is indexed, the same collating function specified in the
[CREATE TABLE] statement is used for the column in the index, by default,
though this can be overridden using a COLLATE clause in the 
[CREATE INDEX] statement.

<tcl>hd_fragment #sqltab {table data format}</tcl>
<h2>Representation Of SQL Tables</h2>

<p> Each ordinary SQL table in the database schema is represented on-disk
by a table b-tree.  Each entry in the table b-tree corresponds to a row
of the SQL table.  The [rowid] of the SQL table is the 64-bit signed
integer key for each entry in the table b-tree.</p>

<p> The content of each SQL table row is stored in the database file by
first combining the values in the various columns into a byte array
in the record format, then storing that byte array as the payload in
an entry in the table b-tree.  The order of values in the record is
the same as the order of columns in the SQL table definition.
When an SQL table includes an
[INTEGER PRIMARY KEY] column (which aliases the [rowid]) then that
column appears in the record as a NULL value.  SQLite will always use
the table b-tree key rather than the NULL value when referencing the
[INTEGER PRIMARY KEY] column.</p>

<p> If the [affinity] of a column is REAL and that column contains a
value that can be converted to an integer without loss of information
(if the value contains no fractional part and is not too large to be
represented as an integer) then the column may be stored in the record
as an integer.  SQLite will convert the value back to floating
point when extracting it from the record.</p>

<h2>Representation of WITHOUT ROWID Tables</h2>

<p>If an SQL table is created using the "WITHOUT ROWID" clause at the
end of its CREATE TABLE statement, then that table is a [WITHOUT ROWID]
table and uses a different on-disk representation.  A WITHOUT ROWID
table uses an index b-tree rather than a table b-tree for storage.
The key for each entry in the WITHOUT ROWID b-tree is a record composed
of the columns of the PRIMARY KEY followed by all remaining columns of
the table.  The primary key columns appear in the order that they were
declared in the PRIMARY KEY clause and the remaining columns appear in
the order they occur in the CREATE TABLE statement.

<p>Hence, the content encoding for a WITHOUT ROWID table is the same
as the content encoding for an ordinary rowid table, except that the
order of the columns is rearranged so that PRIMARY KEY columns come
first, and the content is used as the key in an index b-tree rather
than as the data in a table b-tree.
The special encoding rules for columns with REAL affinity
apply to WITHOUT ROWID tables the same as they do with rowid tables.

<h3>Suppression of redundant columns in the PRIMARY KEY
    of WITHOUT ROWID tables</h3>

<p>If the PRIMARY KEY of a WITHOUT ROWID tables uses the same columns
with the same collating sequence more than once, then the second and
subsequent occurrences of that column in the PRIMARY KEY definition are
ignored.  (For example, the following CREATE TABLE statements all specify
the same table, which will have the exact same representation on disk:

<blockquote><pre>
CREATE TABLE t1(a,b,c,d,PRIMARY KEY(a,c)) WITHOUT ROWID;
CREATE TABLE t1(a,b,c,d,PRIMARY KEY(a,c,a,c)) WITHOUT ROWID;
CREATE TABLE t1(a,b,c,d,PRIMARY KEY(a,A,a,C)) WITHOUT ROWID;
CREATE TABLE t1(a,b,c,d,PRIMARY KEY(a,a,a,a,c)) WITHOUT ROWID;
</pre></blockquote>)

<p>The first example above is the preferred definition of the table,
of course.  All of the examples create a WITHOUT ROWID table with
two PRIMARY KEY columns, "a" and "c", in that order, followed by
two data columns "b" and "d", also in that order.

<h2>Representation Of SQL Indices</h2>

<p>Each SQL index, whether explicitly declared via a [CREATE INDEX] statement
or implied by a UNIQUE or PRIMARY KEY constraint, corresponds to an 
index b-tree in the database file.
Each entry in the index b-tree corresponds to a single row in the 
associated SQL table.
The key to an index b-tree is
a record composed of the columns that are being indexed followed by the
key of the corresponding table row.  For ordinary tables, the row key is
the [rowid], and for [WITHOUT ROWID] tables the row key is the PRIMARY KEY.
Because every row in the table has a unique row key,
all keys in an index are unique.</p>

<p>In a normal index, there is a one-to-one mapping between rows in a 
table and entries in each index associated with that table.
However, in a [partial index], the index b-tree only contains entries
corresponding to table rows for which the WHERE clause expression on the
CREATE INDEX statement is true.
Corresponding rows in the index and table b-trees share the same rowid
or primary key values and contain the same value for all indexed columns.</p>

<h3>Suppression of redundant columns in WITHOUT ROWID secondary indexes
</h3>

<p> In an index on a WITHOUT ROWID table, if a column of the PRIMARY KEY
is also a column in the index and has a matching collating sequence, then the
indexed column is not repeated in the table-key suffix on the
end of the index record.  (As an example, consider the following SQL:

<blockquote><pre>
CREATE TABLE ex25(a,b,c,d,e,PRIMARY KEY(d,c,a)) WITHOUT rowid;
CREATE INDEX ex25ce ON ex25(c,e);
CREATE INDEX ex25acde ON ex25(a,c,d,e);
CREATE INDEX ex25ae ON ex25(a COLLATE nocase,e);
</pre></blockquote>

<p>Each row in the ex25ce index is a record
with these columns: c, e, d, a.  The first two columns are
the columns being indexed, c and e.  The remaining columns are the primary
key of the corresponding table row.  Normally, the primary key would be
columns d, c, and a, but because column c already appears earlier in the
index, it is omitted from the key suffix.)</p>

<p>(In the extreme case where the columns being indexed cover all columns
of the PRIMARY KEY, the index will consist of only the columns being
indexed.  The ex25acde example above demonstrates this.)  Each entry in
the ex25acde index consists of only the columns a, c, d, and e, in that
order.</p>

<p>Each row in ex25ae contains five columns: a, e, d, c, a.  The "a"
column is repeated since the first occurrence of "a" has a collating
function of "nocase" and the second has a collating sequence of "binary".
If the "a" column is not repeated and if the table contains two or more
entries with the same "e" value and where "a" differs only in case, then
all of those table entries would correspond to a single entry in the
index, which would break the one-to-one correspondence between the table
and the index.

<p> The suppression of redundant columns in the key suffix of an index
entry only occurs in WITHOUT ROWID tables.  In an ordinary rowid table,
the index entry always ends with the rowid even if the [INTEGER PRIMARY KEY]
column is one of the columns being indexed.</p>

<tcl>hd_fragment ffschema {schema storage}</tcl>
<h2>Storage Of The SQL Database Schema</h2>

<p>Page 1 of a database file is the root page of a table b-tree that
holds a special table named "[sqlite_schema]".  This b-tree is known
as the "schema table" since it stores the complete
database schema.  (The structure of the sqlite_schema table is as
if it had been created using the following SQL:</p>

<blockquote><pre>
CREATE TABLE sqlite_schema(
  type text,
  name text,
  tbl_name text,
  rootpage integer,
  sql text
);
</pre></blockquote>)

<p>The sqlite_schema table contains one row for each table, index, view,
and trigger (collectively "objects") in the database schema, except there
is no entry for the sqlite_schema table itself.  The sqlite_schema table
contains entries for [internal schema objects] in addition to application-
and programmer-defined objects.


<p>(The sqlite_schema.type column will be one
of the following text strings:  'table', 'index', 'view', or 'trigger'
according to the type of object defined.  The 'table' string is used
for both ordinary and [virtual tables].)</p>

<p>(The sqlite_schema.name column will hold the name of the object.)
([UNIQUE] and [PRIMARY KEY] constraints on tables cause SQLite to create
[internal indexes] with names of the form "sqlite_autoindex_TABLE_N"
where TABLE is replaced by the name of the table that contains the
constraint and N is an integer beginning with 1 and increasing by one
with each constraint seen in the table definition.)
(In a [WITHOUT ROWID] table, there is no sqlite_schema entry for the
PRIMARY KEY, but the "sqlite_autoindex_TABLE_N" name is set aside
for the PRIMARY KEY as if the sqlite_schema entry did exist.  This
will affect the numbering of subsequent UNIQUE constraints.)
The "sqlite_autoindex_TABLE_N" name is never allocated for an
[INTEGER PRIMARY KEY], either in rowid tables or WITHOUT ROWID tables.
</p>

<p>The sqlite_schema.tbl_name column holds the name of a table or view
that the object is associated with.  For a table or view, the
tbl_name column is a copy of the name column.  For an index, the tbl_name
is the name of the table that is indexed.  For a trigger, the tbl_name
column stores the name of the table or view that causes the trigger 
to fire.</p>

<p>(The sqlite_schema.rootpage column stores the page number of the root
b-tree page for tables and indexes.)  For rows that define views, triggers,
and virtual tables, the rootpage column is 0 or NULL.</p>

<p>(The sqlite_schema.sql column stores SQL text that describes the
object.  This SQL text is a [CREATE TABLE], [CREATE VIRTUAL TABLE],
[CREATE INDEX],
[CREATE VIEW], or [CREATE TRIGGER] statement that if evaluated against
the database file when it is the main database of a [database connection]
would recreate the object.)  The text is usually a copy of the original
statement used to create the object but with normalizations applied so
that the text conforms to the following rules:

<ul>
<li>The CREATE, TABLE, VIEW, TRIGGER, and INDEX keywords at the beginning
of the statement are converted to all upper case letters.
<li>The TEMP or TEMPORARY keyword is removed if it occurs after the 
initial CREATE keyword.
<li>Any database name qualifier that occurs prior to the name of the
object being created is removed.
<li>Leading spaces are removed.
<li>All spaces following the first two keywords are converted into a single
space.
</ul>

<p>(The text in the sqlite_schema.sql column is a copy of the original
CREATE statement text that created the object, except normalized as
described above and as modified by subsequent [ALTER TABLE] statements.)
(The sqlite_schema.sql is NULL for the [internal indexes] that are
automatically created by [UNIQUE] or [PRIMARY KEY] constraints.)</p>

<h3>Alternative Names For The Schema Table</h3>

<p>The name "sqlite_schema" does not appear anywhere in the file format.
That name is just a convention used by the database implementation.
Due to historical and operational considerations, the
"sqlite_schema" table can also sometimes be called by one of the
1297
1298
1299
1300
1301
1302
1303
1304
1305
1306
1307
1308
1309
1310
1311
1312
1313
1314
1315
1316
1317
1318
these alternative names.

<tcl>hd_fragment intschema {internal schema objects} \
{internal schema object} {internal index} {internal indexes} \
{internal table} {internal tables}</tcl>
<h3>Internal Schema Objects</h3>

<p>^In addition to the tables, indexes, views, and triggers created by
the application and/or the developer using CREATE statements SQL, the
sqlite_schema table may contain zero or more entries for 
<i>internal schema objects</i> that are created by SQLite for its 
own internal use.  ^The names of internal schema objects
always begin with "sqlite_" and any table, index, view, or trigger
whose name begins with "sqlite_" is an internal schema object.
^SQLite prohibits applications from creating objects whose names begin
with "sqlite_".  

<p>Internal schema objects used by SQLite may include the following:

<ul>
<li><p>Indices with names of the form "sqlite_autoindex_TABLE_N" that
       are used to implement [UNIQUE] and [PRIMARY KEY] constraints on







|



|


|







1297
1298
1299
1300
1301
1302
1303
1304
1305
1306
1307
1308
1309
1310
1311
1312
1313
1314
1315
1316
1317
1318
these alternative names.

<tcl>hd_fragment intschema {internal schema objects} \
{internal schema object} {internal index} {internal indexes} \
{internal table} {internal tables}</tcl>
<h3>Internal Schema Objects</h3>

<p>In addition to the tables, indexes, views, and triggers created by
the application and/or the developer using CREATE statements SQL, the
sqlite_schema table may contain zero or more entries for 
<i>internal schema objects</i> that are created by SQLite for its 
own internal use.  The names of internal schema objects
always begin with "sqlite_" and any table, index, view, or trigger
whose name begins with "sqlite_" is an internal schema object.
SQLite prohibits applications from creating objects whose names begin
with "sqlite_".  

<p>Internal schema objects used by SQLite may include the following:

<ul>
<li><p>Indices with names of the form "sqlite_autoindex_TABLE_N" that
       are used to implement [UNIQUE] and [PRIMARY KEY] constraints on
1330
1331
1332
1333
1334
1335
1336
1337
1338
1339
1340
1341
1342
1343
1344
1345
1346
1347
1348
1349
1350
1351
1352
1353
1354
1355
1356
1357
1358
1359
1360
1361
1362
1363
1364
1365
1366
1367
1368
1369
1370
1371
1372
1373
1374
1375
1376
1377
1378
1379
1380
1381
1382
1383
1384
1385
1386
1387
1388
1389
1390
1391
1392
1393
1394
1395
1396
1397
1398
1399
1400
1401
1402
1403
1404
1405
1406
1407
1408
1409
1410
1411
1412
1413
1414
1415
1416
1417
1418
1419
1420
1421
1422
1423
1424
1425
1426
1427
1428
1429
1430
1431
1432
1433
1434
1435
1436
1437
1438
1439
1440
1441
1442
1443
1444

<p>New internal schema objects names, always beginning with "sqlite_",
may be added to the SQLite file format in future releases.

<tcl>hd_fragment seqtab {sqlite_sequence}</tcl>
<h3>The sqlite_sequence table</h3>

<p>^The sqlite_sequence table is an internal table used to help implement
[AUTOINCREMENT].  ^The sqlite_sequence table is created automatically
whenever any ordinary table with an AUTOINCREMENT integer primary
key is created.  ^Once created, the sqlite_sequence table exists in the
sqlite_schema table forever; it cannot be dropped.
^(The schema for the sqlite_sequence table is:

<blockquote><pre>
CREATE TABLE sqlite_sequence(name,seq);
</pre></blockquote>)^

<p>^There is a single row in the sqlite_sequence table for each ordinary
table that uses AUTOINCREMENT.  ^(The name of the table (as it appears in
sqlite_schema.name) is in the sqlite_sequence.name field and the largest
[INTEGER PRIMARY KEY] ever inserted into that table is 
in the sqlite_sequence.seq field.)^  
^New automatically generated integer primary keys for AUTOINCREMENT
tables are guaranteed to be larger than the sqlite_sequence.seq field for
that table.
^(If the sqlite_sequence.seq field of an AUTOINCREMENT table is already at
the largest integer value (9223372036854775807) then attempts to add new
rows to that table with an automatically generated integer primary will fail
with an [SQLITE_FULL] error.)^
^The sqlite_sequence.seq field is automatically updated if required when
new entries are inserted to an AUTOINCREMENT table.  
^The sqlite_sequence row for an AUTOINCREMENT table is automatically deleted
when the table is dropped.
^If the sqlite_sequence row for an AUTOINCREMENT table does not exist when
the AUTOINCREMENT table is updated, then a new sqlite_sequence row is created.
^(If the sqlite_sequence.seq value for an AUTOINCREMENT table is manually 
set to something other than an integer and there is a subsequent attempt to
insert the or update the AUTOINCREMENT table, then the behavior is undefined.)^

<p>^Application code is allowed to modify the sqlite_sequence table, to add
new rows, to delete rows, or to modify existing rows.  ^However, application
code cannot create the sqlite_sequence table if it does not already exist.
^Application code can delete all entries from the sqlite_sequence table,
but application code cannot drop the sqlite_sequence table.

<tcl>hd_fragment stat1tab {sqlite_stat1} SQLITE_STAT1 </tcl>
<h3>The sqlite_stat1 table</h3>

<p>^The sqlite_stat1 is an internal table created by the [ANALYZE] command
and used to hold supplemental information about tables and indexes that the
query planner can use to help it find better ways of performing queries.
^Applications can update, delete from, insert into or drop the sqlite_stat1
table, but may not create or alter the sqlite_stat1 table.
^(The schema of the sqlite_stat1 table is as follows:

<blockquote><pre>
CREATE TABLE sqlite_stat1(tbl,idx,stat);
</pre></blockquote>)^

<p> ^(There is normally one row per index, with the index identified by the
name in the sqlite_stat1.idx column.)^  ^(The sqlite_stat1.tbl column is
the name of the table to which the index belongs.)^  ^(In each such row, 
the sqlite_stat.stat column will be
a string consisting of a list of integers followed by zero or more 
arguments.)^  ^The first integer in this
list is the approximate number of rows in the index.  (The number of
rows in the index is the same as the number of rows in the table,
except for [partial indexes].)
^The second integer is the approximate number of rows in the index
that have the same value in the first column of the index.  ^The third 
integer is the number of rows in the index that have 
the same value for the first two columns.  ^The N-th integer (for N>1) 
is the estimated average number of rows in 
the index which have the same value for the first N-1 columns.  ^For
a K-column index, there will be K+1 integers in the stat column.  ^If
the index is unique, then the last integer will be 1.

<p>^The list of integers in the stat column can optionally be followed
by arguments, each of which is a sequence of non-space characters.
^All arguments are preceded by a single space.
^Unrecognized arguments are silently ignored.

<p>^If the "unordered" argument is present, then the query planner assumes
that the index is unordered and will not use the index for a range query
or for sorting.

<p>^The "sz=NNN" argument (where NNN represents a sequence of 1 or more digits)
means that the average row size over all records of the table or
index is NNN bytes per row.  ^The SQLite query planner might use the
estimated row size information provided by the "sz=NNN" token
to help it choose smaller tables and indexes that require less disk I/O.

<p>^(The presence of the "noskipscan" token on the sqlite_stat1.stat field
of an index prevents that index from being used with the
[skip-scan optimization].)^

<p>New text tokens may be added to the end of the stat column in future
enhancements to SQLite.  For compatibility, unrecognized tokens at the end
of the stat column are silently ignored.

<p>^(If the sqlite_stat1.idx column is NULL, then the sqlite_stat1.stat
column contains a single integer which is the approximate number of
rows in the table identified by sqlite_stat1.tbl.)^
^(If the sqlite_stat1.idx column is the same as the sqlite_stat1.tbl
column, then the table is a [WITHOUT ROWID] table and the sqlite_stat1.stat
field contains information about the index btree that implements the
WITHOUT ROWID table.)^

<tcl>hd_fragment stat2tab {sqlite_stat2}</tcl>
<h3>The sqlite_stat2 table</h3>

<p>The sqlite_stat2 is only created and is only used if SQLite is compiled
with SQLITE_ENABLE_STAT2 and if the SQLite version number is between
3.6.18 ([dateof:3.6.18]) and 3.7.8 ([dateof:3.7.8]).







|
|

|

|



|

|
|


|
|


|


|
|

|

|

|

|

|
|

|





|


|

|



|

|
|
|


|



|
|

|

|
|


|

|
|

|



|

|



|

|





|

|
|


|







1330
1331
1332
1333
1334
1335
1336
1337
1338
1339
1340
1341
1342
1343
1344
1345
1346
1347
1348
1349
1350
1351
1352
1353
1354
1355
1356
1357
1358
1359
1360
1361
1362
1363
1364
1365
1366
1367
1368
1369
1370
1371
1372
1373
1374
1375
1376
1377
1378
1379
1380
1381
1382
1383
1384
1385
1386
1387
1388
1389
1390
1391
1392
1393
1394
1395
1396
1397
1398
1399
1400
1401
1402
1403
1404
1405
1406
1407
1408
1409
1410
1411
1412
1413
1414
1415
1416
1417
1418
1419
1420
1421
1422
1423
1424
1425
1426
1427
1428
1429
1430
1431
1432
1433
1434
1435
1436
1437
1438
1439
1440
1441
1442
1443
1444

<p>New internal schema objects names, always beginning with "sqlite_",
may be added to the SQLite file format in future releases.

<tcl>hd_fragment seqtab {sqlite_sequence}</tcl>
<h3>The sqlite_sequence table</h3>

<p>The sqlite_sequence table is an internal table used to help implement
[AUTOINCREMENT].  The sqlite_sequence table is created automatically
whenever any ordinary table with an AUTOINCREMENT integer primary
key is created.  Once created, the sqlite_sequence table exists in the
sqlite_schema table forever; it cannot be dropped.
(The schema for the sqlite_sequence table is:

<blockquote><pre>
CREATE TABLE sqlite_sequence(name,seq);
</pre></blockquote>)

<p>There is a single row in the sqlite_sequence table for each ordinary
table that uses AUTOINCREMENT.  (The name of the table (as it appears in
sqlite_schema.name) is in the sqlite_sequence.name field and the largest
[INTEGER PRIMARY KEY] ever inserted into that table is 
in the sqlite_sequence.seq field.)  
New automatically generated integer primary keys for AUTOINCREMENT
tables are guaranteed to be larger than the sqlite_sequence.seq field for
that table.
(If the sqlite_sequence.seq field of an AUTOINCREMENT table is already at
the largest integer value (9223372036854775807) then attempts to add new
rows to that table with an automatically generated integer primary will fail
with an [SQLITE_FULL] error.)
The sqlite_sequence.seq field is automatically updated if required when
new entries are inserted to an AUTOINCREMENT table.  
The sqlite_sequence row for an AUTOINCREMENT table is automatically deleted
when the table is dropped.
If the sqlite_sequence row for an AUTOINCREMENT table does not exist when
the AUTOINCREMENT table is updated, then a new sqlite_sequence row is created.
(If the sqlite_sequence.seq value for an AUTOINCREMENT table is manually 
set to something other than an integer and there is a subsequent attempt to
insert the or update the AUTOINCREMENT table, then the behavior is undefined.)

<p>Application code is allowed to modify the sqlite_sequence table, to add
new rows, to delete rows, or to modify existing rows.  However, application
code cannot create the sqlite_sequence table if it does not already exist.
Application code can delete all entries from the sqlite_sequence table,
but application code cannot drop the sqlite_sequence table.

<tcl>hd_fragment stat1tab {sqlite_stat1} SQLITE_STAT1 </tcl>
<h3>The sqlite_stat1 table</h3>

<p>The sqlite_stat1 is an internal table created by the [ANALYZE] command
and used to hold supplemental information about tables and indexes that the
query planner can use to help it find better ways of performing queries.
Applications can update, delete from, insert into or drop the sqlite_stat1
table, but may not create or alter the sqlite_stat1 table.
(The schema of the sqlite_stat1 table is as follows:

<blockquote><pre>
CREATE TABLE sqlite_stat1(tbl,idx,stat);
</pre></blockquote>)

<p> (There is normally one row per index, with the index identified by the
name in the sqlite_stat1.idx column.)  (The sqlite_stat1.tbl column is
the name of the table to which the index belongs.)  (In each such row, 
the sqlite_stat.stat column will be
a string consisting of a list of integers followed by zero or more 
arguments.)  The first integer in this
list is the approximate number of rows in the index.  (The number of
rows in the index is the same as the number of rows in the table,
except for [partial indexes].)
The second integer is the approximate number of rows in the index
that have the same value in the first column of the index.  The third 
integer is the number of rows in the index that have 
the same value for the first two columns.  The N-th integer (for N>1) 
is the estimated average number of rows in 
the index which have the same value for the first N-1 columns.  For
a K-column index, there will be K+1 integers in the stat column.  If
the index is unique, then the last integer will be 1.

<p>The list of integers in the stat column can optionally be followed
by arguments, each of which is a sequence of non-space characters.
All arguments are preceded by a single space.
Unrecognized arguments are silently ignored.

<p>If the "unordered" argument is present, then the query planner assumes
that the index is unordered and will not use the index for a range query
or for sorting.

<p>The "sz=NNN" argument (where NNN represents a sequence of 1 or more digits)
means that the average row size over all records of the table or
index is NNN bytes per row.  The SQLite query planner might use the
estimated row size information provided by the "sz=NNN" token
to help it choose smaller tables and indexes that require less disk I/O.

<p>(The presence of the "noskipscan" token on the sqlite_stat1.stat field
of an index prevents that index from being used with the
[skip-scan optimization].)

<p>New text tokens may be added to the end of the stat column in future
enhancements to SQLite.  For compatibility, unrecognized tokens at the end
of the stat column are silently ignored.

<p>(If the sqlite_stat1.idx column is NULL, then the sqlite_stat1.stat
column contains a single integer which is the approximate number of
rows in the table identified by sqlite_stat1.tbl.)
(If the sqlite_stat1.idx column is the same as the sqlite_stat1.tbl
column, then the table is a [WITHOUT ROWID] table and the sqlite_stat1.stat
field contains information about the index btree that implements the
WITHOUT ROWID table.)

<tcl>hd_fragment stat2tab {sqlite_stat2}</tcl>
<h3>The sqlite_stat2 table</h3>

<p>The sqlite_stat2 is only created and is only used if SQLite is compiled
with SQLITE_ENABLE_STAT2 and if the SQLite version number is between
3.6.18 ([dateof:3.6.18]) and 3.7.8 ([dateof:3.7.8]).
1485
1486
1487
1488
1489
1490
1491
1492
1493
1494
1495
1496
1497
1498
1499
1500
1501
1502
1503
1504
1505
1506
1507
1508
1509
1510
1511
1512
1513
1514
1515
1516
1517
1518
1519
1520
1521
1522
1523
1524
1525
1526
1527
1528
1529
1530
1531
1532
1533
1534
1535
1536
1537
1538
1539
1540
1541
1542
1543
1544
1545
1546
1547
1548
1549
1550
1551
1552
1553
1554
1555
1556
1557
1558
1559
1560
1561
1562
1563
1564
1565
1566
1567
1568
1569
1570
1571
1572
1573
1574
1575
1576
1577
1578
1579
1580
1581
1582
1583
1584
1585
1586
1587
1588
1589
1590
1591
1592
1593
1594
1595
1596
1597
1598
1599
1600
1601
1602
1603
1604
1605
1606
1607
1608
1609
1610
1611
1612
1613
1614
1615
1616
1617
1618
1619
1620
1621
1622
1623
1624
1625
1626
1627
1628
1629
1630
1631
1632
1633
1634
1635
1636
1637
1638
1639
1640
1641
1642
1643
1644
1645
1646
1647
version of SQLite before 3.7.9.
If the [SQLITE_ENABLE_STAT4] compile-time option is used and the
SQLite version number is 3.8.1 ([dateof:3.8.1]) or greater,
then sqlite_stat3 might be read but not written.
The sqlite_stat3 table contains additional information
about the distribution of keys within an index, information that the
query planner can use to devise better and faster query algorithms.
^(The schema of the sqlite_stat3 table is as follows:

<blockquote><pre>
CREATE TABLE sqlite_stat3(tbl,idx,nEq,nLt,nDLt,sample);
</pre></blockquote>)^

<p>There are usually multiple entries in the sqlite_stat3 table for each index.
^(The sqlite_stat3.sample column holds the value of the left-most field of an
index identified by sqlite_stat3.idx and sqlite_stat3.tbl.)^
^(The sqlite_stat3.nEq column holds the approximate
number of entries in the index whose left-most column exactly matches
the sample.)^
^(The sqlite_stat3.nLt holds the approximate number of entries in the
index whose left-most column is less than the sample.)^
^(The sqlite_stat3.nDLt column holds the approximate
number of distinct left-most entries in the index that are less than
the sample.)^

<p>^There can be an arbitrary number of sqlite_stat3 entries per index.
The [ANALYZE] command will typically generate sqlite_stat3 tables
that contain between 10 and 40 samples that are distributed across
the key space and with large nEq values.

<p>^(In a well-formed sqlite_stat3 table, the samples for any single
index must appear in the same order that they occur in the index.  
In other words, if the entry with left-most column S1 is earlier in
the index b-tree than the
entry with left-most column S2, then in the sqlite_stat3 table, 
sample S1 must have a smaller rowid than sample S2.)^

<tcl>hd_fragment stat4tab {sqlite_stat4} SQLITE_STAT4</tcl>
<h3>The sqlite_stat4 table</h3>

<p>The sqlite_stat4 is only created and is only used if SQLite is compiled
with [SQLITE_ENABLE_STAT4] and if the SQLite version number is
3.8.1 ([dateof:3.8.1]) or greater.
The sqlite_stat4 table is neither read nor written by any
version of SQLite before 3.8.1.
The sqlite_stat4 table contains additional information
about the distribution of keys within an index or the distribution of
keys in the primary key of a [WITHOUT ROWID] table.
The query planner can sometimes use the additional information in
the sqlite_stat4 table to devise better and faster query algorithms.
^(The schema of the sqlite_stat4 table is as follows:

<blockquote><pre>
CREATE TABLE sqlite_stat4(tbl,idx,nEq,nLt,nDLt,sample);
</pre></blockquote>)^

<p>There are typically between 10 to 40 entries in the sqlite_stat4 table for
each index for which statistics are available, however these limits are
not hard bounds.
The meanings of the columns in the sqlite_stat4 table are as follows:

<center>
<table border="0" width="100%" cellpadding="10">
<tr><td valign="top" align="right">tbl:</td>
    <td>^(The sqlite_stat4.tbl column holds name of the table that owns
    the index that the row describes)^

<tr><td valign="top" align="right">idx:</td>
    <td>^(The sqlite_stat4.idx column holds name of the index that the
    row describes, or in the case of
    an sqlite_stat4 entry for a [WITHOUT ROWID] table, the
    name of the table itself.)^

<tr><td valign="top" align="right">sample:</td>
    <td>^(The sqlite_stat4.sample column holds a BLOB
    in the [record format] that encodes the indexed columns followed by
    the rowid for a rowid table or by the columns of the primary key 
    for a WITHOUT ROWID table.)^
    ^(The sqlite_stat4.sample BLOB for the WITHOUT ROWID table itself 
    contains just the columns of the primary key.)^
    Let the number of columns encoded by the sqlite_stat4.sample blob be N.
    ^For indexes on an ordinary rowid table, N will be one more than the number 
    of columns indexed.
    ^For indexes on WITHOUT ROWID tables, N will be the number of columns
    indexed plus the number of columns in the primary key.
    ^For a WITHOUT ROWID table, N will be the number of columns in the
    primary key.

<tr><td valign="top" align="right">nEq:</td>
    <td>^(The sqlite_stat4.nEq column holds a list of N integers where 
    the K-th integer is the approximate number of entries in the index
    whose left-most K columns exactly match the K left-most columns
    of the sample.)^

<tr><td valign="top" align="right">nLt:</td>
    <td>^(The sqlite_stat4.nLt column holds a list of N integers where
    the K-th integer is the approximate number of entries in the
    index whose K left-most columns are collectively less than the 
    K left-most columns of the sample.)^

<tr><td valign="top" align="right">nDLt:</td>
    <td>^(The sqlite_stat4.nDLt column holds a list of N integers where
    the K-th integer is the approximate
    number of entries in the index that are distinct in the first K columns and
    where the left-most K columns are collectively less than the left-most
    K columns of the sample.)^
</table>
</center>

<p>The sqlite_stat4 is a generalization of the sqlite_stat3 table.  The
sqlite_stat3 table provides information about the left-most column of an
index whereas the sqlite_stat4 table provides information about all columns
of the index.

<p>^There can be an arbitrary number of sqlite_stat4 entries per index.
The [ANALYZE] command will typically generate sqlite_stat4 tables
that contain between 10 and 40 samples that are distributed across
the key space and with large nEq values.

<p>^(In a well-formed sqlite_stat4 table, the samples for any single
index must appear in the same order that they occur in the index.  
In other words, if entry S1 is earlier in the index b-tree than 
entry S2, then in the sqlite_stat4 table, sample S1 must have a
smaller rowid than sample S2.)^

<tcl>hd_fragment rollbackjournal {rollback journal format}</tcl>
<h1>The Rollback Journal</h1>

<p>The rollback journal is a file associated with each SQLite database
file that holds information used to restore the database file to its initial
state during the course of a transaction.
^The rollback journal file is always located in the same 
directory as the database
file and has the same name as the database file but with the string
"<tt>-journal</tt>" appended.  There can only be a single rollback journal
associated with a give database and hence there can only be one write
transaction open against a single database at one time.</p>

<p>If a transaction is aborted due to an application crash, an operating
system crash, or a hardware power failure or crash, then the database may
be left in an inconsistent state.  ^The next time SQLite attempts to open
the database file, the presence of the rollback journal file will be 
detected and the journal will be automatically played back to restore the
database to its state at the start of the incomplete transaction.</p>

<p>^A rollback journal is only considered to be valid if it exists and
contains a valid header.  Hence a transaction can be committed in one
of three ways:
<ol>
<li>^(The rollback journal file can be deleted)^,
<li>^(The rollback journal file can be truncated to zero length)^, or
<li>^(The header of the rollback journal can be overwritten with
invalid header text (for example, all zeros).)^
</ol>
<p>
^These three ways of committing a transaction correspond to the DELETE,
TRUNCATE, and PERSIST settings, respectively, of the [journal_mode pragma].
</p>


<p>A valid rollback journal begins with a header in the following format:</p>

<center>







|



|


|
|
|

|
|
|
|

|

|




|




|














|



|









|
|


|


|


|


|
|
|

|

|

|



|


|


|


|


|



|








|




|



|







|








|




|



|
|
|
|


|







1485
1486
1487
1488
1489
1490
1491
1492
1493
1494
1495
1496
1497
1498
1499
1500
1501
1502
1503
1504
1505
1506
1507
1508
1509
1510
1511
1512
1513
1514
1515
1516
1517
1518
1519
1520
1521
1522
1523
1524
1525
1526
1527
1528
1529
1530
1531
1532
1533
1534
1535
1536
1537
1538
1539
1540
1541
1542
1543
1544
1545
1546
1547
1548
1549
1550
1551
1552
1553
1554
1555
1556
1557
1558
1559
1560
1561
1562
1563
1564
1565
1566
1567
1568
1569
1570
1571
1572
1573
1574
1575
1576
1577
1578
1579
1580
1581
1582
1583
1584
1585
1586
1587
1588
1589
1590
1591
1592
1593
1594
1595
1596
1597
1598
1599
1600
1601
1602
1603
1604
1605
1606
1607
1608
1609
1610
1611
1612
1613
1614
1615
1616
1617
1618
1619
1620
1621
1622
1623
1624
1625
1626
1627
1628
1629
1630
1631
1632
1633
1634
1635
1636
1637
1638
1639
1640
1641
1642
1643
1644
1645
1646
1647
version of SQLite before 3.7.9.
If the [SQLITE_ENABLE_STAT4] compile-time option is used and the
SQLite version number is 3.8.1 ([dateof:3.8.1]) or greater,
then sqlite_stat3 might be read but not written.
The sqlite_stat3 table contains additional information
about the distribution of keys within an index, information that the
query planner can use to devise better and faster query algorithms.
(The schema of the sqlite_stat3 table is as follows:

<blockquote><pre>
CREATE TABLE sqlite_stat3(tbl,idx,nEq,nLt,nDLt,sample);
</pre></blockquote>)

<p>There are usually multiple entries in the sqlite_stat3 table for each index.
(The sqlite_stat3.sample column holds the value of the left-most field of an
index identified by sqlite_stat3.idx and sqlite_stat3.tbl.)
(The sqlite_stat3.nEq column holds the approximate
number of entries in the index whose left-most column exactly matches
the sample.)
(The sqlite_stat3.nLt holds the approximate number of entries in the
index whose left-most column is less than the sample.)
(The sqlite_stat3.nDLt column holds the approximate
number of distinct left-most entries in the index that are less than
the sample.)

<p>There can be an arbitrary number of sqlite_stat3 entries per index.
The [ANALYZE] command will typically generate sqlite_stat3 tables
that contain between 10 and 40 samples that are distributed across
the key space and with large nEq values.

<p>(In a well-formed sqlite_stat3 table, the samples for any single
index must appear in the same order that they occur in the index.  
In other words, if the entry with left-most column S1 is earlier in
the index b-tree than the
entry with left-most column S2, then in the sqlite_stat3 table, 
sample S1 must have a smaller rowid than sample S2.)

<tcl>hd_fragment stat4tab {sqlite_stat4} SQLITE_STAT4</tcl>
<h3>The sqlite_stat4 table</h3>

<p>The sqlite_stat4 is only created and is only used if SQLite is compiled
with [SQLITE_ENABLE_STAT4] and if the SQLite version number is
3.8.1 ([dateof:3.8.1]) or greater.
The sqlite_stat4 table is neither read nor written by any
version of SQLite before 3.8.1.
The sqlite_stat4 table contains additional information
about the distribution of keys within an index or the distribution of
keys in the primary key of a [WITHOUT ROWID] table.
The query planner can sometimes use the additional information in
the sqlite_stat4 table to devise better and faster query algorithms.
(The schema of the sqlite_stat4 table is as follows:

<blockquote><pre>
CREATE TABLE sqlite_stat4(tbl,idx,nEq,nLt,nDLt,sample);
</pre></blockquote>)

<p>There are typically between 10 to 40 entries in the sqlite_stat4 table for
each index for which statistics are available, however these limits are
not hard bounds.
The meanings of the columns in the sqlite_stat4 table are as follows:

<center>
<table border="0" width="100%" cellpadding="10">
<tr><td valign="top" align="right">tbl:</td>
    <td>(The sqlite_stat4.tbl column holds name of the table that owns
    the index that the row describes)

<tr><td valign="top" align="right">idx:</td>
    <td>(The sqlite_stat4.idx column holds name of the index that the
    row describes, or in the case of
    an sqlite_stat4 entry for a [WITHOUT ROWID] table, the
    name of the table itself.)

<tr><td valign="top" align="right">sample:</td>
    <td>(The sqlite_stat4.sample column holds a BLOB
    in the [record format] that encodes the indexed columns followed by
    the rowid for a rowid table or by the columns of the primary key 
    for a WITHOUT ROWID table.)
    (The sqlite_stat4.sample BLOB for the WITHOUT ROWID table itself 
    contains just the columns of the primary key.)
    Let the number of columns encoded by the sqlite_stat4.sample blob be N.
    For indexes on an ordinary rowid table, N will be one more than the number 
    of columns indexed.
    For indexes on WITHOUT ROWID tables, N will be the number of columns
    indexed plus the number of columns in the primary key.
    For a WITHOUT ROWID table, N will be the number of columns in the
    primary key.

<tr><td valign="top" align="right">nEq:</td>
    <td>(The sqlite_stat4.nEq column holds a list of N integers where 
    the K-th integer is the approximate number of entries in the index
    whose left-most K columns exactly match the K left-most columns
    of the sample.)

<tr><td valign="top" align="right">nLt:</td>
    <td>(The sqlite_stat4.nLt column holds a list of N integers where
    the K-th integer is the approximate number of entries in the
    index whose K left-most columns are collectively less than the 
    K left-most columns of the sample.)

<tr><td valign="top" align="right">nDLt:</td>
    <td>(The sqlite_stat4.nDLt column holds a list of N integers where
    the K-th integer is the approximate
    number of entries in the index that are distinct in the first K columns and
    where the left-most K columns are collectively less than the left-most
    K columns of the sample.)
</table>
</center>

<p>The sqlite_stat4 is a generalization of the sqlite_stat3 table.  The
sqlite_stat3 table provides information about the left-most column of an
index whereas the sqlite_stat4 table provides information about all columns
of the index.

<p>There can be an arbitrary number of sqlite_stat4 entries per index.
The [ANALYZE] command will typically generate sqlite_stat4 tables
that contain between 10 and 40 samples that are distributed across
the key space and with large nEq values.

<p>(In a well-formed sqlite_stat4 table, the samples for any single
index must appear in the same order that they occur in the index.  
In other words, if entry S1 is earlier in the index b-tree than 
entry S2, then in the sqlite_stat4 table, sample S1 must have a
smaller rowid than sample S2.)

<tcl>hd_fragment rollbackjournal {rollback journal format}</tcl>
<h1>The Rollback Journal</h1>

<p>The rollback journal is a file associated with each SQLite database
file that holds information used to restore the database file to its initial
state during the course of a transaction.
The rollback journal file is always located in the same 
directory as the database
file and has the same name as the database file but with the string
"<tt>-journal</tt>" appended.  There can only be a single rollback journal
associated with a give database and hence there can only be one write
transaction open against a single database at one time.</p>

<p>If a transaction is aborted due to an application crash, an operating
system crash, or a hardware power failure or crash, then the database may
be left in an inconsistent state.  The next time SQLite attempts to open
the database file, the presence of the rollback journal file will be 
detected and the journal will be automatically played back to restore the
database to its state at the start of the incomplete transaction.</p>

<p>A rollback journal is only considered to be valid if it exists and
contains a valid header.  Hence a transaction can be committed in one
of three ways:
<ol>
<li>(The rollback journal file can be deleted),
<li>(The rollback journal file can be truncated to zero length), or
<li>(The header of the rollback journal can be overwritten with
invalid header text (for example, all zeros).)
</ol>
<p>
These three ways of committing a transaction correspond to the DELETE,
TRUNCATE, and PERSIST settings, respectively, of the [journal_mode pragma].
</p>


<p>A valid rollback journal begins with a header in the following format:</p>

<center>
1668
1669
1670
1671
1672
1673
1674
1675
1676
1677
1678
1679
1680
1681
1682
1683
1684
1685
1686
1687
1688
1689
1690
        journal.
<tr><td valign=top align=center>24
    <td valign=top align=center>4
    <td>Size of pages in this journal.
</table>
</center>

<p>^A rollback journal header is padded with zeros out to the size of a 
single sector (as defined by the sector size integer at offset 20).
The header is in a sector by itself so that if a power loss occurs while
writing the sector, information that follows the header will be
(hopefully) undamaged.</p>

<p>^After the header and zero padding are zero or more page records.  ^Each
page record stores a copy of the content of a page from the database file
before it was changed.  ^The same page may not appear more than once
within a single rollback journal.
To rollback an incomplete transaction, a process
has merely to read the rollback journal from beginning to end and
write pages found in the journal back into the database file at the
appropriate location.</p>

<p>Let the database page size (the value of the integer at offset 24 







|





|

|







1668
1669
1670
1671
1672
1673
1674
1675
1676
1677
1678
1679
1680
1681
1682
1683
1684
1685
1686
1687
1688
1689
1690
        journal.
<tr><td valign=top align=center>24
    <td valign=top align=center>4
    <td>Size of pages in this journal.
</table>
</center>

<p>A rollback journal header is padded with zeros out to the size of a 
single sector (as defined by the sector size integer at offset 20).
The header is in a sector by itself so that if a power loss occurs while
writing the sector, information that follows the header will be
(hopefully) undamaged.</p>

<p>After the header and zero padding are zero or more page records.  Each
page record stores a copy of the content of a page from the database file
before it was changed.  The same page may not appear more than once
within a single rollback journal.
To rollback an incomplete transaction, a process
has merely to read the rollback journal from beginning to end and
write pages found in the journal back into the database file at the
appropriate location.</p>

<p>Let the database page size (the value of the integer at offset 24 
1704
1705
1706
1707
1708
1709
1710
1711
1712
1713
1714
1715
1716
1717
1718
1719
1720
1721
1722
1723
1724
1725
1726
1727
1728
1729
1730
1731
1732
1733
1734
1735
1736
1737
1738
1739
1740
1741
1742
1743
1744
1745
1746
1747
1748
1749
1750
1751
1752
1753
1754
1755
1756
1757
1758
1759
1760
1761
1762
1763
1764
1765
1766
1767
1768
1769
1770
1771
1772
1773
1774
1775
1776
1777
1778
1779
1780
1781
1782
1783
<tr><td valign=top align=center>N+4
    <td valign=top align=center>4
    <td>Checksum
</table>
</center>


<p>^(The checksum is an unsigned 32-bit integer computed as follows:</p>

<ol>
<li>Initialize the checksum to the checksum nonce value found in the
journal header at offset 12.
<li>Initialize index X to be N-200 (where N is the size of a database page
in bytes.
<li>Interpret the byte at offset X into the page as an 8-bit unsigned integer
 and add the value of that integer to the checksum.
<li>Subtract 200 from X.
<li>If X is greater than or equal to zero, go back to step 3.
</ol>)^

<p>The checksum value is used to guard against incomplete writes of
a journal page record following a power failure.  A different random nonce
is used each time a transaction is started in order to minimize the risk
that unwritten sectors might by chance contain data from the same page
that was a part of prior journals.  By changing the nonce for each
transaction, stale data on disk will still generate an incorrect checksum
and be detected with high probability.  The checksum only uses a sparse sample
of 32-bit words from the data record for performance reasons - design studies 
during the planning phases of SQLite 3.0.0 showed
a significant performance hit in checksumming the entire page.</p>

<p>Let the page count value at offset 8 in the journal header be M.
^If M is greater than zero then after M page records the journal file
may be zero padded out to the next multiple of the sector size and another
journal header may be inserted.  ^All journal headers within the same
journal must contain the same database page size and sector size.</p>

<p>^If M is -1 in the initial journal header, then the number of page records
that follow is computed by computing how many page records will fit in
the available space of the remainder of the journal file.</p>

<tcl>hd_fragment walformat {WAL format} {WAL file format}</tcl>
<h1>The Write-Ahead Log</h1>

<p>Beginning with [version 3.7.0] ([dateof:3.7.0]), 
SQLite supports a new transaction
control mechanism called "[WAL | write-ahead log]" or "[WAL]".
^When a database is in WAL mode, all connections to that database must
use the WAL.  ^A particular database will use either a rollback journal
or a WAL, but not both at the same time.
^The WAL is always located in the same directory as the database
file and has the same name as the database file but with the string
"<tt>-wal</tt>" appended.</p>

<h2>WAL File Format</h2>

<p>A [WAL file] consists of a header followed by zero or more "frames".
Each frame records the revised content of a single page from the
database file.  All changes to the database are recorded by writing
frames into the WAL.  Transactions commit when a frame is written that
contains a commit marker.  ^A single WAL can and usually does record 
multiple transactions.  Periodically, the content of the WAL is
transferred back into the database file in an operation called a
"checkpoint".</p>

<p>^A single WAL file can be reused multiple times.  ^In other words, the
WAL can fill up with frames and then be checkpointed and then new
frames can overwrite the old ones.  ^A WAL always grows from beginning
toward the end.  Checksums and counters attached to each frame are
used to determine which frames within the WAL are valid and which
are leftovers from prior checkpoints.</p>

<p>^(The WAL header is 32 bytes in size and consists of the following eight
big-endian 32-bit unsigned integer values:</p>

<center>
<i>WAL Header Format</i><br>
<table width="80%" border=1>
<tr><th>Offset<th>Size<th>Description
<tr><td valign=top align=center>0<td valign=top align=center>4







|










|













|

|


|









|
|

|









|




|

|




|







1704
1705
1706
1707
1708
1709
1710
1711
1712
1713
1714
1715
1716
1717
1718
1719
1720
1721
1722
1723
1724
1725
1726
1727
1728
1729
1730
1731
1732
1733
1734
1735
1736
1737
1738
1739
1740
1741
1742
1743
1744
1745
1746
1747
1748
1749
1750
1751
1752
1753
1754
1755
1756
1757
1758
1759
1760
1761
1762
1763
1764
1765
1766
1767
1768
1769
1770
1771
1772
1773
1774
1775
1776
1777
1778
1779
1780
1781
1782
1783
<tr><td valign=top align=center>N+4
    <td valign=top align=center>4
    <td>Checksum
</table>
</center>


<p>(The checksum is an unsigned 32-bit integer computed as follows:</p>

<ol>
<li>Initialize the checksum to the checksum nonce value found in the
journal header at offset 12.
<li>Initialize index X to be N-200 (where N is the size of a database page
in bytes.
<li>Interpret the byte at offset X into the page as an 8-bit unsigned integer
 and add the value of that integer to the checksum.
<li>Subtract 200 from X.
<li>If X is greater than or equal to zero, go back to step 3.
</ol>)

<p>The checksum value is used to guard against incomplete writes of
a journal page record following a power failure.  A different random nonce
is used each time a transaction is started in order to minimize the risk
that unwritten sectors might by chance contain data from the same page
that was a part of prior journals.  By changing the nonce for each
transaction, stale data on disk will still generate an incorrect checksum
and be detected with high probability.  The checksum only uses a sparse sample
of 32-bit words from the data record for performance reasons - design studies 
during the planning phases of SQLite 3.0.0 showed
a significant performance hit in checksumming the entire page.</p>

<p>Let the page count value at offset 8 in the journal header be M.
If M is greater than zero then after M page records the journal file
may be zero padded out to the next multiple of the sector size and another
journal header may be inserted.  All journal headers within the same
journal must contain the same database page size and sector size.</p>

<p>If M is -1 in the initial journal header, then the number of page records
that follow is computed by computing how many page records will fit in
the available space of the remainder of the journal file.</p>

<tcl>hd_fragment walformat {WAL format} {WAL file format}</tcl>
<h1>The Write-Ahead Log</h1>

<p>Beginning with [version 3.7.0] ([dateof:3.7.0]), 
SQLite supports a new transaction
control mechanism called "[WAL | write-ahead log]" or "[WAL]".
When a database is in WAL mode, all connections to that database must
use the WAL.  A particular database will use either a rollback journal
or a WAL, but not both at the same time.
The WAL is always located in the same directory as the database
file and has the same name as the database file but with the string
"<tt>-wal</tt>" appended.</p>

<h2>WAL File Format</h2>

<p>A [WAL file] consists of a header followed by zero or more "frames".
Each frame records the revised content of a single page from the
database file.  All changes to the database are recorded by writing
frames into the WAL.  Transactions commit when a frame is written that
contains a commit marker.  A single WAL can and usually does record 
multiple transactions.  Periodically, the content of the WAL is
transferred back into the database file in an operation called a
"checkpoint".</p>

<p>A single WAL file can be reused multiple times.  In other words, the
WAL can fill up with frames and then be checkpointed and then new
frames can overwrite the old ones.  A WAL always grows from beginning
toward the end.  Checksums and counters attached to each frame are
used to determine which frames within the WAL are valid and which
are leftovers from prior checkpoints.</p>

<p>(The WAL header is 32 bytes in size and consists of the following eight
big-endian 32-bit unsigned integer values:</p>

<center>
<i>WAL Header Format</i><br>
<table width="80%" border=1>
<tr><th>Offset<th>Size<th>Description
<tr><td valign=top align=center>0<td valign=top align=center>4
1793
1794
1795
1796
1797
1798
1799
1800
1801
1802
1803
1804
1805
1806
1807
1808
1809
1810
1811
1812
1813
1814
1815
1816
1817
1818
1819
1820
1821
1822
1823
1824
1825
1826
1827
1828
1829
1830
1831
1832
1833
1834
1835
1836
1837
1838
1839
1840
1841
1842
1843
1844
1845
1846
1847
1848
1849
1850
1851
1852
1853
1854
1855
1856
1857
1858
1859
1860
1861
1862
1863
1864
1865
1866
1867
1868
1869
1870
1871
1872
1873
1874
1875
1876
1877
1878
1879
1880
1881
1882
1883
1884
1885
1886
1887
1888
1889
1890
1891
1892
1893
1894
1895
1896
1897
1898
1899
1900
1901
1902
1903
1904
1905
1906
1907
1908
1909
1910
1911
1912
1913
1914
1915
1916
1917
1918
1919
1920
1921
1922
1923
1924
1925
1926
1927
1928
1929
1930
1931
1932
1933
1934
1935
1936
1937
1938
1939
1940
1941
1942
1943
1944
1945
1946
1947
1948
1949
<tr><td valign=top align=center>20<td valign=top align=center>4
    <td>Salt-2: a different random number for each checkpoint
<tr><td valign=top align=center>24<td valign=top align=center>4
    <td>Checksum-1: First part of a checksum on the first 24 bytes of header
<tr><td valign=top align=center>28<td valign=top align=center>4
    <td>Checksum-2: Second part of the checksum on the first 24 bytes of header
</table>
</center>)^

<p>^Immediately following the wal-header are zero or more frames. ^Each
frame consists of a 24-byte frame-header followed by a <i>page-size</i> bytes
of page data. ^(The frame-header is six big-endian 32-bit unsigned 
integer values, as follows:

<center>
<i>WAL Frame Header Format</i><br>
<table width="80%" border=1>
<tr><th>Offset<th>Size<th>Description
<tr><td valign=top align=center>0<td valign=top align=center>4
    <td>Page number
<tr><td valign=top align=center>4<td valign=top align=center>4
    <td>For commit records, the size of the database file in pages
        after the commit.  For all other records, zero.
<tr><td valign=top align=center>8<td valign=top align=center>4
    <td>Salt-1 copied from the WAL header
<tr><td valign=top align=center>12<td valign=top align=center>4
    <td>Salt-2 copied from the WAL header
<tr><td valign=top align=center>16<td valign=top align=center>4
    <td>Checksum-1:  Cumulative checksum up through and including this page
<tr><td valign=top align=center>20<td valign=top align=center>4
    <td>Checksum-2:  Second half of the cumulative checksum.
</table>
</center>)^

^(<p>A frame is considered valid if and only if the following conditions are
true:</p>

<ol>
<li><p>The salt-1 and salt-2 values in the frame-header match
       salt values in the wal-header</p></li>

<li><p>The checksum values in the final 8 bytes of the frame-header
       exactly match the checksum computed consecutively on the
       first 24 bytes of the WAL header and the first 8 bytes and
       the content of all frames
       up to and including the current frame.</p></li></li>
</ol>)^

<tcl>hd_fragment walcksm {WAL checksum algorithm}</tcl>
<h2>Checksum Algorithm</h2>

<p>The checksum is computed by interpreting the input as
an even number of unsigned 32-bit integers: x(0) through x(N).
^The 32-bit integers are big-endian if the
magic number in the first 4 bytes of the WAL header is 0x377f0683 and
the integers are little-endian if the magic number is 0x377f0682.
^The checksum values are always stored in the frame header in a
big-endian format regardless of which byte order is used to compute
the checksum.</p>

<p>The checksum algorithm only works for content which is a multiple of
8 bytes in length.  In other words, if the inputs are x(0) through x(N)
then N must be odd.
^(The checksum algorithm is as follows:

<blockquote><pre> 
s0 = s1 = 0
for i from 0 to n-1 step 2:
   s0 += x(i) + s1;
   s1 += x(i+1) + s0;
endfor
# result in s0 and s1
</pre></blockquote>)^

<p>^The outputs s0 and s1 are both weighted checksums using Fibonacci weights
in reverse order.  (^The largest Fibonacci weight occurs on the first element
of the sequence being summed.)  ^The s1 value spans all 32-bit integer
terms of the sequence whereas s0 omits the final term.</p>

<h2>Checkpoint Algorithm</h2>

<p>^On a [checkpoint], the WAL is first flushed to persistent storage using
the xSync method of the [sqlite3_io_methods | VFS]. 
^Then valid content of the WAL is transferred into the database file.
^Finally, the database is flushed to persistent storage using another
xSync method call.
The xSync operations serve as write barriers - all writes launched
before the xSync must complete before any write that launches after the
xSync begins.</p>

<p>A checkpoint need not run to completion.  It might be that some
readers are still using older transactions with data that is contained
in the database file.  In that case, transferring content for newer
transactions from the WAL file into the database would delete the content
out from under readers still using the older transactions.  To avoid that,
checkpoints only run to completion if all reader are using the
last transaction in the WAL.

<tcl>hd_fragment walreset {reset the WAL} {WAL reset}</tcl>
<h2>WAL Reset</h2>

<p>^After a complete checkpoint, if no other connections are in transactions
that use the WAL, then subsequent write transactions can
overwrite the WAL file from the beginning.  This is called "resetting the
WAL".  ^At the start of the first new
write transaction, the WAL header salt-1 value is incremented
and the salt-2 value is randomized.  These changes to the salts invalidate
old frames in the WAL that have already been checkpointed but not yet
overwritten, and prevent them from being checkpointed again.</p>

<p>The WAL file can optionally be truncated on a reset, but it need not be.
Performance is usually a little better if the WAL is not truncated, since
filesystems generally will overwrite an existing file faster than they
will grow a file.

<tcl>hd_fragment walread {WAL read algorithm}</tcl>
<h2>Reader Algorithm</h2>

<p>^(To read a page from the database (call it page number P), a reader
first checks the WAL to see if it contains page P.  If so, then the
last valid instance of page P that is followed by a commit frame
or is a commit frame itself becomes the value read.)^  ^If the WAL
contains no copies of page P that are valid and which are a commit
frame or are followed by a commit frame, then page P is read from
the database file.</p>

<p>To start a read transaction, the reader records the number of value
frames in the WAL as "mxFrame".  ([mxFrame|More detail])
The reader uses this recorded mxFrame value
for all subsequent read operations.  New transactions can be appended
to the WAL, but as long as the reader uses its original mxFrame value
and ignores subsequently appended content, the reader will see a 
consistent snapshot of the database from a single point in time.  
^This technique allows multiple concurrent readers to view different 
versions of the database content simultaneously.</p>

<p>The reader algorithm in the previous paragraphs works correctly, but 
because frames for page P can appear anywhere within the WAL, the
reader has to scan the entire WAL looking for page P frames.  If the
WAL is large (multiple megabytes is typical) that scan can be slow,
and read performance suffers.  ^To overcome this problem, a separate
data structure called the wal-index is maintained to expedite the
search for frames of a particular page.</p>

<tcl>hd_fragment walindexformat</tcl>
<h2>WAL-Index Format</h2>

<p>Conceptually, the wal-index is shared memory, though the current
VFS implementations use a memory-mapped file for operating-system
portability.  ^The memory-mapped
file is in the same directory as the database and has the same name
as the database with a "<tt>-shm</tt>" suffix appended.  Because
the wal-index is shared memory, SQLite does not support 
[PRAGMA journal_mode | journal_mode=WAL] 
on a network filesystem when clients are on different machines, as
all clients of the database must be able to share the same memory.</p>








|

|

|




















|

|











|






|


|






|








|

|
|
|




|

|
|
















|


|













|


|











|






|








|







1793
1794
1795
1796
1797
1798
1799
1800
1801
1802
1803
1804
1805
1806
1807
1808
1809
1810
1811
1812
1813
1814
1815
1816
1817
1818
1819
1820
1821
1822
1823
1824
1825
1826
1827
1828
1829
1830
1831
1832
1833
1834
1835
1836
1837
1838
1839
1840
1841
1842
1843
1844
1845
1846
1847
1848
1849
1850
1851
1852
1853
1854
1855
1856
1857
1858
1859
1860
1861
1862
1863
1864
1865
1866
1867
1868
1869
1870
1871
1872
1873
1874
1875
1876
1877
1878
1879
1880
1881
1882
1883
1884
1885
1886
1887
1888
1889
1890
1891
1892
1893
1894
1895
1896
1897
1898
1899
1900
1901
1902
1903
1904
1905
1906
1907
1908
1909
1910
1911
1912
1913
1914
1915
1916
1917
1918
1919
1920
1921
1922
1923
1924
1925
1926
1927
1928
1929
1930
1931
1932
1933
1934
1935
1936
1937
1938
1939
1940
1941
1942
1943
1944
1945
1946
1947
1948
1949
<tr><td valign=top align=center>20<td valign=top align=center>4
    <td>Salt-2: a different random number for each checkpoint
<tr><td valign=top align=center>24<td valign=top align=center>4
    <td>Checksum-1: First part of a checksum on the first 24 bytes of header
<tr><td valign=top align=center>28<td valign=top align=center>4
    <td>Checksum-2: Second part of the checksum on the first 24 bytes of header
</table>
</center>)

<p>Immediately following the wal-header are zero or more frames. Each
frame consists of a 24-byte frame-header followed by a <i>page-size</i> bytes
of page data. (The frame-header is six big-endian 32-bit unsigned 
integer values, as follows:

<center>
<i>WAL Frame Header Format</i><br>
<table width="80%" border=1>
<tr><th>Offset<th>Size<th>Description
<tr><td valign=top align=center>0<td valign=top align=center>4
    <td>Page number
<tr><td valign=top align=center>4<td valign=top align=center>4
    <td>For commit records, the size of the database file in pages
        after the commit.  For all other records, zero.
<tr><td valign=top align=center>8<td valign=top align=center>4
    <td>Salt-1 copied from the WAL header
<tr><td valign=top align=center>12<td valign=top align=center>4
    <td>Salt-2 copied from the WAL header
<tr><td valign=top align=center>16<td valign=top align=center>4
    <td>Checksum-1:  Cumulative checksum up through and including this page
<tr><td valign=top align=center>20<td valign=top align=center>4
    <td>Checksum-2:  Second half of the cumulative checksum.
</table>
</center>)

(<p>A frame is considered valid if and only if the following conditions are
true:</p>

<ol>
<li><p>The salt-1 and salt-2 values in the frame-header match
       salt values in the wal-header</p></li>

<li><p>The checksum values in the final 8 bytes of the frame-header
       exactly match the checksum computed consecutively on the
       first 24 bytes of the WAL header and the first 8 bytes and
       the content of all frames
       up to and including the current frame.</p></li></li>
</ol>)

<tcl>hd_fragment walcksm {WAL checksum algorithm}</tcl>
<h2>Checksum Algorithm</h2>

<p>The checksum is computed by interpreting the input as
an even number of unsigned 32-bit integers: x(0) through x(N).
The 32-bit integers are big-endian if the
magic number in the first 4 bytes of the WAL header is 0x377f0683 and
the integers are little-endian if the magic number is 0x377f0682.
The checksum values are always stored in the frame header in a
big-endian format regardless of which byte order is used to compute
the checksum.</p>

<p>The checksum algorithm only works for content which is a multiple of
8 bytes in length.  In other words, if the inputs are x(0) through x(N)
then N must be odd.
(The checksum algorithm is as follows:

<blockquote><pre> 
s0 = s1 = 0
for i from 0 to n-1 step 2:
   s0 += x(i) + s1;
   s1 += x(i+1) + s0;
endfor
# result in s0 and s1
</pre></blockquote>)

<p>The outputs s0 and s1 are both weighted checksums using Fibonacci weights
in reverse order.  (The largest Fibonacci weight occurs on the first element
of the sequence being summed.)  The s1 value spans all 32-bit integer
terms of the sequence whereas s0 omits the final term.</p>

<h2>Checkpoint Algorithm</h2>

<p>On a [checkpoint], the WAL is first flushed to persistent storage using
the xSync method of the [sqlite3_io_methods | VFS]. 
Then valid content of the WAL is transferred into the database file.
Finally, the database is flushed to persistent storage using another
xSync method call.
The xSync operations serve as write barriers - all writes launched
before the xSync must complete before any write that launches after the
xSync begins.</p>

<p>A checkpoint need not run to completion.  It might be that some
readers are still using older transactions with data that is contained
in the database file.  In that case, transferring content for newer
transactions from the WAL file into the database would delete the content
out from under readers still using the older transactions.  To avoid that,
checkpoints only run to completion if all reader are using the
last transaction in the WAL.

<tcl>hd_fragment walreset {reset the WAL} {WAL reset}</tcl>
<h2>WAL Reset</h2>

<p>After a complete checkpoint, if no other connections are in transactions
that use the WAL, then subsequent write transactions can
overwrite the WAL file from the beginning.  This is called "resetting the
WAL".  At the start of the first new
write transaction, the WAL header salt-1 value is incremented
and the salt-2 value is randomized.  These changes to the salts invalidate
old frames in the WAL that have already been checkpointed but not yet
overwritten, and prevent them from being checkpointed again.</p>

<p>The WAL file can optionally be truncated on a reset, but it need not be.
Performance is usually a little better if the WAL is not truncated, since
filesystems generally will overwrite an existing file faster than they
will grow a file.

<tcl>hd_fragment walread {WAL read algorithm}</tcl>
<h2>Reader Algorithm</h2>

<p>(To read a page from the database (call it page number P), a reader
first checks the WAL to see if it contains page P.  If so, then the
last valid instance of page P that is followed by a commit frame
or is a commit frame itself becomes the value read.)  If the WAL
contains no copies of page P that are valid and which are a commit
frame or are followed by a commit frame, then page P is read from
the database file.</p>

<p>To start a read transaction, the reader records the number of value
frames in the WAL as "mxFrame".  ([mxFrame|More detail])
The reader uses this recorded mxFrame value
for all subsequent read operations.  New transactions can be appended
to the WAL, but as long as the reader uses its original mxFrame value
and ignores subsequently appended content, the reader will see a 
consistent snapshot of the database from a single point in time.  
This technique allows multiple concurrent readers to view different 
versions of the database content simultaneously.</p>

<p>The reader algorithm in the previous paragraphs works correctly, but 
because frames for page P can appear anywhere within the WAL, the
reader has to scan the entire WAL looking for page P frames.  If the
WAL is large (multiple megabytes is typical) that scan can be slow,
and read performance suffers.  To overcome this problem, a separate
data structure called the wal-index is maintained to expedite the
search for frames of a particular page.</p>

<tcl>hd_fragment walindexformat</tcl>
<h2>WAL-Index Format</h2>

<p>Conceptually, the wal-index is shared memory, though the current
VFS implementations use a memory-mapped file for operating-system
portability.  The memory-mapped
file is in the same directory as the database and has the same name
as the database with a "<tt>-shm</tt>" suffix appended.  Because
the wal-index is shared memory, SQLite does not support 
[PRAGMA journal_mode | journal_mode=WAL] 
on a network filesystem when clients are on different machines, as
all clients of the database must be able to share the same memory.</p>

1957
1958
1959
1960
1961
1962
1963
1964
1965
1966
1967
1968
1969
1970
1971
1972
1973
1974
1975
1976

<p>The <i>M</i> value in the previous paragraph is the "mxFrame" value
defined in [WAL read algorithm | section 4.4] that is read at the start 
of a transaction and which defines the maximum frame from the WAL that 
the reader will use.</p>

<p>The wal-index is transient.  After a crash, the wal-index is
reconstructed from the original WAL file.  ^The VFS is required
to either truncate or zero the header of the wal-index when the last
connection to it closes.  Because the wal-index is transient, it can
use an architecture-specific format; it does not have to be cross-platform.
Hence, unlike the database and WAL file formats which store all values
as big endian, the wal-index stores multi-byte values in the native
byte order of the host computer.</p>

<p>This document is concerned with the persistent state of the database
file, and since the wal-index is a transient structure, no further 
information about the format of the wal-index will be provided here.
Additional details on the format of the wal-index are contained in
the separate [WAL-index File Format] document.</p>







|












1957
1958
1959
1960
1961
1962
1963
1964
1965
1966
1967
1968
1969
1970
1971
1972
1973
1974
1975
1976

<p>The <i>M</i> value in the previous paragraph is the "mxFrame" value
defined in [WAL read algorithm | section 4.4] that is read at the start 
of a transaction and which defines the maximum frame from the WAL that 
the reader will use.</p>

<p>The wal-index is transient.  After a crash, the wal-index is
reconstructed from the original WAL file.  The VFS is required
to either truncate or zero the header of the wal-index when the last
connection to it closes.  Because the wal-index is transient, it can
use an architecture-specific format; it does not have to be cross-platform.
Hence, unlike the database and WAL file formats which store all values
as big endian, the wal-index stores multi-byte values in the native
byte order of the host computer.</p>

<p>This document is concerned with the persistent state of the database
file, and since the wal-index is a transient structure, no further 
information about the format of the wal-index will be provided here.
Additional details on the format of the wal-index are contained in
the separate [WAL-index File Format] document.</p>