Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Fix issues and clarify the operation of pager_playback_one_page(). A block comment in pager.c identifies 13 invariants on the pager subsystem. Ticket [9d68c883132c8]. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA1: |
0906597698b697ab2993a460f257e326 |
User & Date: | drh 2010-04-10 17:52:58.000 |
References
2010-06-26
| ||
16:30 | • Ticket [d11f09d36e] Cache corruption caused by ROLLBACK with synchronous=NORMAL status still Open with 1 other change (artifact: 16a03c7557 user: dan) | |
Context
2010-04-12
| ||
14:51 | Reset the simulated device in the test harness to its default configuration whenever it is restarted. (check-in: 562d20e662 user: drh tags: trunk) | |
2010-04-10
| ||
17:52 | Fix issues and clarify the operation of pager_playback_one_page(). A block comment in pager.c identifies 13 invariants on the pager subsystem. Ticket [9d68c883132c8]. (check-in: 0906597698 user: drh tags: trunk) | |
2010-04-09
| ||
23:05 | Add a test case for the OOM-fault corruption issue. Ticket [9d68c883132c8]. (check-in: 0a64a937b5 user: drh tags: trunk) | |
Changes
Changes to src/pager.c.
︙ | ︙ | |||
17 18 19 20 21 22 23 24 25 26 27 28 29 30 | ** locking to prevent two processes from writing the same database ** file simultaneously, or one process from reading the database while ** another is writing. */ #ifndef SQLITE_OMIT_DISKIO #include "sqliteInt.h" /* ** Macros for troubleshooting. Normally turned off */ #if 0 int sqlite3PagerTrace=1; /* True to enable tracing */ #define sqlite3DebugPrintf printf #define PAGERTRACE(X) if( sqlite3PagerTrace ){ sqlite3DebugPrintf X; } | > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 | ** locking to prevent two processes from writing the same database ** file simultaneously, or one process from reading the database while ** another is writing. */ #ifndef SQLITE_OMIT_DISKIO #include "sqliteInt.h" /* ******************** NOTES ON THE DESIGN OF THE PAGER ************************ ** ** Within this comment block, a page is deemed to have been synced ** automatically as soon as it is written when PRAGMA synchronous=OFF. ** Otherwise, the page is not synced until the xSync method of the VFS ** is called successfully on the file containing the page. ** ** Definition: A page of the database file is said to be "overwriteable" if ** one or more of the following are true about the page: ** ** (a) The original content of the page as it was at the beginning of ** the transaction has been written into the rollback journal and ** synced. ** ** (b) The page was a freelist leaf page at the start of the transaction. ** ** (c) The page number is greater than the largest page that existed in ** the database file at the start of the transaction. ** ** (1) A page of the database file is never overwritten unless one of the ** following are true: ** ** (a) The page and all other pages on the same sector are overwriteable. ** ** (b) The atomic page write optimization is enabled, and the entire ** transaction other than the update of the transaction sequence ** number consists of a single page change. ** ** (2) The content of a page written into the rollback journal exactly matches ** both the content in the database when the rollback journal was written ** and the content in the database at the beginning of the current ** transaction. ** ** (3) Writes to the database file are an integer multiple of the page size ** in length and are aligned to a page boundary. ** ** (4) Reads from the database file are either aligned on a page boundary and ** an integer multiple of the page size in length or are taken from the ** first 100 bytes of the database file. ** ** (5) All writes to the database file are synced prior to the rollback journal ** being deleted, truncated, or zeroed. ** ** (6) If a master journal file is used, then all writes to the database file ** are synced prior to the master journal being deleted. ** ** Definition: Two databases (or the same database at two points it time) ** are said to be "logically equivalent" if they give the same answer to ** all queries. Note in particular the the content of freelist leaf ** pages can be changed arbitarily without effecting the logical equivalence ** of the database. ** ** (7) At any time, if any subset, including the empty set and the total set, ** of the unsynced changes to a rollback journal are removed and the ** journal is rolled back, the resulting database file will be logical ** equivalent to the database file at the beginning of the transaction. ** ** (8) When a transaction is rolled back, the xTruncate method of the VFS ** is called to restore the database file to the same size it was at ** the beginning of the transaction. (In some VFSes, the xTruncate ** method is a no-op, but that does not change the fact the SQLite will ** invoke it.) ** ** (9) Whenever the database file is modified, at least one bit in the range ** of bytes from 24 through 39 inclusive will be changed prior to releasing ** the EXCLUSIVE lock. ** ** (10) The pattern of bits in bytes 24 through 39 shall not repeat in less ** than one billion transactions. ** ** (11) A database file is well-formed at the beginning and at the conclusion ** of every transaction. ** ** (12) An EXCLUSIVE lock is held on the database file when writing to ** the database file. ** ** (13) A SHARED lock is held on the database file while reading any ** content out of the database file. */ /* ** Macros for troubleshooting. Normally turned off */ #if 0 int sqlite3PagerTrace=1; /* True to enable tracing */ #define sqlite3DebugPrintf printf #define PAGERTRACE(X) if( sqlite3PagerTrace ){ sqlite3DebugPrintf X; } |
︙ | ︙ | |||
193 194 195 196 197 198 199 | ** It is cleared at the end of each transaction. ** ** It is used when committing or otherwise ending a transaction. If ** the dbModified flag is clear then less work has to be done. ** ** journalStarted ** | | > | 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 | ** It is cleared at the end of each transaction. ** ** It is used when committing or otherwise ending a transaction. If ** the dbModified flag is clear then less work has to be done. ** ** journalStarted ** ** This flag is set whenever the the main journal is opened and ** initialized ** ** The point of this flag is that it must be set after the ** first journal header in a journal file has been synced to disk. ** After this has happened, new pages appended to the database ** do not need the PGHDR_NEED_SYNC flag set, as they do not need ** to wait for a journal sync before they can be written out to ** the database file (see function pager_write()). |
︙ | ︙ | |||
219 220 221 222 223 224 225 | ** may attempt to commit the transaction again later (calling ** CommitPhaseOne() again). This flag is used to ensure that the ** master journal name is only written to the journal file the first ** time CommitPhaseOne() is called. ** ** doNotSync ** | > | > > | 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 | ** may attempt to commit the transaction again later (calling ** CommitPhaseOne() again). This flag is used to ensure that the ** master journal name is only written to the journal file the first ** time CommitPhaseOne() is called. ** ** doNotSync ** ** When enabled, cache spills are prohibited and the journal file cannot ** be synced. This variable is set and cleared by sqlite3PagerWrite() ** in order to prevent a journal sync from happening in between the ** journalling of two pages on the same sector. ** ** needSync ** ** TODO: It might be easier to set this variable in writeJournalHdr() ** and writeMasterJournal() only. Change its meaning to "unsynced data ** has been written to the journal". ** |
︙ | ︙ | |||
279 280 281 282 283 284 285 286 287 288 289 290 291 292 | u32 nSubRec; /* Number of records written to sub-journal */ Bitvec *pInJournal; /* One bit for each page in the database file */ sqlite3_file *fd; /* File descriptor for database */ sqlite3_file *jfd; /* File descriptor for main journal */ sqlite3_file *sjfd; /* File descriptor for sub-journal */ i64 journalOff; /* Current write offset in the journal file */ i64 journalHdr; /* Byte offset to previous journal header */ PagerSavepoint *aSavepoint; /* Array of active savepoints */ int nSavepoint; /* Number of elements in aSavepoint[] */ char dbFileVers[16]; /* Changes whenever database file changes */ u32 sectorSize; /* Assumed sector size during rollback */ u16 nExtra; /* Add this many bytes to each in-memory page */ i16 nReserve; /* Number of unused bytes at end of each page */ | > | 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 | u32 nSubRec; /* Number of records written to sub-journal */ Bitvec *pInJournal; /* One bit for each page in the database file */ sqlite3_file *fd; /* File descriptor for database */ sqlite3_file *jfd; /* File descriptor for main journal */ sqlite3_file *sjfd; /* File descriptor for sub-journal */ i64 journalOff; /* Current write offset in the journal file */ i64 journalHdr; /* Byte offset to previous journal header */ i64 journalSizeLimit; /* Size limit for persistent journal files */ PagerSavepoint *aSavepoint; /* Array of active savepoints */ int nSavepoint; /* Number of elements in aSavepoint[] */ char dbFileVers[16]; /* Changes whenever database file changes */ u32 sectorSize; /* Assumed sector size during rollback */ u16 nExtra; /* Add this many bytes to each in-memory page */ i16 nReserve; /* Number of unused bytes at end of each page */ |
︙ | ︙ | |||
305 306 307 308 309 310 311 | #ifdef SQLITE_HAS_CODEC void *(*xCodec)(void*,void*,Pgno,int); /* Routine for en/decoding data */ void (*xCodecSizeChng)(void*,int,int); /* Notify of page size changes */ void (*xCodecFree)(void*); /* Destructor for the codec */ void *pCodec; /* First argument to xCodec... methods */ #endif char *pTmpSpace; /* Pager.pageSize bytes of space for tmp use */ | < | 391 392 393 394 395 396 397 398 399 400 401 402 403 404 | #ifdef SQLITE_HAS_CODEC void *(*xCodec)(void*,void*,Pgno,int); /* Routine for en/decoding data */ void (*xCodecSizeChng)(void*,int,int); /* Notify of page size changes */ void (*xCodecFree)(void*); /* Destructor for the codec */ void *pCodec; /* First argument to xCodec... methods */ #endif char *pTmpSpace; /* Pager.pageSize bytes of space for tmp use */ PCache *pPCache; /* Pointer to page cache object */ sqlite3_backup *pBackup; /* Pointer to list of ongoing backup processes */ }; /* ** The following global variables hold counters used for ** testing purposes only. These variables do not exist in |
︙ | ︙ | |||
821 822 823 824 825 826 827 828 829 830 831 832 833 834 | ** database page size. Since the zHeader buffer is only Pager.pageSize ** bytes in size, more than one call to sqlite3OsWrite() may be required ** to populate the entire journal header sector. */ for(nWrite=0; rc==SQLITE_OK&&nWrite<JOURNAL_HDR_SZ(pPager); nWrite+=nHeader){ IOTRACE(("JHDR %p %lld %d\n", pPager, pPager->journalHdr, nHeader)) rc = sqlite3OsWrite(pPager->jfd, zHeader, nHeader, pPager->journalOff); pPager->journalOff += nHeader; } return rc; } /* | > | 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 | ** database page size. Since the zHeader buffer is only Pager.pageSize ** bytes in size, more than one call to sqlite3OsWrite() may be required ** to populate the entire journal header sector. */ for(nWrite=0; rc==SQLITE_OK&&nWrite<JOURNAL_HDR_SZ(pPager); nWrite+=nHeader){ IOTRACE(("JHDR %p %lld %d\n", pPager, pPager->journalHdr, nHeader)) rc = sqlite3OsWrite(pPager->jfd, zHeader, nHeader, pPager->journalOff); assert( pPager->journalHdr <= pPager->journalOff ); pPager->journalOff += nHeader; } return rc; } /* |
︙ | ︙ | |||
979 980 981 982 983 984 985 986 987 988 989 990 991 992 | || pPager->journalMode==PAGER_JOURNALMODE_MEMORY || pPager->journalMode==PAGER_JOURNALMODE_OFF ){ return SQLITE_OK; } pPager->setMaster = 1; assert( isOpen(pPager->jfd) ); /* Calculate the length in bytes and the checksum of zMaster */ for(nMaster=0; zMaster[nMaster]; nMaster++){ cksum += zMaster[nMaster]; } /* If in full-sync mode, advance to the next disk sector before writing | > | 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 | || pPager->journalMode==PAGER_JOURNALMODE_MEMORY || pPager->journalMode==PAGER_JOURNALMODE_OFF ){ return SQLITE_OK; } pPager->setMaster = 1; assert( isOpen(pPager->jfd) ); assert( pPager->journalHdr <= pPager->journalOff ); /* Calculate the length in bytes and the checksum of zMaster */ for(nMaster=0; zMaster[nMaster]; nMaster++){ cksum += zMaster[nMaster]; } /* If in full-sync mode, advance to the next disk sector before writing |
︙ | ︙ | |||
1408 1409 1410 1411 1412 1413 1414 | ** ** If this is a savepoint rollback, then memory may have to be dynamically ** allocated by this function. If this is the case and an allocation fails, ** SQLITE_NOMEM is returned. */ static int pager_playback_one_page( Pager *pPager, /* The pager being played back */ | < < < | > > > | 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 | ** ** If this is a savepoint rollback, then memory may have to be dynamically ** allocated by this function. If this is the case and an allocation fails, ** SQLITE_NOMEM is returned. */ static int pager_playback_one_page( Pager *pPager, /* The pager being played back */ i64 *pOffset, /* Offset of record to playback */ Bitvec *pDone, /* Bitvec of pages already played back */ int isMainJrnl, /* 1 -> main journal. 0 -> sub-journal. */ int isSavepnt /* True for a savepoint rollback */ ){ int rc; PgHdr *pPg; /* An existing page in the cache */ Pgno pgno; /* The page number of a page in journal */ u32 cksum; /* Checksum used for sanity checking */ char *aData; /* Temporary storage for the page */ sqlite3_file *jfd; /* The file descriptor for the journal file */ int isSynced; /* True if journal page is synced */ assert( (isMainJrnl&~1)==0 ); /* isMainJrnl is 0 or 1 */ assert( (isSavepnt&~1)==0 ); /* isSavepnt is 0 or 1 */ assert( isMainJrnl || pDone ); /* pDone always used on sub-journals */ assert( isSavepnt || pDone==0 ); /* pDone never used on non-savepoint */ aData = pPager->pTmpSpace; |
︙ | ︙ | |||
1503 1504 1505 1506 1507 1508 1509 1510 | */ pPg = pager_lookup(pPager, pgno); assert( pPg || !MEMDB ); PAGERTRACE(("PLAYBACK %d page %d hash(%08x) %s\n", PAGERID(pPager), pgno, pager_datahash(pPager->pageSize, (u8*)aData), (isMainJrnl?"main-journal":"sub-journal") )); if( (pPager->state>=PAGER_EXCLUSIVE) | > > > > > < | | 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 | */ pPg = pager_lookup(pPager, pgno); assert( pPg || !MEMDB ); PAGERTRACE(("PLAYBACK %d page %d hash(%08x) %s\n", PAGERID(pPager), pgno, pager_datahash(pPager->pageSize, (u8*)aData), (isMainJrnl?"main-journal":"sub-journal") )); if( isMainJrnl ){ isSynced = pPager->noSync || (*pOffset <= pPager->journalHdr); }else{ isSynced = (pPg==0 || 0==(pPg->flags & PGHDR_NEED_SYNC)); } if( (pPager->state>=PAGER_EXCLUSIVE) && isOpen(pPager->fd) && isSynced ){ i64 ofst = (pgno-1)*(i64)pPager->pageSize; testcase( !isSavepnt && pPg!=0 && (pPg->flags&PGHDR_NEED_SYNC)!=0 ); rc = sqlite3OsWrite(pPager->fd, (u8*)aData, pPager->pageSize, ofst); if( pgno>pPager->dbFileSize ){ pPager->dbFileSize = pgno; } |
︙ | ︙ | |||
1558 1559 1560 1561 1562 1563 1564 | pData = pPg->pData; memcpy(pData, (u8*)aData, pPager->pageSize); pPager->xReiniter(pPg); if( isMainJrnl && (!isSavepnt || *pOffset<=pPager->journalHdr) ){ /* If the contents of this page were just restored from the main ** journal file, then its content must be as they were when the ** transaction was first opened. In this case we can mark the page | | > | 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 | pData = pPg->pData; memcpy(pData, (u8*)aData, pPager->pageSize); pPager->xReiniter(pPg); if( isMainJrnl && (!isSavepnt || *pOffset<=pPager->journalHdr) ){ /* If the contents of this page were just restored from the main ** journal file, then its content must be as they were when the ** transaction was first opened. In this case we can mark the page ** as clean, since there will be no need to write it out to the ** database. ** ** There is one exception to this rule. If the page is being rolled ** back as part of a savepoint (or statement) rollback from an ** unsynced portion of the main journal file, then it is not safe ** to mark the page as clean. This is because marking the page as ** clean will clear the PGHDR_NEED_SYNC flag. Since the page is ** already in the journal file (recorded in Pager.pInJournal) and |
︙ | ︙ | |||
1905 1906 1907 1908 1909 1910 1911 | needPagerReset = isHot; /* This loop terminates either when a readJournalHdr() or ** pager_playback_one_page() call returns SQLITE_DONE or an IO error ** occurs. */ while( 1 ){ | < < | 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 | needPagerReset = isHot; /* This loop terminates either when a readJournalHdr() or ** pager_playback_one_page() call returns SQLITE_DONE or an IO error ** occurs. */ while( 1 ){ /* Read the next journal header from the journal file. If there are ** not enough bytes left in the journal file for a complete header, or ** it is corrupted, then a process must of failed while writing it. ** This indicates nothing more needs to be rolled back. */ rc = readJournalHdr(pPager, isHot, szJ, &nRec, &mxPg); if( rc!=SQLITE_OK ){ |
︙ | ︙ | |||
1947 1948 1949 1950 1951 1952 1953 | ** the journal, it means that the journal might contain additional ** pages that need to be rolled back and that the number of pages ** should be computed based on the journal file size. */ if( nRec==0 && !isHot && pPager->journalHdr+JOURNAL_HDR_SZ(pPager)==pPager->journalOff ){ nRec = (int)((szJ - pPager->journalOff) / JOURNAL_PG_SZ(pPager)); | < | 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 2049 2050 | ** the journal, it means that the journal might contain additional ** pages that need to be rolled back and that the number of pages ** should be computed based on the journal file size. */ if( nRec==0 && !isHot && pPager->journalHdr+JOURNAL_HDR_SZ(pPager)==pPager->journalOff ){ nRec = (int)((szJ - pPager->journalOff) / JOURNAL_PG_SZ(pPager)); } /* If this is the first header read from the journal, truncate the ** database file back to its original size. */ if( pPager->journalOff==JOURNAL_HDR_SZ(pPager) ){ rc = pager_truncate(pPager, mxPg); |
︙ | ︙ | |||
1969 1970 1971 1972 1973 1974 1975 | ** database file and/or page cache. */ for(u=0; u<nRec; u++){ if( needPagerReset ){ pager_reset(pPager); needPagerReset = 0; } | | | 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 | ** database file and/or page cache. */ for(u=0; u<nRec; u++){ if( needPagerReset ){ pager_reset(pPager); needPagerReset = 0; } rc = pager_playback_one_page(pPager,&pPager->journalOff,0,1,0); if( rc!=SQLITE_OK ){ if( rc==SQLITE_DONE ){ rc = SQLITE_OK; pPager->journalOff = szJ; break; }else if( rc==SQLITE_IOERR_SHORT_READ ){ /* If the journal has been truncated, simply stop reading and |
︙ | ︙ | |||
2122 2123 2124 2125 2126 2127 2128 | ** will be skipped automatically. Pages are added to pDone as they ** are played back. */ if( pSavepoint ){ iHdrOff = pSavepoint->iHdrOffset ? pSavepoint->iHdrOffset : szJ; pPager->journalOff = pSavepoint->iOffset; while( rc==SQLITE_OK && pPager->journalOff<iHdrOff ){ | | | 2211 2212 2213 2214 2215 2216 2217 2218 2219 2220 2221 2222 2223 2224 2225 | ** will be skipped automatically. Pages are added to pDone as they ** are played back. */ if( pSavepoint ){ iHdrOff = pSavepoint->iHdrOffset ? pSavepoint->iHdrOffset : szJ; pPager->journalOff = pSavepoint->iOffset; while( rc==SQLITE_OK && pPager->journalOff<iHdrOff ){ rc = pager_playback_one_page(pPager, &pPager->journalOff, pDone, 1, 1); } assert( rc!=SQLITE_DONE ); }else{ pPager->journalOff = 0; } /* Continue rolling back records out of the main journal starting at |
︙ | ︙ | |||
2152 2153 2154 2155 2156 2157 2158 | */ if( nJRec==0 && pPager->journalHdr+JOURNAL_HDR_SZ(pPager)==pPager->journalOff ){ nJRec = (u32)((szJ - pPager->journalOff)/JOURNAL_PG_SZ(pPager)); } for(ii=0; rc==SQLITE_OK && ii<nJRec && pPager->journalOff<szJ; ii++){ | | | | 2241 2242 2243 2244 2245 2246 2247 2248 2249 2250 2251 2252 2253 2254 2255 2256 2257 2258 2259 2260 2261 2262 2263 2264 2265 2266 2267 2268 2269 2270 | */ if( nJRec==0 && pPager->journalHdr+JOURNAL_HDR_SZ(pPager)==pPager->journalOff ){ nJRec = (u32)((szJ - pPager->journalOff)/JOURNAL_PG_SZ(pPager)); } for(ii=0; rc==SQLITE_OK && ii<nJRec && pPager->journalOff<szJ; ii++){ rc = pager_playback_one_page(pPager, &pPager->journalOff, pDone, 1, 1); } assert( rc!=SQLITE_DONE ); } assert( rc!=SQLITE_OK || pPager->journalOff==szJ ); /* Finally, rollback pages from the sub-journal. Page that were ** previously rolled back out of the main journal (and are hence in pDone) ** will be skipped. Out-of-range pages are also skipped. */ if( pSavepoint ){ u32 ii; /* Loop counter */ i64 offset = pSavepoint->iSubRec*(4+pPager->pageSize); for(ii=pSavepoint->iSubRec; rc==SQLITE_OK && ii<pPager->nSubRec; ii++){ assert( offset==ii*(4+pPager->pageSize) ); rc = pager_playback_one_page(pPager, &offset, pDone, 0, 1); } assert( rc!=SQLITE_DONE ); } sqlite3BitvecDestroy(pDone); if( rc==SQLITE_OK ){ pPager->journalOff = szJ; |
︙ | ︙ | |||
2724 2725 2726 2727 2728 2729 2730 | if( 0==(iDc&SQLITE_IOCAP_SAFE_APPEND) ){ /* This block deals with an obscure problem. If the last connection ** that wrote to this database was operating in persistent-journal ** mode, then the journal file may at this point actually be larger ** than Pager.journalOff bytes. If the next thing in the journal ** file happens to be a journal-header (written as part of the | | | 2813 2814 2815 2816 2817 2818 2819 2820 2821 2822 2823 2824 2825 2826 2827 | if( 0==(iDc&SQLITE_IOCAP_SAFE_APPEND) ){ /* This block deals with an obscure problem. If the last connection ** that wrote to this database was operating in persistent-journal ** mode, then the journal file may at this point actually be larger ** than Pager.journalOff bytes. If the next thing in the journal ** file happens to be a journal-header (written as part of the ** previous connection's transaction), and a crash or power-failure ** occurs after nRec is updated but before this connection writes ** anything else to the journal file (or commits/rolls back its ** transaction), then SQLite may become confused when doing the ** hot-journal rollback following recovery. It may roll back all ** of this connections data, then proceed to rolling back the old, ** out-of-date data that follows it. Database corruption. ** |
︙ | ︙ | |||
2796 2797 2798 2799 2800 2801 2802 2803 2804 2805 2806 2807 2808 2809 | } /* The journal file was just successfully synced. Set Pager.needSync ** to zero and clear the PGHDR_NEED_SYNC flag on all pagess. */ pPager->needSync = 0; pPager->journalStarted = 1; sqlite3PcacheClearSyncFlags(pPager->pPCache); } return SQLITE_OK; } /* | > | 2885 2886 2887 2888 2889 2890 2891 2892 2893 2894 2895 2896 2897 2898 2899 | } /* The journal file was just successfully synced. Set Pager.needSync ** to zero and clear the PGHDR_NEED_SYNC flag on all pagess. */ pPager->needSync = 0; pPager->journalStarted = 1; pPager->journalHdr = pPager->journalOff; sqlite3PcacheClearSyncFlags(pPager->pPCache); } return SQLITE_OK; } /* |
︙ | ︙ | |||
3658 3659 3660 3661 3662 3663 3664 | } } } if( rc!=SQLITE_OK ){ goto failed; } | | > > > | > > > > > > > > > > | > | 3748 3749 3750 3751 3752 3753 3754 3755 3756 3757 3758 3759 3760 3761 3762 3763 3764 3765 3766 3767 3768 3769 3770 3771 3772 3773 3774 3775 3776 3777 3778 3779 3780 3781 3782 3783 3784 3785 3786 3787 3788 | } } } if( rc!=SQLITE_OK ){ goto failed; } /* Reset the journal status fields to indicates that we have no ** rollback journal at this time. */ pPager->journalStarted = 0; pPager->journalOff = 0; pPager->setMaster = 0; pPager->journalHdr = 0; /* Make sure the journal file has been synced to disk. */ /* Playback and delete the journal. Drop the database write ** lock and reacquire the read lock. Purge the cache before ** playing back the hot-journal so that we don't end up with ** an inconsistent cache. Sync the hot journal before playing ** it back since the process that crashed and left the hot journal ** probably did not sync it and we are required to always sync ** the journal before playing it back. */ if( isOpen(pPager->jfd) ){ if( !pPager->noSync ){ rc = sqlite3OsSync(pPager->jfd, SQLITE_SYNC_NORMAL); } if( rc==SQLITE_OK ){ rc = sqlite3OsFileSize(pPager->jfd, &pPager->journalHdr); } if( rc==SQLITE_OK ){ rc = pager_playback(pPager, 1); } if( rc!=SQLITE_OK ){ rc = pager_error(pPager, rc); goto failed; } } assert( (pPager->state==PAGER_SHARED) || (pPager->exclusiveMode && pPager->state>PAGER_SHARED) |
︙ | ︙ | |||
3776 3777 3778 3779 3780 3781 3782 | ** ** If noContent is true, it means that we do not care about the contents ** of the page. This occurs in two seperate scenarios: ** ** a) When reading a free-list leaf page from the database, and ** ** b) When a savepoint is being rolled back and we need to load | | | 3880 3881 3882 3883 3884 3885 3886 3887 3888 3889 3890 3891 3892 3893 3894 | ** ** If noContent is true, it means that we do not care about the contents ** of the page. This occurs in two seperate scenarios: ** ** a) When reading a free-list leaf page from the database, and ** ** b) When a savepoint is being rolled back and we need to load ** a new page into the cache to be filled with the data read ** from the savepoint journal. ** ** If noContent is true, then the data returned is zeroed instead of ** being read from the database. Additionally, the bits corresponding ** to pgno in Pager.pInJournal (bitvec of pages already written to the ** journal file) and the PagerSavepoint.pInSavepoint bitvecs of any open ** savepoints are set. This means if the page is made writable at any |
︙ | ︙ | |||
4208 4209 4210 4211 4212 4213 4214 4215 4216 4217 4218 4219 4220 4221 | u32 cksum; char *pData2; /* We should never write to the journal file the page that ** contains the database locks. The following assert verifies ** that we do not. */ assert( pPg->pgno!=PAGER_MJ_PGNO(pPager) ); CODEC2(pPager, pData, pPg->pgno, 7, return SQLITE_NOMEM, pData2); cksum = pager_cksum(pPager, (u8*)pData2); rc = write32bits(pPager->jfd, pPager->journalOff, pPg->pgno); if( rc==SQLITE_OK ){ rc = sqlite3OsWrite(pPager->jfd, pData2, pPager->pageSize, pPager->journalOff + 4); pPager->journalOff += pPager->pageSize+4; | > > | 4312 4313 4314 4315 4316 4317 4318 4319 4320 4321 4322 4323 4324 4325 4326 4327 | u32 cksum; char *pData2; /* We should never write to the journal file the page that ** contains the database locks. The following assert verifies ** that we do not. */ assert( pPg->pgno!=PAGER_MJ_PGNO(pPager) ); assert( pPager->journalHdr <= pPager->journalOff ); CODEC2(pPager, pData, pPg->pgno, 7, return SQLITE_NOMEM, pData2); cksum = pager_cksum(pPager, (u8*)pData2); rc = write32bits(pPager->jfd, pPager->journalOff, pPg->pgno); if( rc==SQLITE_OK ){ rc = sqlite3OsWrite(pPager->jfd, pData2, pPager->pageSize, pPager->journalOff + 4); pPager->journalOff += pPager->pageSize+4; |
︙ | ︙ |