Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Update description of on-disk format in fts5_index.c. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | fts5-incompatible |
Files: | files | file ages | folders |
SHA1: |
85aac7b8b6731e2f6880b80cfd62d431 |
User & Date: | dan 2015-09-10 15:49:16.123 |
Context
2015-09-10
| ||
15:52 | Merge latest changes from trunk. Including fts5_expr.c fixes. (check-in: 716e7e7477 user: dan tags: fts5-incompatible) | |
15:49 | Update description of on-disk format in fts5_index.c. (check-in: 85aac7b8b6 user: dan tags: fts5-incompatible) | |
10:40 | Revert an accidentally committed makefile change. (check-in: 402704b13f user: dan tags: fts5-incompatible) | |
Changes
Changes to ext/fts5/fts5_index.c.
︙ | ︙ | |||
83 84 85 86 87 88 89 | ** ** Then, for each level from 0 to nMax: ** ** + number of input segments in ongoing merge. ** + total number of segments in level. ** + for each segment from oldest to newest: ** + segment id (always > 0) | < | | | > < | > > > > > > > > > > > > > > > > > | | | < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < | 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 | ** ** Then, for each level from 0 to nMax: ** ** + number of input segments in ongoing merge. ** + total number of segments in level. ** + for each segment from oldest to newest: ** + segment id (always > 0) ** + first leaf page number (often 1, always greater than 0) ** + final leaf page number ** ** 2. The Averages Record: ** ** A single record within the %_data table. The data is a list of varints. ** The first value is the number of rows in the index. Then, for each column ** from left to right, the total number of tokens in the column for all ** rows of the table. ** ** 3. Segment leaves: ** ** TERM/DOCLIST FORMAT: ** ** Most of each segment leaf is taken up by term/doclist data. The ** general format of term/doclist, starting with the first term ** on the leaf page, is: ** ** varint : size of first term ** blob: first term data ** doclist: first doclist ** zero-or-more { ** varint: number of bytes in common with previous term ** varint: number of bytes of new term data (nNew) ** blob: nNew bytes of new term data ** doclist: next doclist ** } ** ** doclist format: ** ** varint: first rowid ** poslist: first poslist ** zero-or-more { ** varint: rowid delta (always > 0) ** poslist: next poslist ** } ** ** poslist format: ** ** varint: size of poslist in bytes multiplied by 2, not including ** this field. Plus 1 if this entry carries the "delete" flag. ** collist: collist for column 0 ** zero-or-more { ** 0x01 byte ** varint: column number (I) ** collist: collist for column I ** } ** ** collist format: ** ** varint: first offset + 2 ** zero-or-more { ** varint: offset delta + 2 ** } ** ** PAGE FORMAT ** ** Each leaf page begins with a 4-byte header containing 2 16-bit ** unsigned integer fields in big-endian format. They are: ** ** * The byte offset of the first rowid on the page, if it exists ** and occurs before the first term (otherwise 0). ** ** * The byte offset of the start of the page footer. If the page ** footer is 0 bytes in size, then this field is the same as the ** size of the leaf page in bytes. ** ** The page footer consists of a single varint for each term located ** on the page. Each varint is the byte offset of the current term ** within the page, delta-compressed against the previous value. In ** other words, the first varint in the footer is the byte offset of ** the first term, the second is the byte offset of the second less that ** of the first, and so on. ** ** The term/doclist format described above is accurate if the entire ** term/doclist data fits on a single leaf page. If this is not the case, ** the format is changed in two ways: ** ** + if the first rowid on a page occurs before the first term, it ** is stored as a literal value: ** ** varint: first rowid ** ** + the first term on each page is stored in the same way as the ** very first term of the segment: ** ** varint : size of first term ** blob: first term data ** ** 5. Segment doclist indexes: ** ** Doclist indexes are themselves b-trees, however they usually consist of ** a single leaf record only. The format of each doclist index leaf page ** is: ** ** * Flags byte. Bits are: |
︙ | ︙ | |||
233 234 235 236 237 238 239 | /* ** Rowids for the averages and structure records in the %_data table. */ #define FTS5_AVERAGES_ROWID 1 /* Rowid used for the averages record */ #define FTS5_STRUCTURE_ROWID 10 /* The structure record */ /* | | < | | < < | < < < < < | < | < > | | | | 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 | /* ** Rowids for the averages and structure records in the %_data table. */ #define FTS5_AVERAGES_ROWID 1 /* Rowid used for the averages record */ #define FTS5_STRUCTURE_ROWID 10 /* The structure record */ /* ** Macros determining the rowids used by segment leaves and dlidx leaves ** and nodes. All nodes and leaves are stored in the %_data table with large ** positive rowids. ** ** Each segment has a unique non-zero 16-bit id. ** ** The rowid for each segment leaf is found by passing the segment id and ** the leaf page number to the FTS5_SEGMENT_ROWID macro. Leaves are numbered ** sequentially starting from 1. */ #define FTS5_DATA_ID_B 16 /* Max seg id number 65535 */ #define FTS5_DATA_DLI_B 1 /* Doclist-index flag (1 bit) */ #define FTS5_DATA_HEIGHT_B 5 /* Max dlidx tree height of 32 */ #define FTS5_DATA_PAGE_B 31 /* Max page number of 2147483648 */ #define fts5_dri(segid, dlidx, height, pgno) ( \ ((i64)(segid) << (FTS5_DATA_PAGE_B+FTS5_DATA_HEIGHT_B+FTS5_DATA_DLI_B)) + \ ((i64)(dlidx) << (FTS5_DATA_PAGE_B + FTS5_DATA_HEIGHT_B)) + \ ((i64)(height) << (FTS5_DATA_PAGE_B)) + \ ((i64)(pgno)) \ ) #define FTS5_SEGMENT_ROWID(segid, pgno) fts5_dri(segid, 0, 0, pgno) #define FTS5_DLIDX_ROWID(segid, height, pgno) fts5_dri(segid, 1, height, pgno) /* ** Maximum segments permitted in a single index */ #define FTS5_MAX_SEGMENT 2000 #ifdef SQLITE_DEBUG |
︙ | ︙ | |||
352 353 354 355 356 357 358 | /* ** The contents of the "structure" record for each index are represented ** using an Fts5Structure record in memory. Which uses instances of the ** other Fts5StructureXXX types as components. */ struct Fts5StructureSegment { int iSegid; /* Segment id */ | < | 320 321 322 323 324 325 326 327 328 329 330 331 332 333 | /* ** The contents of the "structure" record for each index are represented ** using an Fts5Structure record in memory. Which uses instances of the ** other Fts5StructureXXX types as components. */ struct Fts5StructureSegment { int iSegid; /* Segment id */ int pgnoFirst; /* First leaf page number in segment */ int pgnoLast; /* Last leaf page number in segment */ }; struct Fts5StructureLevel { int nMerge; /* Number of segments in incr-merge */ int nSeg; /* Total number of segments on level */ Fts5StructureSegment *aSeg; /* Array of segments. aSeg[0] is oldest. */ |
︙ | ︙ | |||
818 819 820 821 822 823 824 | p->rc = sqlite3_reset(p->pDeleter); } /* ** Remove all records associated with segment iSegid. */ static void fts5DataRemoveSegment(Fts5Index *p, int iSegid){ | | | | 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 | p->rc = sqlite3_reset(p->pDeleter); } /* ** Remove all records associated with segment iSegid. */ static void fts5DataRemoveSegment(Fts5Index *p, int iSegid){ i64 iFirst = FTS5_SEGMENT_ROWID(iSegid, 0); i64 iLast = FTS5_SEGMENT_ROWID(iSegid+1, 0)-1; fts5DataDelete(p, iFirst, iLast); if( p->pIdxDeleter==0 ){ Fts5Config *pConfig = p->pConfig; fts5IndexPrepareStmt(p, &p->pIdxDeleter, sqlite3_mprintf( "DELETE FROM '%q'.'%q_idx' WHERE segid=?", pConfig->zDb, pConfig->zName )); |
︙ | ︙ | |||
916 917 918 919 920 921 922 | nTotal * sizeof(Fts5StructureSegment) ); if( rc==SQLITE_OK ){ pLvl->nSeg = nTotal; for(iSeg=0; iSeg<nTotal; iSeg++){ i += fts5GetVarint32(&pData[i], pLvl->aSeg[iSeg].iSegid); | < | 883 884 885 886 887 888 889 890 891 892 893 894 895 896 | nTotal * sizeof(Fts5StructureSegment) ); if( rc==SQLITE_OK ){ pLvl->nSeg = nTotal; for(iSeg=0; iSeg<nTotal; iSeg++){ i += fts5GetVarint32(&pData[i], pLvl->aSeg[iSeg].iSegid); i += fts5GetVarint32(&pData[i], pLvl->aSeg[iSeg].pgnoFirst); i += fts5GetVarint32(&pData[i], pLvl->aSeg[iSeg].pgnoLast); } }else{ fts5StructureRelease(pRet); pRet = 0; } |
︙ | ︙ | |||
1073 1074 1075 1076 1077 1078 1079 | Fts5StructureLevel *pLvl = &pStruct->aLevel[iLvl]; fts5BufferAppendVarint(&p->rc, &buf, pLvl->nMerge); fts5BufferAppendVarint(&p->rc, &buf, pLvl->nSeg); assert( pLvl->nMerge<=pLvl->nSeg ); for(iSeg=0; iSeg<pLvl->nSeg; iSeg++){ fts5BufferAppendVarint(&p->rc, &buf, pLvl->aSeg[iSeg].iSegid); | < | 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 | Fts5StructureLevel *pLvl = &pStruct->aLevel[iLvl]; fts5BufferAppendVarint(&p->rc, &buf, pLvl->nMerge); fts5BufferAppendVarint(&p->rc, &buf, pLvl->nSeg); assert( pLvl->nMerge<=pLvl->nSeg ); for(iSeg=0; iSeg<pLvl->nSeg; iSeg++){ fts5BufferAppendVarint(&p->rc, &buf, pLvl->aSeg[iSeg].iSegid); fts5BufferAppendVarint(&p->rc, &buf, pLvl->aSeg[iSeg].pgnoFirst); fts5BufferAppendVarint(&p->rc, &buf, pLvl->aSeg[iSeg].pgnoLast); } } fts5DataWrite(p, FTS5_STRUCTURE_ROWID, buf.p, buf.n); fts5BufferFree(&buf); |
︙ | ︙ | |||
1470 1471 1472 1473 1474 1475 1476 | pIter->iLeafPgno++; if( pIter->pNextLeaf ){ assert( pIter->iLeafPgno<=pSeg->pgnoLast ); pIter->pLeaf = pIter->pNextLeaf; pIter->pNextLeaf = 0; }else if( pIter->iLeafPgno<=pSeg->pgnoLast ){ pIter->pLeaf = fts5DataRead(p, | | | 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 | pIter->iLeafPgno++; if( pIter->pNextLeaf ){ assert( pIter->iLeafPgno<=pSeg->pgnoLast ); pIter->pLeaf = pIter->pNextLeaf; pIter->pNextLeaf = 0; }else if( pIter->iLeafPgno<=pSeg->pgnoLast ){ pIter->pLeaf = fts5DataRead(p, FTS5_SEGMENT_ROWID(pSeg->iSegid, pIter->iLeafPgno) ); }else{ pIter->pLeaf = 0; } pLeaf = pIter->pLeaf; if( pLeaf ){ |
︙ | ︙ | |||
1693 1694 1695 1696 1697 1698 1699 | fts5DataRelease(pIter->pLeaf); pIter->pLeaf = 0; while( p->rc==SQLITE_OK && pIter->iLeafPgno>pIter->iTermLeafPgno ){ Fts5Data *pNew; pIter->iLeafPgno--; pNew = fts5DataRead(p, FTS5_SEGMENT_ROWID( | | | 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 | fts5DataRelease(pIter->pLeaf); pIter->pLeaf = 0; while( p->rc==SQLITE_OK && pIter->iLeafPgno>pIter->iTermLeafPgno ){ Fts5Data *pNew; pIter->iLeafPgno--; pNew = fts5DataRead(p, FTS5_SEGMENT_ROWID( pIter->pSeg->iSegid, pIter->iLeafPgno )); if( pNew ){ if( pIter->iLeafPgno==pIter->iTermLeafPgno ){ if( pIter->iTermLeafOffset<pNew->szLeaf ){ pIter->pLeaf = pNew; pIter->iLeafOffset = pIter->iTermLeafOffset; } |
︙ | ︙ | |||
1886 1887 1888 1889 1890 1891 1892 | Fts5DlidxIter *pDlidx = pIter->pDlidx; Fts5Data *pLast = 0; int pgnoLast = 0; if( pDlidx ){ int iSegid = pIter->pSeg->iSegid; pgnoLast = fts5DlidxIterPgno(pDlidx); | | | | 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 | Fts5DlidxIter *pDlidx = pIter->pDlidx; Fts5Data *pLast = 0; int pgnoLast = 0; if( pDlidx ){ int iSegid = pIter->pSeg->iSegid; pgnoLast = fts5DlidxIterPgno(pDlidx); pLast = fts5DataRead(p, FTS5_SEGMENT_ROWID(iSegid, pgnoLast)); }else{ int iOff; /* Byte offset within pLeaf */ Fts5Data *pLeaf = pIter->pLeaf; /* Current leaf data */ /* Currently, Fts5SegIter.iLeafOffset (and iOff) points to the first ** byte of position-list content for the current rowid. Back it up ** so that it points to the start of the position-list size field. */ pIter->iLeafOffset -= sqlite3Fts5GetVarintLen(pIter->nPos*2+pIter->bDel); /* If this condition is true then the largest rowid for the current ** term may not be stored on the current page. So search forward to ** see where said rowid really is. */ if( pIter->iEndofDoclist>=pLeaf->szLeaf ){ int pgno; Fts5StructureSegment *pSeg = pIter->pSeg; /* The last rowid in the doclist may not be on the current page. Search ** forward to find the page containing the last rowid. */ for(pgno=pIter->iLeafPgno+1; !p->rc && pgno<=pSeg->pgnoLast; pgno++){ i64 iAbs = FTS5_SEGMENT_ROWID(pSeg->iSegid, pgno); Fts5Data *pNew = fts5DataRead(p, iAbs); if( pNew ){ int iRowid, bTermless; iRowid = fts5LeafFirstRowidOff(pNew); bTermless = fts5LeafIsTermless(pNew); if( iRowid ){ SWAPVAL(Fts5Data*, pNew, pLast); |
︙ | ︙ | |||
2871 2872 2873 2874 2875 2876 2877 | xChunk(p, pCtx, pChunk, nChunk); nRem -= nChunk; fts5DataRelease(pData); if( nRem<=0 ){ break; }else{ pgno++; | | | 2836 2837 2838 2839 2840 2841 2842 2843 2844 2845 2846 2847 2848 2849 2850 | xChunk(p, pCtx, pChunk, nChunk); nRem -= nChunk; fts5DataRelease(pData); if( nRem<=0 ){ break; }else{ pgno++; pData = fts5DataRead(p, FTS5_SEGMENT_ROWID(pSeg->pSeg->iSegid, pgno)); if( pData==0 ) break; pChunk = &pData->p[4]; nChunk = MIN(nRem, pData->szLeaf - 4); if( pgno==pgnoSave ){ assert( pSeg->pNextLeaf==0 ); pSeg->pNextLeaf = pData; pData = 0; |
︙ | ︙ | |||
3176 3177 3178 3179 3180 3181 3182 | fts5WriteBtreeNoTerm(p, pWriter); }else{ /* Append the pgidx to the page buffer. Set the szLeaf header field. */ fts5BufferAppendBlob(&p->rc, &pPage->buf, pPage->pgidx.n, pPage->pgidx.p); } /* Write the page out to disk */ | | | 3141 3142 3143 3144 3145 3146 3147 3148 3149 3150 3151 3152 3153 3154 3155 | fts5WriteBtreeNoTerm(p, pWriter); }else{ /* Append the pgidx to the page buffer. Set the szLeaf header field. */ fts5BufferAppendBlob(&p->rc, &pPage->buf, pPage->pgidx.n, pPage->pgidx.p); } /* Write the page out to disk */ iRowid = FTS5_SEGMENT_ROWID(pWriter->iSegid, pPage->pgno); fts5DataWrite(p, iRowid, pPage->buf.p, pPage->buf.n); /* Initialize the next page. */ fts5BufferZero(&pPage->buf); fts5BufferZero(&pPage->pgidx); fts5BufferAppendBlob(&p->rc, &pPage->buf, 4, zero); pPage->iPrevPgidx = 0; |
︙ | ︙ | |||
3356 3357 3358 3359 3360 3361 3362 | /* ** Flush any data cached by the writer object to the database. Free any ** allocations associated with the writer. */ static void fts5WriteFinish( Fts5Index *p, Fts5SegWriter *pWriter, /* Writer object */ | < < < | 3321 3322 3323 3324 3325 3326 3327 3328 3329 3330 3331 3332 3333 3334 3335 3336 3337 3338 3339 3340 3341 3342 3343 3344 3345 3346 3347 3348 | /* ** Flush any data cached by the writer object to the database. Free any ** allocations associated with the writer. */ static void fts5WriteFinish( Fts5Index *p, Fts5SegWriter *pWriter, /* Writer object */ int *pnLeaf /* OUT: Number of leaf pages in b-tree */ ){ int i; Fts5PageWriter *pLeaf = &pWriter->writer; if( p->rc==SQLITE_OK ){ if( pLeaf->pgno==1 && pLeaf->buf.n==0 ){ *pnLeaf = 0; }else{ if( pLeaf->buf.n>4 ){ fts5WriteFlushLeaf(p, pWriter); } *pnLeaf = pLeaf->pgno-1; fts5WriteFlushBtree(p, pWriter); } } fts5BufferFree(&pLeaf->term); fts5BufferFree(&pLeaf->buf); fts5BufferFree(&pLeaf->pgidx); fts5BufferFree(&pWriter->btterm); |
︙ | ︙ | |||
3451 3452 3453 3454 3455 3456 3457 | }else{ int iOff = pSeg->iTermLeafOffset; /* Offset on new first leaf page */ i64 iLeafRowid; Fts5Data *pData; int iId = pSeg->pSeg->iSegid; u8 aHdr[4] = {0x00, 0x00, 0x00, 0x00}; | | | 3413 3414 3415 3416 3417 3418 3419 3420 3421 3422 3423 3424 3425 3426 3427 | }else{ int iOff = pSeg->iTermLeafOffset; /* Offset on new first leaf page */ i64 iLeafRowid; Fts5Data *pData; int iId = pSeg->pSeg->iSegid; u8 aHdr[4] = {0x00, 0x00, 0x00, 0x00}; iLeafRowid = FTS5_SEGMENT_ROWID(iId, pSeg->iTermLeafPgno); pData = fts5DataRead(p, iLeafRowid); if( pData ){ fts5BufferZero(&buf); fts5BufferGrow(&p->rc, &buf, pData->nn); fts5BufferAppendBlob(&p->rc, &buf, sizeof(aHdr), aHdr); fts5BufferAppendVarint(&p->rc, &buf, pSeg->term.n); fts5BufferAppendBlob(&p->rc, &buf, pSeg->term.n, pSeg->term.p); |
︙ | ︙ | |||
3479 3480 3481 3482 3483 3484 3485 | fts5BufferAppendBlob(&p->rc, &buf, pData->nn - pSeg->iPgidxOff, &pData->p[pSeg->iPgidxOff] ); } fts5DataRelease(pData); pSeg->pSeg->pgnoFirst = pSeg->iTermLeafPgno; | | | 3441 3442 3443 3444 3445 3446 3447 3448 3449 3450 3451 3452 3453 3454 3455 | fts5BufferAppendBlob(&p->rc, &buf, pData->nn - pSeg->iPgidxOff, &pData->p[pSeg->iPgidxOff] ); } fts5DataRelease(pData); pSeg->pSeg->pgnoFirst = pSeg->iTermLeafPgno; fts5DataDelete(p, FTS5_SEGMENT_ROWID(iId, 1), iLeafRowid); fts5DataWrite(p, iLeafRowid, buf.p, buf.n); } } } fts5BufferFree(&buf); } |
︙ | ︙ | |||
3599 3600 3601 3602 3603 3604 3605 | /* Append the position-list data to the output */ fts5ChunkIterate(p, pSegIter, (void*)&writer, fts5MergeChunkCallback); } /* Flush the last leaf page to disk. Set the output segment b-tree height ** and last leaf page number at the same time. */ | | | 3561 3562 3563 3564 3565 3566 3567 3568 3569 3570 3571 3572 3573 3574 3575 | /* Append the position-list data to the output */ fts5ChunkIterate(p, pSegIter, (void*)&writer, fts5MergeChunkCallback); } /* Flush the last leaf page to disk. Set the output segment b-tree height ** and last leaf page number at the same time. */ fts5WriteFinish(p, &writer, &pSeg->pgnoLast); if( fts5MultiIterEof(p, pIter) ){ int i; /* Remove the redundant segments from the %_data table */ for(i=0; i<nInput; i++){ fts5DataRemoveSegment(p, pLvl->aSeg[i].iSegid); |
︙ | ︙ | |||
3790 3791 3792 3793 3794 3795 3796 | pStruct = fts5StructureRead(p); iSegid = fts5AllocateSegid(p, pStruct); if( iSegid ){ const int pgsz = p->pConfig->pgsz; Fts5StructureSegment *pSeg; /* New segment within pStruct */ | < | 3752 3753 3754 3755 3756 3757 3758 3759 3760 3761 3762 3763 3764 3765 | pStruct = fts5StructureRead(p); iSegid = fts5AllocateSegid(p, pStruct); if( iSegid ){ const int pgsz = p->pConfig->pgsz; Fts5StructureSegment *pSeg; /* New segment within pStruct */ Fts5Buffer *pBuf; /* Buffer in which to assemble leaf page */ Fts5Buffer *pPgidx; /* Buffer in which to assemble pgidx */ const u8 *zPrev = 0; Fts5SegWriter writer; fts5WriteInit(p, &writer, iSegid); |
︙ | ︙ | |||
3893 3894 3895 3896 3897 3898 3899 | /* TODO2: Doclist terminator written here. */ /* pBuf->p[pBuf->n++] = '\0'; */ assert( pBuf->n<=pBuf->nSpace ); zPrev = (const u8*)zTerm; sqlite3Fts5HashScanNext(pHash); } sqlite3Fts5HashClear(pHash); | | < | 3854 3855 3856 3857 3858 3859 3860 3861 3862 3863 3864 3865 3866 3867 3868 3869 3870 3871 3872 3873 3874 3875 3876 3877 3878 | /* TODO2: Doclist terminator written here. */ /* pBuf->p[pBuf->n++] = '\0'; */ assert( pBuf->n<=pBuf->nSpace ); zPrev = (const u8*)zTerm; sqlite3Fts5HashScanNext(pHash); } sqlite3Fts5HashClear(pHash); fts5WriteFinish(p, &writer, &pgnoLast); /* Update the Fts5Structure. It is written back to the database by the ** fts5StructureRelease() call below. */ if( pStruct->nLevel==0 ){ fts5StructureAddLevel(&p->rc, &pStruct); } fts5StructureExtendLevel(&p->rc, pStruct, 0, 1, 0); if( p->rc==SQLITE_OK ){ pSeg = &pStruct->aLevel[0].aSeg[ pStruct->aLevel[0].nSeg++ ]; pSeg->iSegid = iSegid; pSeg->pgnoFirst = 1; pSeg->pgnoLast = pgnoLast; pStruct->nSegment++; } fts5StructurePromote(p, 0, pStruct); } |
︙ | ︙ | |||
4910 4911 4912 4913 4914 4915 4916 | int iLast ){ int i; /* Now check that the iter.nEmpty leaves following the current leaf ** (a) exist and (b) contain no terms. */ for(i=iFirst; p->rc==SQLITE_OK && i<=iLast; i++){ | | | 4870 4871 4872 4873 4874 4875 4876 4877 4878 4879 4880 4881 4882 4883 4884 | int iLast ){ int i; /* Now check that the iter.nEmpty leaves following the current leaf ** (a) exist and (b) contain no terms. */ for(i=iFirst; p->rc==SQLITE_OK && i<=iLast; i++){ Fts5Data *pLeaf = fts5DataRead(p, FTS5_SEGMENT_ROWID(pSeg->iSegid, i)); if( pLeaf ){ if( !fts5LeafIsTermless(pLeaf) ) p->rc = FTS5_CORRUPT; if( i>=iNoRowid && 0!=fts5LeafFirstRowidOff(pLeaf) ) p->rc = FTS5_CORRUPT; } fts5DataRelease(pLeaf); if( p->rc ) break; } |
︙ | ︙ | |||
5001 5002 5003 5004 5005 5006 5007 | const char *zIdxTerm = (const char*)sqlite3_column_text(pStmt, 1); int iIdxLeaf = sqlite3_column_int(pStmt, 2); int bIdxDlidx = sqlite3_column_int(pStmt, 3); /* If the leaf in question has already been trimmed from the segment, ** ignore this b-tree entry. Otherwise, load it into memory. */ if( iIdxLeaf<pSeg->pgnoFirst ) continue; | | | 4961 4962 4963 4964 4965 4966 4967 4968 4969 4970 4971 4972 4973 4974 4975 | const char *zIdxTerm = (const char*)sqlite3_column_text(pStmt, 1); int iIdxLeaf = sqlite3_column_int(pStmt, 2); int bIdxDlidx = sqlite3_column_int(pStmt, 3); /* If the leaf in question has already been trimmed from the segment, ** ignore this b-tree entry. Otherwise, load it into memory. */ if( iIdxLeaf<pSeg->pgnoFirst ) continue; iRow = FTS5_SEGMENT_ROWID(pSeg->iSegid, iIdxLeaf); pLeaf = fts5DataRead(p, iRow); if( pLeaf==0 ) break; /* Check that the leaf contains at least one term, and that it is equal ** to or larger than the split-key in zIdxTerm. Also check that if there ** is also a rowid pointer within the leaf page header, it points to a ** location before the term. */ |
︙ | ︙ | |||
5056 5057 5058 5059 5060 5061 5062 | for(pDlidx=fts5DlidxIterInit(p, 0, iSegid, iIdxLeaf); fts5DlidxIterEof(p, pDlidx)==0; fts5DlidxIterNext(p, pDlidx) ){ /* Check any rowid-less pages that occur before the current leaf. */ for(iPg=iPrevLeaf+1; iPg<fts5DlidxIterPgno(pDlidx); iPg++){ | | | | 5016 5017 5018 5019 5020 5021 5022 5023 5024 5025 5026 5027 5028 5029 5030 5031 5032 5033 5034 5035 5036 5037 5038 5039 5040 5041 | for(pDlidx=fts5DlidxIterInit(p, 0, iSegid, iIdxLeaf); fts5DlidxIterEof(p, pDlidx)==0; fts5DlidxIterNext(p, pDlidx) ){ /* Check any rowid-less pages that occur before the current leaf. */ for(iPg=iPrevLeaf+1; iPg<fts5DlidxIterPgno(pDlidx); iPg++){ iKey = FTS5_SEGMENT_ROWID(iSegid, iPg); pLeaf = fts5DataRead(p, iKey); if( pLeaf ){ if( fts5LeafFirstRowidOff(pLeaf)!=0 ) p->rc = FTS5_CORRUPT; fts5DataRelease(pLeaf); } } iPrevLeaf = fts5DlidxIterPgno(pDlidx); /* Check that the leaf page indicated by the iterator really does ** contain the rowid suggested by the same. */ iKey = FTS5_SEGMENT_ROWID(iSegid, iPrevLeaf); pLeaf = fts5DataRead(p, iKey); if( pLeaf ){ i64 iRowid; int iRowidOff = fts5LeafFirstRowidOff(pLeaf); ASSERT_SZLEAF_OK(pLeaf); if( iRowidOff>=pLeaf->szLeaf ){ p->rc = FTS5_CORRUPT; |
︙ | ︙ | |||
5274 5275 5276 5277 5278 5279 5280 | for(iLvl=0; iLvl<p->nLevel; iLvl++){ Fts5StructureLevel *pLvl = &p->aLevel[iLvl]; sqlite3Fts5BufferAppendPrintf(pRc, pBuf, " {lvl=%d nMerge=%d nSeg=%d", iLvl, pLvl->nMerge, pLvl->nSeg ); for(iSeg=0; iSeg<pLvl->nSeg; iSeg++){ Fts5StructureSegment *pSeg = &pLvl->aSeg[iSeg]; | | < | | 5234 5235 5236 5237 5238 5239 5240 5241 5242 5243 5244 5245 5246 5247 5248 5249 | for(iLvl=0; iLvl<p->nLevel; iLvl++){ Fts5StructureLevel *pLvl = &p->aLevel[iLvl]; sqlite3Fts5BufferAppendPrintf(pRc, pBuf, " {lvl=%d nMerge=%d nSeg=%d", iLvl, pLvl->nMerge, pLvl->nSeg ); for(iSeg=0; iSeg<pLvl->nSeg; iSeg++){ Fts5StructureSegment *pSeg = &pLvl->aSeg[iSeg]; sqlite3Fts5BufferAppendPrintf(pRc, pBuf, " {id=%d leaves=%d..%d}", pSeg->iSegid, pSeg->pgnoFirst, pSeg->pgnoLast ); } sqlite3Fts5BufferAppendPrintf(pRc, pBuf, "}"); } } /* |
︙ | ︙ | |||
5511 5512 5513 5514 5515 5516 5517 | if( nArg==0 ){ sqlite3_result_error(pCtx, "should be: fts5_rowid(subject, ....)", -1); }else{ zArg = (const char*)sqlite3_value_text(apVal[0]); if( 0==sqlite3_stricmp(zArg, "segment") ){ i64 iRowid; int segid, height, pgno; | | | < | | | | < < | 5470 5471 5472 5473 5474 5475 5476 5477 5478 5479 5480 5481 5482 5483 5484 5485 5486 5487 5488 5489 5490 5491 5492 5493 5494 5495 5496 | if( nArg==0 ){ sqlite3_result_error(pCtx, "should be: fts5_rowid(subject, ....)", -1); }else{ zArg = (const char*)sqlite3_value_text(apVal[0]); if( 0==sqlite3_stricmp(zArg, "segment") ){ i64 iRowid; int segid, height, pgno; if( nArg!=3 ){ sqlite3_result_error(pCtx, "should be: fts5_rowid('segment', segid, pgno))", -1 ); }else{ segid = sqlite3_value_int(apVal[1]); pgno = sqlite3_value_int(apVal[2]); iRowid = FTS5_SEGMENT_ROWID(segid, pgno); sqlite3_result_int64(pCtx, iRowid); } }else{ sqlite3_result_error(pCtx, "first arg to fts5_rowid() must be 'segment'" , -1 ); } } } /* ** This is called as part of registering the FTS5 module with database |
︙ | ︙ |
Changes to ext/fts5/test/fts5aa.test.
︙ | ︙ | |||
47 48 49 50 51 52 53 | } do_execsql_test 2.1 { INSERT INTO t1 VALUES('a b c', 'd e f'); } do_test 2.2 { execsql { SELECT fts5_decode(id, block) FROM t1_data WHERE id==10 } | | | 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 | } do_execsql_test 2.1 { INSERT INTO t1 VALUES('a b c', 'd e f'); } do_test 2.2 { execsql { SELECT fts5_decode(id, block) FROM t1_data WHERE id==10 } } {/{{structure} {lvl=0 nMerge=0 nSeg=1 {id=[0123456789]* leaves=1..1}}}/} foreach w {a b c d e f} { do_execsql_test 2.3.$w.asc { SELECT rowid FROM t1 WHERE t1 MATCH $w; } {1} do_execsql_test 2.3.$w.desc { SELECT rowid FROM t1 WHERE t1 MATCH $w ORDER BY rowid DESC; |
︙ | ︙ |
Changes to ext/fts5/test/fts5rowid.test.
︙ | ︙ | |||
23 24 25 26 27 28 29 | do_catchsql_test 1.1 { SELECT fts5_rowid() } {1 {should be: fts5_rowid(subject, ....)}} do_catchsql_test 1.2 { SELECT fts5_rowid('segment') | | | | | | 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 | do_catchsql_test 1.1 { SELECT fts5_rowid() } {1 {should be: fts5_rowid(subject, ....)}} do_catchsql_test 1.2 { SELECT fts5_rowid('segment') } {1 {should be: fts5_rowid('segment', segid, pgno))}} do_execsql_test 1.3 { SELECT fts5_rowid('segment', 1, 1) } {137438953473} do_catchsql_test 1.4 { SELECT fts5_rowid('nosucharg'); } {1 {first arg to fts5_rowid() must be 'segment'}} #------------------------------------------------------------------------- # Tests of the fts5_decode() function. # reset_db do_execsql_test 2.1 { |
︙ | ︙ |
Changes to main.mk.
︙ | ︙ | |||
328 329 330 331 332 333 334 | $(TOP)/ext/misc/series.c \ $(TOP)/ext/misc/spellfix.c \ $(TOP)/ext/misc/totype.c \ $(TOP)/ext/misc/wholenumber.c \ $(TOP)/ext/misc/vfslog.c \ $(TOP)/ext/fts5/fts5_tcl.c \ $(TOP)/ext/fts5/fts5_test_mi.c \ | > > > | | 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 | $(TOP)/ext/misc/series.c \ $(TOP)/ext/misc/spellfix.c \ $(TOP)/ext/misc/totype.c \ $(TOP)/ext/misc/wholenumber.c \ $(TOP)/ext/misc/vfslog.c \ $(TOP)/ext/fts5/fts5_tcl.c \ $(TOP)/ext/fts5/fts5_test_mi.c \ $(FTS5_SRC) # fts5.c #TESTSRC += $(TOP)/ext/fts2/fts2_tokenizer.c #TESTSRC += $(TOP)/ext/fts3/fts3_tokenizer.c TESTSRC2 = \ |
︙ | ︙ |