
Overview
Comment: Add comment describing format of row and global size records.
SHA1: 7cfa40b5c1dad04fb959d418be4e76c8a584b506
User & Date: dan 2013-01-03 20:35:50.308
Context
2013-01-04
18:37 Allow an fts5 tokenizer to split a single document into multiple streams (i.e. sub-fields within a single column value). Modify the matchinfo APIs so that a ranking function may handle streams and/or columns separately or otherwise. (check-in: f3ac136843 user: dan tags: matchinfo)
2013-01-03
20:35 Add comment describing format of row and global size records. (check-in: 7cfa40b5c1 user: dan tags: matchinfo)
18:13 Fill in more of the matchinfo functions so that the BM25 function works. (check-in: 0e439483d7 user: dan tags: matchinfo)
Changes
Changes to src/fts5.c.
Old version (lines 10-31):

**
*************************************************************************
*/

#include "sqliteInt.h"
#include "vdbeInt.h"

/*
** The global count record is a set of N varints, where N is one greater
** than the number of columns in the indexed table. The first varint
** contains the number of records in the table. Each subsequent varint
** contains the total number of tokens stored in each column.
**
** The key used for the global record in the KV store is the root page
** number of the FTS index followed by a single 0x00 byte.
*/

/*
** Default distance value for NEAR operators.
*/
#define FTS5_DEFAULT_NEAR 10

New version (lines 10-69):
**
*************************************************************************
*/

#include "sqliteInt.h"
#include "vdbeInt.h"

/* 
** Stream numbers must be lower than this.
*/
#define SQLITE4_FTS5_NSTREAM 60

/*
** Records stored within the index:
**
** Row size record:
**   There is one "row size" record in the index for each row in the
**   indexed table. The "row size" record contains the number of tokens
**   in the associated row for each combination of a stream and column
**   number (i.e. contains the data required to find the number of
**   tokens associated with stream S present in column C of the row for
**   all S and C).
**
**   The key for the row size record is a single 0x00 byte followed by
**   a copy of the PK blob for the table row. 
**
**   The value is a series of varints. Each column of the table is
**   represented by one or more varints packed into the array.
**
**   If a column contains only stream 0 tokens, then it is represented
**   by a single varint - (nToken << 1), where nToken is the number of
**   stream 0 tokens stored in the column.
**
**   Or, if the column contains tokens from multiple streams, the first
**   varint contains a bitmask indicating which of the streams are present
**   (stored as ((bitmask << 1) | 0x01)). Following the bitmask is a
**   varint containing the number of tokens for each stream present, in
**   ascending order of stream number.
**
** Global size record:
**   There is a single "global size" record stored in the database. The
**   database key for this record is a single byte - 0x00.
**
**   The data for this record is a series of varint values. The first 
**   varint is the total number of rows in the table. The subsequent
**   varints make up a "row size" record containing the total number of
**   tokens for each S/C combination in all rows of the table.
**
** FTS index records:
**
**   The FTS index records implement the following mapping:
**
**       (token, document-pk) -> (list of instances)
*/

/*
** Default distance value for NEAR operators.
*/
#define FTS5_DEFAULT_NEAR 10
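
The per-column encoding described in the "Row size record" comment can be illustrated with a small sketch. This is not code from fts5.c: the helper name `encode_column` and the `aTok`/`aOut` arrays are hypothetical, and the sketch only produces the sequence of integer values for one column. Serialising each value as a varint is left to whatever varint routine the real code uses. It also assumes that a column with no tokens at all is written the same way as a stream-0-only column, as a single value of 0, which the comment does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copied from the new version of src/fts5.c: stream numbers are 0..59. */
#define SQLITE4_FTS5_NSTREAM 60

/*
** Hypothetical helper (not part of fts5.c). aTok[s] holds the number of
** tokens belonging to stream s in one column of one row. The values that
** describe the column are written to aOut[], which must have room for
** SQLITE4_FTS5_NSTREAM+1 entries. Returns the number of values written.
*/
static size_t encode_column(
  const uint64_t aTok[SQLITE4_FTS5_NSTREAM],
  uint64_t *aOut
){
  uint64_t mask = 0;              /* Bit s is set if stream s has tokens */
  size_t n = 0;                   /* Number of values written so far */
  int s;

  assert( aTok!=0 && aOut!=0 );
  for(s=0; s<SQLITE4_FTS5_NSTREAM; s++){
    if( aTok[s]>0 ) mask |= ((uint64_t)1 << s);
  }

  if( (mask & ~(uint64_t)1)==0 ){
    /* Only stream 0 is present (or, by assumption, the column is empty):
    ** a single value with the low bit clear. */
    aOut[n++] = aTok[0] << 1;
  }else{
    /* Multiple streams, or a single stream other than 0: the bitmask with
    ** the low bit set, then one count per stream present, in ascending
    ** order of stream number. */
    aOut[n++] = (mask << 1) | 0x01;
    for(s=0; s<SQLITE4_FTS5_NSTREAM; s++){
      if( mask & ((uint64_t)1 << s) ) aOut[n++] = aTok[s];
    }
  }
  return n;
}
```

For a column holding 7 tokens of stream 0 only, the sketch emits the single value 14. If the same column also holds 3 tokens of stream 2, it emits 11 (bitmask 0b101 shifted left by one, with the low bit set), followed by 7 and 3. With stream numbers below 60, the shifted bitmask still fits in a 64-bit integer. A reader of the record can tell the two forms apart by testing the low bit of the first value.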