/ History for ext/fts1/fts1.c

[0b225ee6] part of check-in [a80ae2c9] Dozens and dozens of typo fixes in comments. This change adds no value to the end product and is disruptive, so it is questionable whether or not it will ever land on trunk. (check-in: [a80ae2c9] user: drh branch: typos, size: 101992)
[a39f7d21] part of check-in [34eb6911] Fix uses of ctype functions (ex: isspace()) on signed characters in test programs and in some obscure extensions. No changes to the core. (check-in: [34eb6911] user: drh branch: trunk, size: 101988)
[f7739dc3] part of check-in [f2ab8747] Modify several extensions to use the new exported function naming. Fix some shared library compilation issues. (check-in: [f2ab8747] user: mistachkin branch: extRefactor, size: 101943)
[3e7b253e] part of check-in [6404afa0] Corrected typos and misspellings. Ticket #3702. (CVS 6336) (check-in: [6404afa0] user: shane branch: trunk, size: 101910)
[2ecd182d] part of check-in [062bf5d4] Remove all instances of sprintf() from the FTS modules. Ticket #3049. (CVS 4996) (check-in: [062bf5d4] user: drh branch: trunk, size: 101909)
[b964a4e7] part of check-in [4e91a267] Change all instances of "it's" in comments to either "its" or "it is", as appropriate, in case the comments are ever again read by a pedantic grammarian. Ticket #2840. (CVS 4629) (check-in: [4e91a267] user: drh branch: trunk, size: 101884)
[878951b8] part of check-in [f94cdcfd] Do not require SQLITE_ENABLE_BROKEN_FTS2 if FTS2 is not enabled. The same for FTS1. Ticket #2777. (CVS 4556) (check-in: [f94cdcfd] user: drh branch: trunk, size: 101882)
[e45ff77a] part of check-in [fec6567a] Drop the forced error from fts3.c and add forced errors to fts2.c and fts1.c. (CVS 4427) (check-in: [fec6567a] user: shess branch: trunk, size: 101808)
[d07c6eeb] part of check-in [eee02502] Fix memory leak reported by an fts1 user. Was losing a doclist on a query error. (CVS 4347) (check-in: [eee02502] user: shess branch: trunk, size: 100860)
[61cce595] part of check-in [3f9a6661] Fix ticket #2439: the FTS1 and FTS2 extensions use the non-standard, unportable and highly deprecated <malloc.h> header on all platforms except Apple Mac OS X. <malloc.h> is never actually required on any OS with an at least partly POSIX-conforming API, as malloc(3) and friends have officially lived in <stdlib.h> for over 10 years. On some platforms, such as FreeBSD, including <malloc.h> has for a few years even triggered an "#error" and thus a build failure. So get rid of the <malloc.h> usage in the FTS1 and FTS2 extensions entirely and use only <stdlib.h> there. (CVS 4191) (check-in: [3f9a6661] user: rse branch: trunk, size: 100825)
[d32c3202] part of check-in [febf75f0] Implement xRename() for fts1 so that it is possible to rename fts1 tables. See http://www.sqlite.org/cvstrac/chngview?cn=4143 (CVS 4184) (check-in: [febf75f0] user: shess branch: trunk, size: 100882)
[61fa4154] part of check-in [f9020cff] Replicates http://www.sqlite.org/cvstrac/chngview?cn=4151 which modified fts2:

Modify handling of SQLITE_SCHEMA in fts2 code. An SQLITE_SCHEMA error may cause SQLite to reload the internal schema, deleting and recreating v-table objects. So the sqlite3_vtab structure can be deleted out from under a v-table implementation. (CVS 4183) (check-in: [f9020cff] user: shess branch: trunk, size: 100359)

[f9294f39] part of check-in [5db25e36] Sorry, previous check-in included a last-minute "Did it really work?" change :-). (CVS 4182) (check-in: [5db25e36] user: shess branch: trunk, size: 100411)
[cb12e675] part of check-in [c2ba3cc0] Apply change 4095 to fts1. Fix snippet generation when the left-most column of an fts table is used in the MATCH clause. Fix for ticket #2429. (CVS 4181) (check-in: [c2ba3cc0] user: shess branch: trunk, size: 100439)
[b51a4e27] part of check-in [3be2a6d1] Allow the use of MySQL-style quoting in the FTS modules. Ticket #2446. (CVS 4119) (check-in: [3be2a6d1] user: drh branch: trunk, size: 100394)
[1ee986c3] part of check-in [81be7290] Fix crash in delete when existing row has null fields. Previous code assumed that the row had values in all columns, sigh. Fixes bug http://www.sqlite.org/cvstrac/tktview?tn=2289 . (CVS 3833) (check-in: [81be7290] user: shess branch: trunk, size: 100380)
[7585d9cb] part of check-in [f6c3abdc] Don't call ctype functions on hi-bit chars. Some platforms raise assertions when this occurs, and it's almost certainly not the right thing to do in the first place. (CVS 3746) (check-in: [f6c3abdc] user: shess branch: trunk, size: 100213)
[0aab3cf2] part of check-in [283385d2] http://www.sqlite.org/cvstrac/tktview?tn=2219

When creating fts tables in an attached database, the backing tables are created in database 'main'. This change propagates the appropriate database name to the routines which build sql statements.

Note that I propagate the database name and table name separately. I briefly considered just making the table name be "db.table", but it didn't fit so well in the model used to store the table name and other information, and having the db name passed separately seemed a bit more transparent. (CVS 3631) (check-in: [283385d2] user: shess branch: trunk, size: 99576)

[37b91a1a] part of check-in [4f2ab4b6] http://www.sqlite.org/cvstrac/tktview?tn=2166,35

Calling UPDATE against an fts table in a UTF-16 database inserts corrupted data into the database. The UTF-8 data is being inserted directly. This appears to happen because sqlite3_value_text() destructively coerces a value to UTF-8, and it's never converted back when updating the table. This works around the problem by rearranging things so that the update happens before the coercion. (CVS 3596) (check-in: [4f2ab4b6] user: shess branch: trunk, size: 99070)

[18906d1e] part of check-in [08c2cc0e] Drop a couple variables which are no longer used anywhere. (CVS 3524) (check-in: [08c2cc0e] user: shess branch: trunk, size: 99070)
[a6b6a7b5] part of check-in [18142fdb] http://www.sqlite.org/cvstrac/tktview?tn=2046

The virtual table interface allows for a cursor to field multiple xFilter() calls. For instance, if a join is done with a virtual table, there could be a call for each row which potentially matches. Unfortunately, fulltextFilter() assumes that it has a fresh cursor, and overwrites a prepared statement and a malloc'ed pointer, resulting in unfinalized statements and a memory leak.

This change hacks the code to manually clean up offending items in fulltextFilter(), emphasis on "hacks", since it's a fragile fix insofar as future additions to fulltext_cursor could continue to have the problem. (CVS 3521) (check-in: [18142fdb] user: shess branch: trunk, size: 99090)

[78218fb0] part of check-in [9628a61a] Allow backing tables to be missing on dropping fts table. Fixes http://www.sqlite.org/cvstrac/tktview?tn=1992,35 . (CVS 3509) (check-in: [9628a61a] user: shess branch: trunk, size: 98665)
[10ab0ca2] part of check-in [5e8bbb85] Fix leaky symbols. With this change, fts1 and fts2 can both be statically linked. (CVS 3472) (check-in: [5e8bbb85] user: shess branch: trunk, size: 98613)
[fc8ce471] part of check-in [144e3f11] Fix incorrect doclist initialization in term_select_all(). docListRestrictColumn() generates a DL_POSITIONS doclist, which means that after the first doclist is processed, the second doclist is initialized as DL_POSITIONS, but with DL_POSITIONS_OFFSETS data. (Note that DL_DEFAULT is now DL_POSITIONS, which masks this bug.) (CVS 3467) (check-in: [144e3f11] user: shess branch: trunk, size: 98536)
[37ba4a1c] part of check-in [6cf1fb9f] The snippet generator adds ellipsis between text from different columns. (CVS 3465) (check-in: [6cf1fb9f] user: drh branch: trunk, size: 98539)
[6eca4867] part of check-in [df1a4b48] Make DL_POSITION the default mode in FTS1. Remove the need to compile with SQLITE_CORE when SQLITE_ENABLE_FTS1 is used. (CVS 3462) (check-in: [df1a4b48] user: drh branch: trunk, size: 98672)
[302d4fa0] part of check-in [fdcea7b1] Add the option to omit offset information from posting lists in FTS1. (CVS 3456) (check-in: [fdcea7b1] user: drh branch: trunk, size: 98577)
[4f6278a6] part of check-in [936b06aa] Add a Porter stemmer option to the FTS1 module. (CVS 3452) (check-in: [936b06aa] user: drh branch: trunk, size: 97920)
[bad8872d] part of check-in [8cdf1d6a] Fix a bug in the handling of the OR operator in FTS1. Test cases added to prevent a repeat. (CVS 3450) (check-in: [8cdf1d6a] user: drh branch: trunk, size: 97823)
[197909c5] part of check-in [0934d220] More snippet generator improvements and test cases. (CVS 3449) (check-in: [0934d220] user: drh branch: trunk, size: 97450)
[8531a2a8] part of check-in [d3f4ae82] Bug fix in the FTS1 snippet generator. Improvements in the way the snippet generator handles whitespace. (CVS 3448) (check-in: [d3f4ae82] user: drh branch: trunk, size: 97423)
[5cb7829d] part of check-in [165645d3] Avoid segfaults when inserting NULL values into FTS1. (CVS 3447) (check-in: [165645d3] user: drh branch: trunk, size: 96753)
[6efbbd6f] part of check-in [757fa224] Implemented UPDATE for full-text tables.

We handle an UPDATE to a row by performing an UPDATE on the content table and by building new position lists for each term which appears in either the old or new versions of the row. We write these position lists all at once; this is presumably more efficient than a delete followed by an insert (which would first write empty position lists, then new position lists). (CVS 3434) (check-in: [757fa224] user: adamd branch: trunk, size: 96723)

[16f58a0c] part of check-in [111ca616] When gathering a doclist for querying, don't discard empty position lists until the end; this allows empty position lists to override non-empty lists encountered later in the gathering process. This fixes #1982, which was caused by the fact that for all-column queries we weren't discarding empty position lists at all. (CVS 3433) (check-in: [111ca616] user: adamd branch: trunk, size: 93394)
[02c5b614] part of check-in [c7ee60d0] Implementation of the snippet() function for FTS1. Includes a few simple test cases but more testing is needed. (CVS 3431) (check-in: [c7ee60d0] user: drh branch: trunk, size: 92812)
[50770451] part of check-in [bb2e1871] Fixed a build problem in sqlite3_extension_init(). (CVS 3430) (check-in: [bb2e1871] user: adamd branch: trunk, size: 88311)
[c33206af] part of check-in [efa8fb32] Convert all names to lower case before sending them to the xFindFunction method of a virtual table. In FTS1, use strcmp instead of strcasecmp. Ticket #1981. (CVS 3428) (check-in: [efa8fb32] user: drh branch: trunk, size: 88341)
[298a1b77] part of check-in [5e35dc1f] Modify FTS1 so that the "magic" column has the same name as the virtual table. Offsets are retrieved using a special "offsets" function whose first argument is the magic column. Snippets will ultimately be retrieved in the same way. (CVS 3427) (check-in: [5e35dc1f] user: drh branch: trunk, size: 88349)
[10d0c351] part of check-in [aa7728f9] Add the sqlite3_overload_function() API - part of the virtual table interface. (CVS 3426) (check-in: [aa7728f9] user: drh branch: trunk, size: 86703)
[ff2b92dd] part of check-in [5a18dd88] Fix an initialization problem in FTS1. Ticket #1977. (CVS 3424) (check-in: [5a18dd88] user: drh branch: trunk, size: 85695)
[9fcad5a1] part of check-in [f25cfa1a] The FTS1 tables have a new automatic column named "offset" that returns a string containing byte offset information for all matching terms. Also added a large test case based on SQLite mailing list entries. (CVS 3417) (check-in: [f25cfa1a] user: drh branch: trunk, size: 85694)
[997a09e0] part of check-in [607d928c] In FTS1: Retain the Query structure as part of the cursor. It will be used later as part of snippet generation. (CVS 3414) (check-in: [607d928c] user: drh branch: trunk, size: 78409)
[65aaeb02] part of check-in [fca59281] Minor code cleanup in FTS1. (CVS 3412) (check-in: [fca59281] user: drh branch: trunk, size: 77260)
[b5d7a61a] part of check-in [820634f7] Implementation of "column:" modifiers in FTS1 queries. (CVS 3411) (check-in: [820634f7] user: drh branch: trunk, size: 76772)
[de9c9027] part of check-in [adb780e0] Module spec parser enhancements for FTS1. Now able to cope with column names in the spec that are SQL keywords or have special characters, etc. Also added support for additional control lines. Column names can be followed by a type specifier (which is ignored.) (CVS 3410) (check-in: [adb780e0] user: drh branch: trunk, size: 75477)
[dc11410c] part of check-in [528036c8] Fix the FTS1 test cases and add new tests. Comments added to the FTS1 code. (CVS 3409) (check-in: [528036c8] user: drh branch: trunk, size: 66536)
[bbca0688] part of check-in [366a70b0] Allow virtual tables to contain multiple full-text-indexed columns. Added a magic column "_all" which can be used for querying all columns in a table at once.

For now, each posting list stores position/offset information for multiple columns. We may implement separate posting lists for separate columns at some future point. (CVS 3408) (check-in: [366a70b0] user: adamd branch: trunk, size: 65307)

[9ba2598d] part of check-in [877d5558] Answer queries for a particular rowid in a full-text table by looking up that rowid directly rather than by performing a table scan. (CVS 3407) (check-in: [877d5558] user: adamd branch: trunk, size: 63833)
[5c5e362e] part of check-in [2f5f6290] Re-use deleted rowids for new segments. This has a somewhat surprising impact on performance, I believe because it keeps the index smaller (by keeping rowids smaller), and also because it improves locality in the table (deleting a row means we've already touched the pages leading to that rowid). (CVS 3405) (check-in: [2f5f6290] user: shess branch: trunk, size: 63299)
[022a985b] part of check-in [227dc3fe] Add a rudimentary tokenizer and parser to FTS1 for parsing the module arguments during initialization. Recognized arguments include a tokenizer selector and a list of virtual table columns. (CVS 3403) (check-in: [227dc3fe] user: drh branch: trunk, size: 62677)
[a0f9600c] part of check-in [f44b8bae] Add pzErr parameters to the xConnect and xCreate methods of virtual tables in order to provide better error reporting. This is an interface change for virtual tables. Prior virtual table implementations will need to be modified and recompiled. (CVS 3402) (check-in: [f44b8bae] user: drh branch: trunk, size: 56612)
[36a33f0d] part of check-in [70bcff02] Add some simple test cases for the OR and NOT logic of the fts1 module. Fix lots of bugs discovered while developing these test cases. (CVS 3400) (check-in: [70bcff02] user: drh branch: trunk, size: 56522)
[c212d1a3] part of check-in [ae502657] Add support for OR and NOT terms in fts1. (CVS 3399) (check-in: [ae502657] user: drh branch: trunk, size: 56322)
[9197a418] part of check-in [b6b93a33] Write doclists using a segmented technique to amortize costs better. New items for a term are merged with the term's segment 0 doclist, until that doclist exceeds CHUNK_MAX. Then the segments are merged in exponential fashion, so that segment 1 contains approximately 2*CHUNK_MAX data, segment 2 4*CHUNK_MAX, and so on. (CVS 3398) (check-in: [b6b93a33] user: shess branch: trunk, size: 49415)
[a17d32e4] part of check-in [55a03b96] A minor change to fts1.c to fix broken build. (CVS 3393) (check-in: [55a03b96] user: adamd branch: trunk, size: 47294)
[e4742aa2] part of check-in [d4923e98] Add a TRACE macro to the FTS1 module for troubleshooting. Turned off by default. (CVS 3388) (check-in: [d4923e98] user: drh branch: trunk, size: 47276)
[c8532f13] part of check-in [098cbafc] Convert static variables into constants in the FTS module. (CVS 3385) (check-in: [098cbafc] user: drh branch: trunk, size: 46710)
[98f1b10b] part of check-in [e98b0cf2] Miscellaneous restructuring and cleanup based on suggestions from shess. (CVS 3382) (check-in: [e98b0cf2] user: adamd branch: trunk, size: 46680)
[6ac8a4d6] part of check-in [5844db1a] Make fts1.c not rely on nul-terminated strings. Mostly a matter of making sure we always pass around ptr/len, but there were a few places where we actually relied on nul-termination.

An earlier change had additionally changed appropriate sqlite3_bind_text() calls to sqlite3_bind_blob(). I've found that this changes what's actually stored in the database, so backed those changes out. Also (and this is weird), I found that I could no longer do straight-forward = queries against %_term.term at a command-line. (CVS 3379) (check-in: [5844db1a] user: shess branch: trunk, size: 45799)

[8c6e1f62] part of check-in [e1891f0d] Refactor the FTS1 module so that its name is "fts1" instead of "fulltext", so that all symbols with external linkage begin with "sqlite3Fts1", and so that all filenames begin with "fts1". (CVS 3377) (check-in: [e1891f0d] user: drh branch: trunk, size: 45362) Added
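
Two of the check-ins listed above ([34eb6911] and [f6c3abdc]) deal with ctype functions such as isspace() being called on signed or hi-bit characters. The sketch below shows one common way to guard such calls. It is not taken from fts1.c: the helper names safe_isspace() and safe_isalnum() are hypothetical, and the choice to treat every byte with the high bit set as "not in the class" is an assumption made for illustration.

```c
#include <ctype.h>

/* Hypothetical helper, not part of fts1.c.
** Returns 1 if c is whitespace in the current locale, otherwise 0.
** Bytes with the high bit set are rejected before ctype is consulted,
** and the argument is converted to unsigned char so that a negative
** value is never passed to isspace(). */
static int safe_isspace(char c){
  return (c & 0x80)==0 && isspace((unsigned char)c);
}

/* Hypothetical helper, same guard applied to isalnum(). */
static int safe_isalnum(char c){
  return (c & 0x80)==0 && isalnum((unsigned char)c);
}
```

The conversion to unsigned char matters because the ctype functions are only defined for arguments representable as unsigned char or equal to EOF. On platforms where plain char is signed, a byte such as 0xE9 would otherwise arrive as a negative int.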