History for ext/fts5/fts5_tokenize.c
2024-11-18
| ||
14:08 | [49aea8cc40] part of check-in [9b79b999d4] Fix an "applying zero offset to null pointer" UBSan error in the fts5 trigram tokenizer. (check-in: [9b79b999d4] user: dan branch: trunk, size: 40573) | |
2024-11-11
| ||
19:49 | [87ab719f05] part of check-in [84f4e37178] Fix the fts5 trigram tokenizer so that it handles non-nul-terminated strings. (check-in: [84f4e37178] user: dan branch: trunk, size: 40561) | |
2024-10-14
| ||
18:43 | [033e2e43b8] part of check-in [20e60bf058] Avoid the possibility of buffer overrun in the READ_UTF8 macro by using a less-than operator rather than not-equal-to. (check-in: [20e60bf058] user: drh branch: trunk, size: 40519) | |
2024-08-17
| ||
17:22 | [ae9c4fa931] part of check-in [8f9257361b] Add tests to restore coverage of fts5_tokenizer.c. (check-in: [8f9257361b] user: dan branch: fts5-locale, size: 40519) | |
2024-08-12
| ||
11:46 | [96efa85a21] part of check-in [3291ce3a33] Update the porter tokenizer to use locales. (check-in: [3291ce3a33] user: dan branch: fts5-locale, size: 40556) | |
2024-08-10
| ||
16:29 | [2321cbcef0] part of check-in [fc956353d3] Revision to check-in d9f726ade6b258f8 so that OOM and other unrelated failures are not overridden by a syntax error in the tokenizer spec. (check-in: [fc956353d3] user: drh branch: branch-3.46, size: 39895) | |
15:46 | [63ebe9057e] part of check-in [3778b2a9ca] Revision to check-in [d9f726ade6b258f8] so that OOM and other unrelated failures are not overridden by a syntax error in the tokenizer spec. (check-in: [3778b2a9ca] user: drh branch: trunk, size: 40240) | |
2024-08-06
| ||
22:54 | [b94826fc23] part of check-in [7a65ac42c2] Improved robustness of parsing of tokenize= arguments in FTS5. (check-in: [7a65ac42c2] user: drh branch: branch-3.46, size: 39844) | |
22:49 | [b1c3dc4de2] part of check-in [d9f726ade6] Improved robustness of parsing of tokenize= arguments in FTS5. Forum post 171bcc2bcd. (check-in: [d9f726ade6] user: drh branch: trunk, size: 40189) | |
2024-05-14
| ||
17:16 | [fa54930751] part of check-in [ca4fdcb8ae] Have fts5 tables delay initializing the tokenizer until it is first used in all cases where the tokenizer is not "trigram". (check-in: [ca4fdcb8ae] user: dan branch: fts5-delay-tokenizer, size: 40070) | |
2023-11-02
| ||
18:10 | [83cfcede38] part of check-in [8f046c82c9] Fix a problem with amalgamation builds on this branch. (check-in: [8f046c82c9] user: dan branch: fts5-trigram-diacritics, size: 39725) | |
17:31 | [5a895f3bf3] part of check-in [83da80135b] Add the "remove_diacritics" option to the fts5 trigram tokenizer. (check-in: [83da80135b] user: dan branch: fts5-trigram-diacritics, size: 39689) | |
2020-11-25
| ||
16:28 | [5e251efb0f] part of check-in [25d067c270] Fix harmless compiler warnings about unused function parameters. (check-in: [25d067c270] user: drh branch: trunk, size: 38035) | |
2020-10-03
| ||
14:36 | [6f47244681] part of check-in [b1d048748c] FTS5 does not handle tokens that contain embedded nul characters. Prevent the trigram tokenizer from returning such tokens. Fix for [2ba5930b2]. (check-in: [b1d048748c] user: dan branch: trunk, size: 37972) | |
2020-10-01
| ||
16:10 | [5711f17006] part of check-in [897ced99b4] Add tests for the trigram tokenizer. Fix minor issues. (check-in: [897ced99b4] user: dan branch: fts5-trigram, size: 37890) | |
2020-09-30
| ||
20:35 | [be911fbd2f] part of check-in [0d7810c1ae] Add experimental unicode-aware trigram tokenizer to fts5. And support for LIKE and GLOB optimizations for fts5 tables that use said tokenizer. (check-in: [0d7810c1ae] user: dan branch: fts5-trigram, size: 37304) | |
2020-07-29
| ||
16:18 | [4bbcf897c0] part of check-in [a80ae2c98b] Dozens and dozens of typo fixes in comments. This change adds no value to the end product and is disruptive, so it is questionable whether or not it will ever land on trunk. (check-in: [a80ae2c98b] user: drh branch: typos, size: 34630) | |
2019-04-13
| ||
04:38 | [2e508c6a3b] part of check-in [07ee06fd39] Use the 64-bit memory allocator interfaces in extensions, whenever possible. (check-in: [07ee06fd39] user: drh branch: trunk, size: 34630) | |
2019-01-08
| ||
20:02 | [4d904c2377] part of check-in [ca67f2ec0e] Use 64-bit math to compute the sizes of memory allocations in extensions. (check-in: [ca67f2ec0e] user: drh branch: trunk, size: 34628) | |
2018-12-31
| ||
21:43 | [8b7ef00cf0] part of check-in [b57c545a38] Fix harmless compiler warnings. (check-in: [b57c545a38] user: drh branch: trunk, size: 34557) | |
2018-12-28
| ||
14:33 | [d49f479ca1] part of check-in [c3a3a11194] Avoid an undefined left-shift operation in fts5 caused by malformed utf-8 text. (check-in: [c3a3a11194] user: dan branch: trunk, size: 34554) | |
07:37 | [240f849c91] part of check-in [c564bf8701] Fix problems in fts5 found by ASAN. (check-in: [c564bf8701] user: dan branch: trunk, size: 34559) | |
2018-12-03
| ||
16:14 | [ca2b6a0337] part of check-in [06177f3f11] Add the "remove_diacritics=2" option to the unicode61 tokenizer in both FTS5 and FTS3/4. (check-in: [06177f3f11] user: dan branch: trunk, size: 34549) | |
2018-07-13
| ||
19:52 | [ebd13d034f] part of check-in [80d2b9e635] Add the "categories" option to the unicode61 tokenizer in fts5. (check-in: [80d2b9e635] user: dan branch: trunk, size: 34058) | |
2016-02-11
| ||
17:01 | [2ce7b44183] part of check-in [bc3f7900d5] Handle parser stack overflow when parsing fts5 query expressions. Fix some compiler warnings in fts5 code. (check-in: [bc3f7900d5] user: dan branch: trunk, size: 33283) | |
2016-01-23
| ||
18:51 | [4d5c4f183c] part of check-in [72d53699bf] Fix an fts5 problem with using both xPhraseFirst() and xPhraseFirstColumn() within a single statement in detail=col mode. (check-in: [72d53699bf] user: dan branch: fts5-perf, size: 33170) | |
2015-12-23
| ||
16:42 | [504984ac69] part of check-in [5d44d4a6cf] Fix some harmless gcc compiler warnings. Mostly in fts5, but also two in the core code. (check-in: [5d44d4a6cf] user: dan branch: trunk, size: 33175) | |
2015-12-16
| ||
23:30 | [618efe033b] part of check-in [1d0e6aa119] Fix even more harmless compiler warnings. (check-in: [1d0e6aa119] user: mistachkin branch: msvcWarn, size: 33187) | |
2015-10-14
| ||
20:34 | [12c5d92528] part of check-in [1c46c194a2] Fix harmless compiler warnings. (check-in: [1c46c194a2] user: mistachkin branch: trunk, size: 33172) | |
2015-09-02
| ||
19:48 | [f380f46f34] part of check-in [bdedd838bb] Further tests to raise coverage of fts5 synonym code to 100%. Fix a dropped error code in the same. (check-in: [bdedd838bb] user: dan branch: fts5-incompatible, size: 33167) | |
2015-08-29
| ||
15:44 | [710541513e] part of check-in [fc71868496] Another change to the fts5 tokenizer API. (check-in: [fc71868496] user: dan branch: fts5-incompatible, size: 33174) | |
2015-08-28
| ||
19:56 | [07a894410b] part of check-in [90b85b42f2] Change the fts5 tokenizer API to allow more than one token to occupy a single position within a document. (check-in: [90b85b42f2] user: dan branch: fts5-incompatible, size: 33224) | |
2015-07-31
| ||
14:43 | [2836f6728b] part of check-in [c3c672af97] Fix a bug in the fts5 porter tokenizer preventing it from passing xCreate() arguments through to its parent tokenizer. (check-in: [c3c672af97] user: dan branch: trunk, size: 33071) | |
2015-07-02
| ||
15:52 | [30f97a8c74] part of check-in [7819002ed8] Remove "#ifdef SQLITE_ENABLE_FTS5" from individual fts5 source files. Add a single "#if !defined(SQLITE_CORE) || defined(SQLITE_ENABLE_FTS5)" to fts5.c. (check-in: [7819002ed8] user: dan branch: trunk, size: 32972) | |
2015-05-30
| ||
11:49 | [97251d68d7] part of check-in [e008c3c8e2] Remove the "#include sqlite3Int.h" from fts5Int.h. (check-in: [e008c3c8e2] user: dan branch: fts5, size: 33045) | |
2015-05-22
| ||
06:08 | [24649425ad] part of check-in [fea8a4db9d] Improve test coverage of fts5_unicode2.c. (check-in: [fea8a4db9d] user: dan branch: fts5, size: 33082) | |
2015-05-20
| ||
09:27 | [6f4d2cbe7e] part of check-in [0e91a6a520] Improve test coverage of fts5_tokenize.c. (check-in: [0e91a6a520] user: dan branch: fts5, size: 33260) | |
2015-05-19
| ||
11:32 | [4d9d504781] part of check-in [de9f8ef6eb] Fix a memory leak that could follow an OOM condition in fts5. (check-in: [de9f8ef6eb] user: dan branch: fts5, size: 33245) | |
2015-04-29
| ||
20:54 | [830eae0d35] part of check-in [c1f07a3aa9] Improve fts5 tests. (check-in: [c1f07a3aa9] user: dan branch: fts5, size: 33150) | |
2015-03-11
| ||
14:51 | [c07f2c2f74] part of check-in [f5db489250] Add an optimization to the fts5 unicode tokenizer code. (check-in: [f5db489250] user: dan branch: fts5, size: 33078) | |
2015-03-04
| ||
08:29 | [c3fe30914f] part of check-in [a5d5468c05] Fix a couple of build problems. (check-in: [a5d5468c05] user: dan branch: fts5, size: 33019) | |
2015-02-02
| ||
11:32 | [0d108148c2] part of check-in [fb10bbb9f9] Fix some problems with building fts5 and fts3 together using the amalgamation. (check-in: [fb10bbb9f9] user: dan branch: fts5, size: 32945) | |
2015-01-17
| ||
17:48 | [7c61d5c35c] part of check-in [96ea600440] Improve the performance of the fts5 porter tokenizer implementation. (check-in: [96ea600440] user: dan branch: fts5, size: 32932) | |
2015-01-12
| ||
17:58 | [bdb6a1f599] part of check-in [f22dbccad9] Optimize the unicode61 tokenizer so that it handles ascii text faster. Make it the default tokenizer. Change the name of the simple tokenizer to "ascii". (check-in: [f22dbccad9] user: dan branch: fts5, size: 24110) | |
2015-01-06
| ||
19:08 | [4c30cf32c6] part of check-in [65f0262fb8] Remove the iPos parameter from the tokenizer callback. Fix the "tokenchars" and "separators" options on the simple tokenizer. (check-in: [65f0262fb8] user: dan branch: fts5, size: 22293) | |
2015-01-01
| ||
16:46 | [5a0ad46408] part of check-in [d09f7800cf] Add a version of the unicode61 tokenizer to fts5. (check-in: [d09f7800cf] user: dan branch: fts5, size: 21276) | |
2014-12-29
| ||
11:24 | [5d6e785345] part of check-in [b33fe0dd89] Fixes to built-in tokenizers. (check-in: [b33fe0dd89] user: dan branch: fts5, size: 12914) | |
2014-11-15
| ||
20:07 | Added: [8360c0d1ae] part of check-in [fba0b5fc7e] Fix the customization interfaces so that they match the documentation. (check-in: [fba0b5fc7e] user: dan branch: fts5, size: 3561) | |