SQLite Forum

fts5 does not treat left half ring character as token
If the character is in Unicode general category "Lm" (modifier letter), then fts5 treats it as a token character (i.e. a letter). So the search term must include it as well.
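For anyone who wants to check the category themselves, here is a small sketch using Python's standard `unicodedata` module. It assumes the character in question is U+02BF, and the helper name `is_modifier_letter` is made up for illustration. Note this reads Python's bundled Unicode tables, not the tables fts5 uses internally, so treat it as a sanity check only.

```python
import unicodedata

def is_modifier_letter(ch: str) -> bool:
    """Return True if the single character ch has general category 'Lm'."""
    return unicodedata.category(ch) == "Lm"

# Assumption: the "left half ring" from the thread title is U+02BF.
left_half_ring = "\u02bf"

print(unicodedata.name(left_half_ring), unicodedata.category(left_half_ring))
# prints: MODIFIER LETTER LEFT HALF RING Lm
```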

I'm not literate in whichever language this is, but at first glance that does look sub-optimal. Maybe the character should be stripped out before tokenization or something. It would be good to create a tokenizer that does better. The tricky bit is where to get the data: there are thousands of characters in Unicode and we need to categorize them all. Suggestions welcome!
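As one possible workaround in the meantime, the application could strip the character from both the text it indexes and the query string before either reaches fts5. The sketch below is only that: the names `STRIP_CHARS`, `strip_for_fts` and `modifier_letters` are hypothetical, and the choice of which characters to strip is an assumption. Stripping every "Lm" character is probably too aggressive, since the category also covers characters that matter in other scripts. The second helper shows one place the category data could come from, namely the tables shipped with Python.

```python
import sys
import unicodedata

# Assumption: only the two half-ring modifier letters are dropped.
# U+02BE is the right half ring, U+02BF the left half ring.
STRIP_CHARS = frozenset({"\u02be", "\u02bf"})

def strip_for_fts(text: str, strip_chars=STRIP_CHARS) -> str:
    """Remove the chosen characters; apply to indexed text and queries alike."""
    return "".join(ch for ch in text if ch not in strip_chars)

def modifier_letters() -> list:
    """List every code point that Python's Unicode tables classify as 'Lm'."""
    return [
        chr(cp)
        for cp in range(sys.maxunicode + 1)
        if unicodedata.category(chr(cp)) == "Lm"
    ]
```

With this, a term that starts with the left half ring and the same term without it both reduce to the same string, so a query typed either way would match, provided the stored text went through the same function.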