SQLite Forum

fts5 does not treat left half ring character as token
I think the unicode61 tokenizer only strips diacritical marks from letters. The left half-ring character is a letter, not a mark, so it stays part of the token. I have no idea whether Unicode has this right or wrong, but it is what it is. You may want to handle your special cases yourself before feeding the text to FTS5. Do you know how this works in Solr, for example?
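To illustrate the preprocessing idea, here is a minimal Python sketch. It assumes the character in question is U+02BF (MODIFIER LETTER LEFT HALF RING), which Unicode classifies as a modifier letter (category Lm) rather than a combining mark. The `SPECIAL_CASES` set and the `preprocess_for_fts` helper are hypothetical names made up for this example, not part of SQLite or FTS5.

```python
import unicodedata

# Hypothetical set of characters to drop before indexing (an assumption for
# this sketch; adjust it to whatever your data actually contains).
SPECIAL_CASES = {
    "\u02bf",  # MODIFIER LETTER LEFT HALF RING
    "\u02be",  # MODIFIER LETTER RIGHT HALF RING
}


def preprocess_for_fts(text: str) -> str:
    """Remove the special-case characters from text before it is indexed."""
    return "".join(ch for ch in text if ch not in SPECIAL_CASES)


# The half ring is a letter (Lm), while a combining accent is a mark (Mn).
half_ring_category = unicodedata.category("\u02bf")
acute_category = unicodedata.category("\u0301")

# Example: the leading half ring is removed, the rest is left untouched.
cleaned = preprocess_for_fts("\u02bfAli")
```

The same function would need to be applied to the search terms as well as to the text being inserted, otherwise a query that still contains the half ring would not match the cleaned index.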