SQLite Forum

fts5 does not treat left half ring character as token
Hello, I was reading the [documentation](https://www.sqlite.org/fts5.html).

It says that the default value of the `categories` option for the unicode61 tokenizer is:
```
The default value is "L* N* Co"
```

The left half-ring character (ʿ) is listed [here](https://www.compart.com/en/unicode/category/Lm) in the `Lm` category, which should be covered by the default `L*` pattern.
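For anyone who wants to double-check the category claim locally, here is a minimal sketch using Python's standard `unicodedata` module. It assumes Python's bundled Unicode database agrees with the tables compiled into SQLite, which I have not verified. The helper names `general_category` and `in_default_token_classes` are made up for this post and are not part of any SQLite API.

```python
import unicodedata

# U+02BF, the left half-ring character used in the names below.
HALF_RING = "\u02bf"


def general_category(ch: str) -> str:
    """Return the two-letter Unicode general category of a single character."""
    return unicodedata.category(ch)


def in_default_token_classes(ch: str) -> bool:
    """Hypothetical helper: does the category match the pattern "L* N* Co"?

    This only mirrors the documented default string; it is not SQLite's
    actual tokenizer logic.
    """
    cat = general_category(ch)
    return cat[0] in ("L", "N") or cat == "Co"


if __name__ == "__main__":
    print(unicodedata.name(HALF_RING), general_category(HALF_RING))
    print("matches L* N* Co:", in_default_token_classes(HALF_RING))
```

If this reports `Lm`, then by the documented default the character should count as part of a token rather than as a separator.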

However, when I run the following commands, the match fails:

```
SQLite version 3.32.3 2020-06-18 14:00:33
Enter ".help" for usage hints.
sqlite> CREATE TABLE "entities" (id INTEGER PRIMARY KEY, display_name TEXT);
sqlite> INSERT INTO entities (display_name) VALUES ('ʿAbd al-Muḥsin al-ʿAbbād');
sqlite> INSERT INTO entities (display_name) VALUES ('ʿAbd al-Raḥmān ʿAjjāl al-Lībī');
sqlite> CREATE VIRTUAL TABLE fts_idx USING fts5(display_name, content='entities', content_rowid='id');
sqlite> INSERT INTO fts_idx(fts_idx) VALUES('rebuild');
sqlite> SELECT * FROM fts_idx WHERE fts_idx MATCH 'Ajjal';

```

Including the half-ring character in the search term does find the row:
```
sqlite> SELECT * FROM fts_idx WHERE fts_idx MATCH 'ʿAjjal';
ʿAbd al-Raḥmān ʿAjjāl al-Lībī
sqlite> 
```

This seems like a bug. Am I doing something wrong?