Difference between MATCH('x') and MATCH('x*')
(1) By Spaced Cowboy (SpacedCowboy) on 2020-08-16 18:19:42 [link] [source]
I'm a bit confused over exactly what the difference is between the two queries of an FTS index:
sql = "SELECT rowid FROM fts WHERE fts MATCH ('x*')"; sql = "SELECT rowid FROM fts WHERE fts MATCH ('x')";
.. where 'x' is a single character. I'm getting very different results,
tsql> select rowid from fts where fts match ('x'); Time taken: 0.024171 secs, rows=6383 tsql> select rowid from fts where fts match ('x*'); Time taken: 0.147986 secs, rows=19496
So clearly it is different, but I'm not really seeing why - from the description in 3.3, it seems that 'x' will match anything that 'x*' does, apart from perhaps words ending in x (therefore not a prefix), but 'x*' is matching more rows than 'x'.
(2) By RandomCoder on 2020-08-16 19:03:06 in reply to 1 [source]
I think the key you're missing from the documentation is that FTS generally operates on tokens:
sqlite> create virtual table example using fts5(content); sqlite> insert into example(content) values('x'); sqlite> insert into example(content) values('xother'); sqlite> select * from example where content match 'x'; x sqlite> select * from example where content match 'x*'; x xother sqlite>
The first select query only finds 'x' because that's the only entry with just that token, the other entry doesn't have the token 'x', rather it has the token 'xother'.
The second select query asks for all entries with any token that starts with 'x', so it finds both entries since they both have a token that starts with 'x'.
(3) By Spaced Cowboy (SpacedCowboy) on 2020-08-17 17:10:50 in reply to 2 [link] [source]
I think the confusion stemmed from the snippet() function, which highlights any occurrence of 'x' in the string, not just the ones that the query matches. So I'm seeing highlighting of the X in 'my xylophone is broken' as well as 'x marks the spot', even when searching for just 'x'