SQLite Forum

FTS5: calculating column weight dynamically?
What i'm about to ask isn't an attempt to solve a specific problem; i'm just curious whether it's possible at all (if it is, i *may* apply it some rainy day)...

[My website](https://wanderinghorse.net) uses, unsurprisingly, sqlite's FTS5 for its site search feature. One of the interesting aspects of FTS5 is the ability to give more or less weight to any given content column when searching by using the `bm25` function, as demonstrated in this snippet from my site's search script:

<https://fossil.wanderinghorse.net/r/www-wh/file?udc=1&ln=28-33&ci=tip&name=cgi-bin%2Fsearch.s2>

In short, what that does is give more weight to search terms which are found in the URL path than it does to those same terms found in page-level content.
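For readers who don't want to follow the link, here is a minimal sketch of column weighting with `bm25()`. It is not the site's actual script: the `page` table, its two columns, the sample rows, and the weights 10.0 and 1.0 are all made up for illustration, and it assumes Python's `sqlite3` module was built with FTS5 enabled.

```python
import sqlite3

# Hypothetical two-column index; the real site's schema differs.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE page USING fts5(path, content)")
db.executemany(
    "INSERT INTO page(path, content) VALUES (?, ?)",
    [
        ("/news", "this week a short note about painting"),
        ("/gaming/painting", "painting miniatures step by step"),
        ("/about", "contact details and site history"),
        ("/software", "assorted source code projects"),
        ("/gaming", "board games and tabletop roleplaying"),
    ],
)


def search(term, path_weight=10.0, content_weight=1.0):
    """Return (path, score) rows, best match first.

    bm25() takes one weight per column, in table-column order.
    Its scores are negative and better matches are *more* negative,
    so a plain ascending ORDER BY puts the best match on top.
    """
    sql = """
        SELECT path, bm25(page, ?, ?) AS score
        FROM page
        WHERE page MATCH ?
        ORDER BY score
    """
    return db.execute(sql, (path_weight, content_weight, term)).fetchall()


for path, score in search("painting"):
    print(path, score)
```

With these made-up rows, the page whose path contains the term sorts ahead of the page that only mentions it in its content.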

Now i'm wondering whether this can be taken to *The Next Level* by modifying the weight based on the level of the site's directory hierarchy at which the search term is found. The idea is that terms found on deeply-nested pages should weigh more than those same terms found on higher-up pages, under the presumption that they are more specific matches (because that pattern applies, generally speaking, to the whole site).

For example, searching for "painting" might find a hit on...

- 1st level: a mention in the content of the `/news` page.
- 2nd level: a URI path element (`/gaming/painting`) and several mentions in page content.
- 3rd level: as for (2), at `/gaming/painting/airbrush`, plus more page-level mentions on the airbrush page.

Ideally, assuming that the content-level mentions of (2) and (3) are roughly equivalent, they would be weighed such that their relative weights put (3) as the heaviest, (2) as the next one, and (1) in distant last place.

Is such a dynamic interpretation of FTS weights at least *hypothetically* possible?

Something along the lines of:

weight for each hit = hit in page content (1 * directory level) + hit in uri path element (3 * directory level)

:-?
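To make the arithmetic concrete, here is a small worked sketch of that formula in plain Python, outside of FTS5 entirely. It only shows what the numbers would look like, not how (or whether) FTS5 could be made to compute them. The helper names and the per-page content hit counts are hypothetical, and "directory level" is assumed to mean the number of non-empty path elements.

```python
def directory_level(path):
    """Count non-empty path elements: '/news' is 1, '/gaming/painting' is 2."""
    return len([part for part in path.split("/") if part])


def hit_weight(path, term, content_hits, content_factor=1, path_factor=3):
    """Apply the formula above to one page.

    content_hits is the number of times the term appears in the page
    content (a made-up input here); path hits are counted from the path.
    """
    level = directory_level(path)
    path_hits = sum(1 for part in path.lower().split("/") if part == term.lower())
    return (content_hits * content_factor * level
            + path_hits * path_factor * level)


# Hypothetical hit counts for the three example pages.
pages = [
    ("/news", 1),
    ("/gaming/painting", 4),
    ("/gaming/painting/airbrush", 4),
]

ranked = sorted(
    ((hit_weight(path, "painting", hits), path) for path, hits in pages),
    reverse=True,
)
for weight, path in ranked:
    print(weight, path)
```

Under these assumptions the weights come out as 21, 14 and 1, which is the desired ordering: (3) heaviest, (2) next, and (1) in distant last place.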

For completeness's sake, though these *probably* aren't necessary for the discussion:

- [the site's FTS schema](https://fossil.wanderinghorse.net/r/www-wh/file?name=site-tools/fts-pages.sql&ci=tip)
- [the script which updates FTS](https://fossil.wanderinghorse.net/r/www-wh/file?name=site-tools/fts-pages.s2&ci=tip)
- [the makefile bit which drives that script](https://fossil.wanderinghorse.net/r/www-wh/file?udc=1&ln=368-400&ci=tip&name=Makefile)