SQLite Forum

sqlite-utils 3.0: CLI tool can now execute FTS seaches against a .db file
Login
I just released version 3.0 of my sqlite-utils combination command-line tool and Python library for maniuplating SQLite databases.

<https://sqlite-utils.readthedocs.io/en/stable/changelog.html#v3-0>

The big new feature in this release is the new `sqlite-utils search` command, which can be used on the command-line to run FTS searches against correctly configured tables and return results ordered by relevance.

Here's a demo, using a CSV file of film locations in San Francisco from <https://data.sfgov.org/Culture-and-Recreation/Film-Locations-in-San-Francisco/yitu-d5am>

First, to create a database from that CSV file:

    curl 'https://data.sfgov.org/api/views/yitu-d5am/rows.csv?accessType=DOWNLOAD' \
      | sqlite-utils insert sf.db locations - --csv

This creates a `sf.db` SQLite database containing a single `locations` table.

Next, enable full-text search on that table (for the `Title` and `Locations` columns):

    sqlite-utils enable-fts sf.db locations 'Title' 'Locations'

Now we can run searches!

    sqlite-utils search sf.db locations 'cable'                                          
    [{"rowid": 1215, "Title": "Play it Again, Sam", "Release Year": "1972", "Locations": "Hyde Street Cable Car", "Fun Facts": "", "Production Company": "Paramount Pictures", "Distributor": "Paramount Pictures", "Director": "Herbert Ross", "Writer": "Woody Allen", "Actor 1": "Woody Allen", "Actor 2": "Diane Keaton", "Actor 3": "Tony Roberts"},
     {"rowid": 2288, "Title": "Always Be My Maybe", "Release Year": "2019", "Locations": "California Cable Car Line", "Fun Facts": "", "Production Company": "Isla Productions, LLC", "Distributor": "Netflix", "Director": "Nahnatchka Khan", "Writer": "Michael Golamco, Randall Park, Ali Wong", "Actor 1": "Ali Wong", "Actor 2": "Randall Park", "Actor 3": "Keanu Reeves"}

Search results output as JSON by default, but we can request a more readable table format using the `-t` option. We can also specify the columns we want to see using `-c`.

    sqlite-utils search sf.db locations 'cable' -t -c Title -c Locations
    Title                          Locations
    -----------------------------  ---------------------------------------------------------
    Play it Again, Sam             Hyde Street Cable Car
    Always Be My Maybe             California Cable Car Line
    Play it Again, Sam             Hyde Street Cable Car
    Attack of the Killer Tomatoes  Hyde Street Cable Car
    Attack of the Killer Tomatoes  Hyde Street Cable Car
    The Bachelor                   Roof of Cable Car (California at Front Street)
    The Bachelor                   Roof of Cable Car (California at Front Street)
    Woman on the Run               Cable Car Signal Box (California Street at Powell Street)
    Woman on the Run               Cable Car Signal Box (California Street at Powell Street)

Full documentation for the new `sqlite-utils search` command can be found here: <https://sqlite-utils.readthedocs.io/en/stable/cli.html#cli-search>

One detail I'm particularly proud of: it works with both FTS5 and FTS4 tables, including sort by relevance! This was tricky because FTS5 has relevance sort built-in, but FTS4 requires you to provide your own scoring function. I'm using a scoring function I wrote in Python a couple of years ago which I described on my blog in [Exploring search relevance algorithms with SQLite](https://simonwillison.net/2019/Jan/7/exploring-search-relevance-algorithms-sqlite/).