SQLite Forum

Did the Variable opcode change recently?
Login

Did the Variable opcode change recently?

(1) By Simon Willison (simonw) on 2024-06-11 21:01:52 [link] [source]

Does SQLite 3.46.0 include a change to the Variable opcode?

In previous SQLite versions that opcode would reveal the name of any query :parameters in the query - for example:

explain select * from sqlite_master where tbl_name = :table

Would return this (truncated a bit):

addr	opcode	p1	p2	p3	p4	p5	comment
10 	ResultRow 	3 	5 	0 	  	0 	 
11 	Next 	0 	3 	0 	  	1 	 
12 	Halt 	0 	0 	0 	  	0 	 
13 	Transaction 	0 	0 	0 	0 	1 	 
14 	Variable 	1 	2 	0 	:table 	0 	 
15 	Goto 	0 	1 	0 	  	0 	
Note that handy :table value in the p4 column there.

But... against SQLite 3.46.0 I get this instead:

addr	opcode	p1	p2	p3	p4	p5	comment
10 	ResultRow 	3 	5 	0 	  	0 	 
11 	Next 	0 	3 	0 	  	1 	 
12 	Halt 	0 	0 	0 	  	0 	 
13 	Transaction 	0 	0 	35 	0 	1 	 
14 	Variable 	1 	2 	0 	  	0 	 
15 	Goto 	0 	1 	0 	  	0 	
Note that the p4 column there is now null.

Now... I know that opcodes are NOT part of the stable, documented interface to SQLite - so I did a bad thing by writing code against them.

But having this as a mechanism for extracting the named parameters from a SQL query was so useful! Without that I'm going to have to go back to running an ugly regular expression.

Ideally I'd love for this particular piece of Opcode magic to become stable and documented, or failing that I'd love a documented, supported (and available to Python) SQLite mechanism for answering the question "given this string of SQL what are the named parameters within it?"

Here's my buggy code that this broke: https://github.com/simonw/datasette/blob/7437d40e5dd4d614bb769e16c0c1b96c6c19647f/datasette/utils/__init__.py#L1137-L1150 - issue here: https://github.com/simonw/datasette/issues/2353

(2) By ddevienne on 2024-06-12 07:06:46 in reply to 1 [link] [source]

Hi. Prepare the statement, get the parameter count, then ask for the name of each index?

(3) By Simon Willison (simonw) on 2024-06-12 16:34:04 in reply to 2 [link] [source]

Sadly I don't think there's a way for me to do that with the Python SQLite bindings.

(5) By Bo Lindbergh (_blgl_) on 2024-06-12 22:15:34 in reply to 3 [link] [source]

# Warning: horrible code.  It's not intended to ever be used;
# instead, you should attach it when requesting an enhancement
# of Python's SQLite interface.

import sqlite3
import re

scraper = re.compile(r"You did not supply a value for binding parameter .(.+)\.")

def scrape_param_names(conn, sql):
    names = []
    bindings = {}
    sql = "explain "+sql
    while True:
        try:
            conn.execute(sql, bindings)
            break
        except sqlite3.ProgrammingError as e:
            match = scraper.fullmatch(str(e))
            if match:
                name = match.group(1)
                names.append(name)
                bindings[name] = None
                continue
            raise
    return tuple(names)

(6) By Simon Willison (simonw) on 2024-06-12 22:28:27 in reply to 5 [link] [source]

Haha, that is a delightfully devious trick!

I'm loathe to rely on the specific format of error message, purely because I expect that's not a documented, stable API either and I just got burned by the opcode.

(4) By Richard Hipp (drh) on 2024-06-12 17:17:57 in reply to 1 [source]

The check-in for this change was dd5977c9a8a418be.

The P4 parameter to OP_Variable was not being used for anything. By omitting it, we make the prepared statement slightly smaller, reduce the size of the SQLite library by a few bytes, and help sqlite3_prepare() and similar run slightly faster.

Long, long ago, the P4 parameter to OP_Variable was in fact used. But that ceased to be the case back in 2016. I don't know why it took us so long to notice this and omit the P4 parameter. I just asked Dan and he does not remember how he noticed it.

(7.1) By Simon Willison (simonw) on 2024-06-13 20:24:31 edited from 7.0 in reply to 1 [link] [source]

I ended up fixing this by switching to a less crafty regular expression based mechanism:

_single_line_comment_re = re.compile(r"--.*")
_multi_line_comment_re = re.compile(r"/\*.*?\*/", re.DOTALL)
_single_quote_re = re.compile(r"'(?:''|[^'])*'")
_double_quote_re = re.compile(r'"(?:\"\"|[^"])*"')
_named_param_re = re.compile(r":(\w+)")


def named_parameters(sql: str) -> List[str]:
    """
    Given a SQL statement, return a list of named parameters
    that are used in the statement
    e.g. for ``select * from foo where id=:id``
    this would return ``["id"]``
    """
    sql = _single_line_comment_re.sub("", sql)
    sql = _multi_line_comment_re.sub("", sql)
    sql = _single_quote_re.sub("", sql)
    sql = _double_quote_re.sub("", sql)
    # Extract parameters from what is left
    return _named_param_re.findall(sql)