Possible to see how many opcodes were executed for a query?
(1) By Simon Willison (simonw) on 2022-03-20 16:42:36 [link] [source]
I'm experimenting with the mechanism described here for limiting the number of opcodes that a query is allowed to execute before it is terminated.
Is there a way to answer the question "how many virtual machine opcodes did the last query execute?" - I'd like to use something like that to help me pick a sensible limit that would cover reasonable queries but terminate unreasonable ones.
(2) By Keith Medcalf (kmedcalf) on 2022-03-20 17:47:33 in reply to 1 [link] [source]
Yes, there are many. Are you using the API or the CLI?
(3) By Keith Medcalf (kmedcalf) on 2022-03-20 17:54:49 in reply to 2 [source]
Python 3.9.10 (tags/v3.9.10:f2f3f53, Jan 17 2022, 15:14:21) [MSC v.1929 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import apsw >>> db = apsw.Connection() >>> for row in db.execute('select value from generate_series where start=1 and stop=2;'): print(row) ... Row(value=1) Row(value=2) >>> for row in db.execute('select * from sqlite_stmt'): print(row) ... Row(sql='select * from sqlite_stmt', ncol=11, ro=1, busy=1, nscan=0, nsort=0, naidx=0, nstep=0, reprep=0, run=1, mem=7276) Row(sql='select value from generate_series where start=1 and stop=2;', ncol=1, ro=1, busy=0, nscan=0, nsort=0, naidx=0, nstep=23, reprep=0, run=1, mem=2232) >>>
If using the CLI, simply turn on statistics:
SQLite version 3.39.0 2022-03-19 03:28:05 Enter ".help" for usage hints. Connected to a transient in-memory database. Use ".open FILENAME" to reopen on a persistent database. sqlite> .stats on sqlite> select value from generate_series where start=1 and stop=2; ┌───────┐ │ value │ ├───────┤ │ 1 │ │ 2 │ └───────┘ Memory Used: 274951 (max 306089) bytes Number of Outstanding Allocations: 1853 (max 1901) Number of Pcache Overflow Bytes: 94888 (max 94888) bytes Largest Allocation: 65544 bytes Largest Pcache Allocation: 4360 bytes Lookaside Slots Used: 33 (max 94) Successful lookaside attempts: 1208 Lookaside failures due to size: 12 Lookaside failures due to OOM: 0 Pager Heap Usage: 92687 bytes Page cache hits: 67 Page cache misses: 14 Page cache writes: 0 Page cache spills: 0 Schema Heap Usage: 82148 bytes Statement Heap/Lookaside Usage: 5392 bytes Fullscan Steps: 0 Sort Operations: 0 Autoindex Inserts: 0 Virtual Machine Steps: 23 Reprepare operations: 0 Number of times run: 1 Memory used by prepared stmt: 5392 sqlite>
(4) By Simon Willison (simonw) on 2022-03-20 18:21:14 in reply to 3 [link] [source]
Thanks, this is exactly what I needed - I'm using Python with the standard library
sqlite3 package, but the
.stats output in the
sqlite3 CLI works well for me too.
(5.3) By Keith Medcalf (kmedcalf) on 2022-03-20 19:45:21 edited from 5.2 in reply to 4 [link] [source]
The sqlite_stmt vtab will work with the standard pysqlite/sqlite3 interface but I believe you will need to compile and load it as an extension or build a custom sqlite3.dll with it included.
You have to query sqlite_stmt before finalizing the statement (or, since pysqlite2/sqlite3 does not finalize the statements (they are cached) the sqlite_stmt vtab will return a row for each "open" statement (ie, in the cache whether currently executing or not).
The CLI .stats command merely returns the statement statistics (and others such as memory usage) after the query completes but before it is finalized and discarded.
Edited to note the underlying C API is described here: https://sqlite.org/c3ref/stmt_status.html
Edited to add see also for the sqlite_stmt vtab: https://sqlite.org/stmt.html
Edited to note that .stats can take on
|vmstep as arguments (see .help stats in the CLI) to determine exactly what you want to see. .stats vmstep will show just the number of virtual machine steps (instructions, basically) that executed.
(6) By Simon Willison (simonw) on 2022-03-21 19:47:17 in reply to 5.3 [link] [source]
Do you know if VM-steps as reported by
.stats vmstep are the same thing as virtual machine instructions as counted by the
N parameter to sqlite3_progress_handler()?
I ask because I'm getting a surprising difference from the following:
sqlite> .stats vmstep sqlite> with recursive counter(x) as ( ...> select 0 ...> union ...> select x + 1 from counter ...> ) ...> select * from counter limit 10000; ... VM-steps: 200007
That seems to indicate the query is running 200,007 steps. But if I try the same thing in Python by counting calls to a progress handler:
import sqlite3 db = sqlite3.connect(":memory:") i = 0 def count(): global i i += 1 db.set_progress_handler(count, 1) db.execute(""" with recursive counter(x) as ( select 0 union select x + 1 from counter ) select * from counter limit 10000; """) print(i)
I get back a result of
24 - so my
count() function was only called 24 times for the same DB query.
(7) By David Raymond (dvdraymond) on 2022-03-21 19:58:57 in reply to 6 [link] [source]
Don't forget to step through the results. All .execute() has done is prepared it. db.execute(query).fetchall() or for record in db.execute(query): pass
(8) By Simon Willison (simonw) on 2022-03-21 21:27:55 in reply to 7 [link] [source]
That was it! Mystery solved, thank you.
(9) By Keith Medcalf (kmedcalf) on 2022-03-21 21:30:50 in reply to 7 [link] [source]
or, of course, db.executescript(query)
.executescript is syntactic sugar for
for row in db.execute(query): pass
and merely returns the cursor object on which the query(s) executed while discarding all results.
(10.1) By Keith Medcalf (kmedcalf) on 2022-03-21 21:57:54 edited from 10.0 in reply to 7 [link] [source]
.execute does a prepare, any specified bindings, and steps once returning the open cursor positioned on the first row (if the first step returned SQLITE_ROW), or an open cursor positioned at EOF (if the first step ran the query to completion, returning SQLITE_DONE).
This is why vmsteps were executed when the returned cursor was not iterated to completion -- one row was produced which (apparently) required 24 vmsteps.