SQLite User Forum

Best way to enforce a query time limit?

(1) By Simon Willison (simonw) on 2022-03-18 17:28:47

My software Datasette allows people to execute custom SQL queries directly against a read-only SQLite database, for example:

https://global-power-plants.datasettes.com/global-power-plants?sql=select+*+from+%22global-power-plants%22+where+country_long+%3D+%27France%27+and+primary_fuel+%3D+%27Hydro%27&p0=France&p1=Hydro

To protect against queries that exhaust server resources, I enforce a 1 second time limit.

My Python code for doing that is here: https://github.com/simonw/datasette/blob/0.60.2/datasette/utils/__init__.py#L137-L156

It works using the conn.set_progress_handler(handler, n) Python method, invoking the handler every 1,000 virtual machine ticks (n = 1000) unless the time limit being enforced is less than 50ms, in which case it uses n = 1 instead.

The handler function checks to see if the query has exceeded the desired time limit and, if it has, returns 1 to cancel the operation.
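In outline, the mechanism looks something like this (a simplified sketch using the stdlib sqlite3 module, not the exact Datasette code; the helper name is mine):

import sqlite3
import time

def execute_with_time_limit(conn, sql, ms=1000):
    # Simplified sketch: cancel the query once it runs past `ms` milliseconds.
    deadline = time.monotonic() + ms / 1000
    n = 1 if ms < 50 else 1000  # check every n virtual machine ticks
    conn.set_progress_handler(
        lambda: 1 if time.monotonic() > deadline else 0,  # 1 cancels the query
        n,
    )
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.set_progress_handler(None, 0)  # clear the handler again

conn = sqlite3.connect(":memory:")
try:
    # An infinite recursive CTE: only the time limit stops it.
    execute_with_time_limit(
        conn,
        "WITH RECURSIVE c(x) AS (VALUES(1) UNION ALL SELECT x+1 FROM c) "
        "SELECT count(*) FROM c",
        ms=100,
    )
except sqlite3.OperationalError as e:
    print("cancelled:", e)  # the module raises 'interrupted' on cancellation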

The system I'm using works, but I'm interested in exploring alternatives. Are there any other good options for achieving this?

One particular limitation with my mechanism is that the handler is set for a connection. This means that if I want to execute more than one query in parallel I need to open multiple connections, because otherwise the time limit handler could accidentally cancel other queries.

My ideal solution here would be for SQLite to provide query time limits as a built-in feature. I'd love to be able to run PRAGMA query_time_limit=100; to set a limit of 100ms for all queries on the connection.

Even better, I'd love a mechanism where I could set a time limit as part of executing a query - so a single SQLite connection could be used to execute different queries each with different time limits.

(2) By Simon Slavin (slavin) on 2022-03-18 18:18:39 in reply to 1

Implement your own timer in your own code, as you have, and have it call

https://sqlite.org/c3ref/interrupt.html

when your timer expires. The parameter to that call is the connection, so an interrupt on one connection won't interfere with any other connection.

Whatever might be interrupted must handle SQLITE_INTERRUPT appropriately.
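A minimal sketch of that pattern, assuming the Python sqlite3 module (the helper name and the TimeoutError mapping are my own):

import sqlite3
import threading

def execute_with_deadline(conn, sql, seconds):
    # Arrange for conn.interrupt() to be called from a timer thread once the
    # deadline passes; Connection.interrupt() is documented as callable from
    # a different thread.
    timer = threading.Timer(seconds, conn.interrupt)
    timer.start()
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.OperationalError as e:
        # An interrupted statement fails with SQLITE_INTERRUPT, which the
        # sqlite3 module surfaces as OperationalError.
        raise TimeoutError("query exceeded time limit") from e
    finally:
        timer.cancel()  # harmless if the timer already fired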

(3) By Simon Willison (simonw) on 2022-03-18 19:26:11 in reply to 2

This looks like a great alternative. Something that worries me about my progress handler implementation is that there is presumably some substantial overhead involved in executing a Python function every X ticks, even when X = 1000.

Using interrupt (available in the Python library here: https://docs.python.org/3/library/sqlite3.html#sqlite3.Connection.interrupt) seems like it could reduce that overhead substantially.

I need to do some research and run some benchmarks. Thanks for the pointer!

(4) By Larry Brasfield (larrybr) on 2022-03-18 20:00:10 in reply to 3

Be sure to read the SQLite doc on that interrupt API: it creates a hazard if you do not properly coordinate calls into it with the lifetime of the targeted connection object.

(5.1) By Simon Willison (simonw) on 2022-03-18 20:30:54 edited from 5.0 in reply to 4

I'm using Python, and the Python library makes interrupt() a method of the connection:

connection.interrupt()

So I'm trusting that the Python library is already smart enough to avoid calling it on a connection that has already been closed.

It looks like it does that here: https://github.com/python/cpython/blob/4674fd4e938eb4a29ccd5b12c15455bd2a41c335/Modules/_sqlite/connection.c#L1583-L1594

(6.1) By Larry Brasfield (larrybr) on 2022-03-18 21:07:40 edited from 6.0 in reply to 5.1

I do not wish to get into any debate here, but feel obliged to mention these considerations.

The code you plan to invoke necessarily runs in a different thread than the one whose normal progression you want to interrupt. I do not see, in the code you linked, anything which blocks the following scenario:

  1. connection.interrupt() is called, and begins executing, getting just past that "smart enough" check, but no further due to task switching.
  2. The long-running query execution, which motivated an interruption, completes, and then whatever client code instigated it closes the connection.
  3. The connection.interrupt() thread is resumed, and makes the very call which is warned against in the 2nd paragraph here.

I personally do not find the existence of an API enough to promote trust that it cannot fail in multi-threading scenarios.
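Client code can rule that scenario out by serialising interrupt() and close() behind its own lock. A sketch of one way to do that (this is an assumption about how one might coordinate; sqlite3 does not do this for you):

import sqlite3
import threading

class GuardedConnection:
    # Sketch: interrupt() and close() take the same lock, so an interrupt
    # can never land on a connection that has already been closed.
    def __init__(self, path):
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._lock = threading.Lock()
        self._closed = False

    def interrupt(self):
        with self._lock:
            if not self._closed:  # check and call under one lock
                self._conn.interrupt()

    def close(self):
        with self._lock:
            self._closed = True
            self._conn.close()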

(7) By Van Schelven (vanschelven) on 2025-04-24 10:14:55 in reply to 3

I took Simon's approach, which worked well for me... but have run into 1 snag with it:

When running expensive COUNT operations, the number of VM instructions executed seems to be very low; I'm not even sure the expensive part of the operation executes any VM instructions at all. That certainly means Simon's approach of checking every 1,000 ticks won't work there, but it may even mean the whole idea of limiting such operations by interrupting from the progress handler is flawed in general.

In particular, I've been able to rack up some 25 seconds of wall-clock time before incurring even a single 100-tick check.

I wonder if anyone (including Simon) has found better / alternative methods in the past 3 years.

(Meta: yes, I'm necromancing, but I'd rather keep the information where it already is... especially since there are so few people posting about this subject anyway.)

(13) By Nuno Cruces (ncruces) on 2025-04-24 19:48:48 in reply to 7

I also use the progress handler for my Go wrapper, so I'm interested in these pathological queries that don't trigger it.

Can you share an example?

(14) By Van Schelven (vanschelven) on 2025-04-26 05:44:40 in reply to 13

It's literally just a SELECT COUNT(*) with no WHERE clause on a 30M-row table.
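That matches what the bytecode suggests: as I understand it, a bare COUNT(*) on a rowid table compiles down to a single Count opcode that walks the whole btree internally, so hardly any VM steps execute while the expensive work happens. A quick way to see it (a sketch; the table and row count are made up, smaller than the 30M above but enough to show the effect):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x)")
conn.executemany("INSERT INTO t VALUES (?)", ((i,) for i in range(1_000_000)))

calls = 0
def handler():
    global calls
    calls += 1
    return 0  # never cancel, just count invocations

conn.set_progress_handler(handler, 100)
print(conn.execute("SELECT COUNT(*) FROM t").fetchone())
print("progress handler calls:", calls)  # stays near zero

# The opcode listing shows the whole scan hiding inside one Count instruction.
for row in conn.execute("EXPLAIN SELECT COUNT(*) FROM t"):
    print(row)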

(8.1) By Roger Binns (rogerbinns) on 2025-04-24 15:01:43 edited from 8.0 in reply to 1

My Python SQLite wrapper APSW supports query limits with example code.

It uses the Python context block (with) approach:

  • You can set a wall clock time limit
  • You can set a number of rows returned limit
  • When either of those limits is hit, the block exits either cleanly, or with an exception of your choice
  • You can nest the context blocks, and do this across multiple threads, or even async

The way it works is by using context vars to keep track of the current scope's time and row limits. The sqlite3_trace_v2 API is the point where the row and time limits are checked. Once a limit is hit, an exception is raised, which the context exit handler either re-raises or suppresses if no exception is desired.

APSW internally handles multiple concurrent registered callbacks for sqlite3_trace_v2 (and several other callbacks/hooks) which is why threads, async, and nested blocks etc work without interfering with each other.

I originally implemented nested blocks so that outer blocks constrained the limits of inner blocks. This does work, but it turned out to be annoying to use, because the inner block knows more about what it is trying to do, so an escape mechanism would be needed. For example, an inner block may know it wants more rows or time. That is why the outer block's limits do not constrain inner blocks, and limits only constrain code at the same level of nesting.
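That nesting behaviour can be approximated with the stdlib sqlite3 module, using a contextvar for the innermost deadline and the progress handler as the checkpoint instead of sqlite3_trace_v2. A rough sketch (time limits only; this is not APSW's code, and all names here are mine):

import contextvars
import sqlite3
import time
from contextlib import contextmanager

# The innermost active deadline wins; outer blocks do not constrain inner ones.
_deadline = contextvars.ContextVar("deadline", default=None)

@contextmanager
def time_limit(seconds):
    token = _deadline.set(time.monotonic() + seconds)
    try:
        yield
    finally:
        _deadline.reset(token)  # restore the enclosing block's limit

def install_checkpoint(conn):
    def check():
        d = _deadline.get()
        return 1 if d is not None and time.monotonic() > d else 0
    conn.set_progress_handler(check, 1000)

conn = sqlite3.connect(":memory:")
install_checkpoint(conn)
with time_limit(0.1):            # outer budget
    with time_limit(5.0):        # inner block knows it needs more time
        conn.execute("SELECT 1").fetchall()

Because contextvars are per-thread (and per async task), separate threads each see their own innermost limit, loosely mirroring the isolation described above.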

(9) By ddevienne on 2025-04-24 15:01:04 in reply to 8.0

Sounds very impressive, Roger.

Have you looked at the overhead of always using tracing, for that feature? Just curious.

(10) By Roger Binns (rogerbinns) on 2025-04-24 15:18:40 in reply to 9

I assume you mean having a global query limit. It is possible, by overriding the cursor_factory to return a cursor whose execute and executemany methods always apply the context block limits.

I haven't measured the tracing overhead, since it will depend on the amount of Python code executed and how often it is called. There is also a GIL acquire and release around the callback.
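Translated to the stdlib sqlite3 module, the cursor_factory idea above looks roughly like this (a sketch of the shape, not APSW's actual code; the 100ms figure is made up):

import sqlite3
import time

LIMIT_SECONDS = 0.1  # assumed global limit

class LimitedCursor(sqlite3.Cursor):
    # Every execute() on this cursor runs under the global time limit.
    def execute(self, sql, params=()):
        deadline = time.monotonic() + LIMIT_SECONDS
        self.connection.set_progress_handler(
            lambda: 1 if time.monotonic() > deadline else 0, 1000)
        try:
            return super().execute(sql, params)
        finally:
            self.connection.set_progress_handler(None, 0)

class LimitedConnection(sqlite3.Connection):
    def cursor(self, factory=LimitedCursor):
        return super().cursor(factory)

conn = sqlite3.connect(":memory:", factory=LimitedConnection)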

(11) By Van Schelven (vanschelven) on 2025-04-24 18:19:14 in reply to 8.1

Very interesting!

I will certainly play with this for a bit... I can't believe I haven't run into it yet!

Given that this means swapping out the whole sqlite3 layer, it's going to take some research before I can report back with a "this answers all my questions", but it certainly looks promising!

(12.1) By Roger Binns (rogerbinns) on 2025-04-24 18:43:04 edited from 12.0 in reply to 1

This thread raised the question of how bad the overhead of a progress handler in Python is, so I benchmarked it. As usual, benchmarks aren't reality, and your mileage will vary.

I used this code as the progress handler callback. It updates the calls count, checks the current time, does some arithmetic, a comparison, and a variable assignment, and lets execution continue.

import time

calls = 0  # bumped on every progress handler invocation

def callback():
    global calls
    calls += 1
    # roughly the work of a real deadline check: clock read, arithmetic, comparison
    x = time.monotonic() + 3 > 77
    return False  # falsy return lets execution continue

For the query I used the first of the Outlandish Recursive Query Examples, which draws the Mandelbrot set. It does lots of SQL-level activity but does not do any disk operations. I changed iter<28 to iter<2800 so the query took about a second to execute on my machine.
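The harness was essentially: register the callback with a given steps value, run the query once, and time it. Something like this sketch, reusing callback and calls from the snippet above (the SQL is my transcription of the documented Mandelbrot example, with the iter<2800 change):

import sqlite3
import time

MANDELBROT = """
WITH RECURSIVE
  xaxis(x) AS (VALUES(-2.0) UNION ALL SELECT x+0.05 FROM xaxis WHERE x<1.2),
  yaxis(y) AS (VALUES(-1.0) UNION ALL SELECT y+0.1 FROM yaxis WHERE y<1.0),
  m(iter, cx, cy, x, y) AS (
    SELECT 0, x, y, 0.0, 0.0 FROM xaxis, yaxis
    UNION ALL
    SELECT iter+1, cx, cy, x*x-y*y + cx, 2.0*x*y + cy FROM m
     WHERE (x*x + y*y) < 4.0 AND iter<2800
  ),
  m2(iter, cx, cy) AS (SELECT max(iter), cx, cy FROM m GROUP BY cx, cy),
  a(t) AS (
    SELECT group_concat(substr(' .+*#', 1+min(iter/7,4), 1), '')
    FROM m2 GROUP BY cy
  )
SELECT group_concat(rtrim(t), x'0a') FROM a;
"""

def bench(steps):
    global calls
    calls = 0
    conn = sqlite3.connect("")  # empty-string database, as in the setup notes
    if steps:
        conn.set_progress_handler(callback, steps)
    start = time.monotonic()
    conn.execute(MANDELBROT).fetchall()
    return time.monotonic() - start, calls

print(bench(1000)) then gives the elapsed time and call count for one row of the table below.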

steps is the parameter to sqlite3_progress_handler. The elapsed time in seconds is shown for APSW (my Python SQLite wrapper), and sqlite3 is the Python standard library wrapper. calls is how many times the progress handler was called. It was identical for both wrappers. Commas inserted for readability.

steps    APSW (s)   sqlite3 (s)   calls
0        0.966      1.206         0
1        10.116     10.219        54,683,116
50       1.263      1.534         1,093,662
250      1.034      1.276         218,732
500      0.994      1.262         109,366
1,000    0.996      1.220         54,683
10,000   0.972      1.205         5,468

Conclusions: A step value of 1 adds a lot of overhead, but it diminishes rapidly from there. The steps=1 row spends roughly 9 extra seconds across 54.7 million callbacks, which works out to about a sixth of a microsecond per progress handler callback.

Benchmark setup: Done on a Ryzen 3900X CPU (2020) with boost clocks turned off. Operating system is Ubuntu 25.04. System Python 3.13.3 as shipped by Ubuntu. APSW came from PyPI, wrapping SQLite 3.49.1.0. sqlite3 is the Ubuntu-provided version, wrapping SQLite 3.46.1. Memory database (empty string) as the database file. Statement caches were disabled, so the above times also include the prepare time for each run. I did multiple runs, and the results were within a single-digit number of milliseconds of each other.