SQLite Forum

Documentation for where query optimizations & relation algebra occurs?
Login
I'm building a two-tier query engine system using SQLite. In this system I have a layer that receives the SQL query and does a naive pattern match on it and if it recognizes it, the query is handled by my layer, and if it does not, it forwards the SQL to SQLite. The underlying storage is a virtual table, not a real database. It is a file format for a system I'm working on.

I'm doing this both for fun and as a side-project so I'm trying to learn techniques as well.

For example, this query is something I recently handled at my layer and don't forward to SQLite:

SELECT SUM(col1), col2 FROM Table WHERE col3 = 100 AND col4 = 200 GROUP BY col2.

The code I generate for the above query is that I recognize (very hard-coded!) that col3 and col4 are different columns, I scan them in parallel to generate two lists of rowids. I then intersect these two lists and they are my final rowis I need to extract the value of col2 for and then group by them.

I do seem to beat SQLite, but this is just one query and I've spent a ton of time on it. And frankly, I'm not sure if this is the best way, perhaps there is a better way to optimize this query.

Therefore what I'm looking for is from SQLite that given all costs are equal (which probably is a naive idea, but humor me ...) can I ask it to give me the plans opcodes then I can see what I can generate code for?

If you're asking why I'm doing this? Education ... but also I believe the virtual table infrastructure is too inefficient (i.e. has too many callbacks) if you have to scan the entire table which is often the case in a group by query.

P.S. - In addition to everything I said, I would love if there was a query planner tool (even if it was outside SQLite) that could tell me all the possible relation alegbra ways a query could be accomplished.