SQLite User Forum: LIKE optimization - limitation to string literals

As a general process, the value of parameters may be used to optimize the generated query plan.  However, at the time the prepare is first executed the values of the parameters are completely unknown (they are not available), so code is generated which will work for any and all values of the parameters.  This should always result in a "full scan" of all candidates to any LIKE or GLOB constraint because there is no way of knowing whether or not the parameter will begin with a wild-card.  The resulting VDBE code must work whether or not the parameter provided at "runtime" starts with a wild-card.  There are a few other conditions in which knowledge of the values of parameters may affect the query plan because the value affects the selectivity of an index.

In all such cases, the query planner has to generate a plan that will work no matter what the value of the parameter(s) since the parameter values are unknowable at prepare time.

However, at the time of execution of the first step() the values of the parameters is available.  If STAT4 is enabled then the query plan will be revisited at the first step (since parameters are now bound and knowable) and the query plan may be revised based on the newly available parameter data if examination of the parameter data indicates and generates a more efficient plan by, for example, allowing the use of an index which has become more (or less) selective.

However, when generating the query plan I do not think the optimizer looks deeply into parameter values and does not evaluate *expressions* (they are black boxes since the planner does not execute the query yet) and so, for example, in cases where you used an expression that transmutes the value of a parameter before using it (an arithmetic expression, function, etc) there is no way to evaluate the affect of the result of the expression from the value of the parameter, hence the planner must generate code that will always work.

So, in your example:

`SELECT * FROM t1 WHERE name LIKE RTRIM(?) || '%';`

there is no way to know that the parameter will allow the optimization of using an index to constrain the table search because it is not used directly -- it is part of an expression.  If, however, you did the `RTRIM(?) || '%'` in the application code and executed:

`SELECT * FROM t1 WHERE name LIKE ?;`

passing the result of `RTRTIM(?) || '%'` as a parameter, the planner can now examine the parameter at the first step (after the parameters are bound) and if the first character is not a wild-card then re-generate the query plan to include a partial scan (via an index) if the proper indexes are available.

Overriding the LIKE function and various different collations being available will obviously change how this process works.

I believe this is an accurate description of how the query plan is generated in such circumstances and if not, then I am sure that either Richard or Larry will correct the inaccuracies.