SQLite Forum

Bug Report: Expression index forgets value subtype
Login

Bug Report: Expression index forgets value subtype

(1) By Russell Mull (russell_mull) on 2023-11-07 20:21:41 [source]

I have a small collection of application defined functions which use sqlite3_result_subtype and sqlite3_value_subtype to communicate context between the functions. When I create an expression index using one of the functions which sets the subtype, the subtype value is reset to zero for the data coming from the index.

For example, suppose I have two functions, "inner" which sets the subtype, and "outer" which relies on the subtype being present. If I create an index such as:

    CREATE INDEX just_inner on my_table(id, inner(col));

When I later run a query which hits the index, including the return value of the inner function:

    SELECT id FROM my_table WHERE outer(inner(col)) > 1;

Then the value observed by the implementation of outer does not include the subtype. (That is, sqlite3_value_subtype returns 0, the default value)

If the index is not present, then the subtype has the expected value. Since the presence of the index changes the result of the query, I think this is probably a bug.

Here is a contrived repro which demonstrates the problem. I have tested it against sqlite 3.42.0 and 3.44.0: https://github.com/mullr/sqlite-bug-expr-index-with-subtype

(2) By Bo Lindbergh (_blgl_) on 2023-11-07 21:34:17 in reply to 1 [link] [source]

Subtypes are never stored; they apply to in-memory values only.

(3) By Richard Hipp (drh) on 2023-11-07 21:57:34 in reply to 2 [link] [source]

I was going to reply to Russell in the same way. But if you read his bug report closely, he does have a point. His title is misleading, but the body of the text makes it clear that the current behavior is wrong.

In the SQL statement:

    SELECT id FROM my_table WHERE outer(inner(col)) > 1;

The subtype generated by inner() should be passed directly through to the parameter of outer(). But that pass-through gets disrupted if the "inner(col)" value is indexed. An optimization replaces the "inner(col)" with the value stored in the index, and as you observe, indexes don't store subtypes.

On my to-do list I now have a note to disable the optimization that replaces an expression with the value of an index on that expression (forcing the expression to be recomputed rather than just read out of the index) if:

  1. The indexed expression is a function that has the SQLITE_FUNC_SUBTYPE property. (Russell: Make sure your custom functions set SQLITE_FUNC_SUBTYPE if they are setting subtypes!)

  2. And, the indexed expression is used as a parameter of a function.

I also plan to update the documentation on subtypes to point out this problem so that future programmers as less likely to run into it.

(4) By Bo Lindbergh (_blgl_) on 2023-11-07 22:31:53 in reply to 3 [link] [source]

SQLITE_FUNC_SUBTYPE is an internal constant. Did you mean SQLITE_SUBTYPE?

In the current documentation, Pointer Passing Interfaces has the most information about subtypes.

(6) By Richard Hipp (drh) on 2023-11-07 23:26:17 in reply to 4 [link] [source]

Did you mean SQLITE_SUBTYPE?

Yes, probably that is what I meant...

(5) By Russell Mull (russell_mull) on 2023-11-07 22:41:10 in reply to 3 [link] [source]

Thank you for the prompt reply. Reading the doucmentation, I wasn't sure what I should expect from the subtyping feature. Clarifying it there would certainly help.

(7) By Richard Hipp (drh) on 2023-11-08 13:16:55 in reply to 1 [link] [source]

The following SQL demonstrates the malfunction:

CREATE TABLE t1(x INT, y TEXT);
INSERT INTO t1(x,y) VALUES(1,'{b:5}');
SELECT json_insert('{}', '$.a', json(y)) FROM t1;
CREATE INDEX t1j ON t1(json(y));
SELECT json_insert('{}', '$.a', json(y)) FROM t1;

A different answer is computed after the index has been created. This should not be the case. Creating or dropping indexes might change the order of results, and hence might result in a different answer if there is a LIMIT clause or something like that, but in this case where there is only a single row of output and no LIMIT clause, dropping or creating an index should not make a difference in the output.

(8) By Richard Hipp (drh) on 2023-11-09 17:48:38 in reply to 1 [link] [source]

I believe this is now fixed at check-in ba789a7804ab96d8 and its descendants, including the tip of branch-3.44.

To get your application working you need to:

  1. Update your source tree to check-in ba789a7804ab96d8 or one of its descendants.

  2. When you register application-defined functions that might call sqlite3_result_subtype(), you must add the new SQLITE_RESULT_SUBTYPE flag to the text encoding parameter.

  3. You can verify that you did step 2 correctly by adding the -DSQLITE_STRICT_SUBTYPE=1 compile-time option. If you missed any SQLITE_RESULT_SUBTYPE flags, then the call to sqlite3_result_subtype() will cause the function to raise an error.

Please try out this fix. Report back whether or not it fixes your issue.

(9) By Russell Mull (russell_mull) on 2023-11-10 23:39:46 in reply to 8 [link] [source]

I have tested this with both my repro and my application, and it appears to fix the issue.

Thank you!