SQLite Forum

Is using prepared statements without a FROM to employ built-in methods bad practice?

(1.1) By Gary (1codedebugger) on 2020-12-27 20:20:50 edited from 1.0 [link] [source]

When it comes to C, I'm just a hack at best; so, this might be good for humor at least, but...

Is it bad practice or a coding abuse to keep a prepared statement in memory that is a SELECT without a FROM just to use json_extract to perform simple parses? For example,

sqlite3_stmt *stmt_json;
sqlite3_prepare_v2( dbEN, "select json_extract(?,?)", -1, &stmt_json, NULL );

And if j is the json string of an array of strings, such as ["string","string",..,"string"] and n its length, is it awful to do as follows (with all the error checking removed), rebinding the array index in a loop? It worked and seems quick enough, but it feels like I'm misusing SQLite in some way.

Thank you.

void json_decode_array_text( sqlite3_stmt *stmt_json, char *j, char *list[], int *n )
{
    int rc, k, error = 0;
    char idx[25];

    sqlite3_bind_text( stmt_json, 1, j, strlen(j), SQLITE_STATIC );
    for ( k = 0; k < *n && !error; k++ )
    {
        sprintf( idx, "$[%d]", k );
        sqlite3_bind_text( stmt_json, 2, idx, strlen( idx ), SQLITE_STATIC );
        while ( ( rc = sqlite3_step( stmt_json ) ) == SQLITE_ROW )
            sprintf( list[ k ], "%s", sqlite3_column_text( stmt_json, 0 ) );
        sqlite3_reset( stmt_json );
    } // next k
}

(2) By Clemens Ladisch (cladisch) on 2020-12-27 10:08:11 in reply to 1.0 [link] [source]

Reusing a prepared statement in a loop is certainly OK.

Keeping the statement around longer does not hurt because it does not need much memory. But re-parsing the SQL would not need much time either. It does not really matter either way.

The only misuse is using NULL as the last parameter of sqlite3_bind_text().

(4) By Gary (1codedebugger) on 2020-12-27 20:19:29 in reply to 2 [link] [source]

Thank you. I'll fix that in the post also.

(5) By Keith Medcalf (kmedcalf) on 2020-12-27 20:45:43 in reply to 2 [link] [source]

SQLITE_STATIC, NULL, 0, and (void*)0 all have the same value. The difference is merely a matter of syntactic sugar (different spellings for the same thing).

That parameter is supposed to be a pointer to a function. SQLITE_STATIC and SQLITE_TRANSIENT are merely sentinel values for use by "sniveling compilers" that do not handle the generic pointer type correctly and "snivel" instead of doing what they have been instructed to do.

So, using NULL is correct insofar as the "destructor function" does not exist.

However using the syntactic sugar SQLITE_STATIC is not incorrect either.

(3) By anonymous on 2020-12-27 17:55:18 in reply to 1.0 [link] [source]

Depends what your goal is. The code you show is quadratic, i.e. the work is proportional
to the square of the length of the JSON string. SQLite is fast. JSON1 is fast. But each $[N]
extract must reparse the whole string from scratch.

You can rewrite the sql to avoid that, which would be better, but using a json lib directly
would avoid any SQLite overhead at all. But I can see some appeal to what you are doing
and even find it creative and pragmatic ;)

(6) By Gary (1codedebugger) on 2020-12-27 20:49:09 in reply to 3 [link] [source]

Thank you. Would you mind elaborating a bit on how the SQL could be rewritten to avoid reparsing the string from scratch for each $[N]?

The array piece is extracted from the full JSON in a separate step and the result passed to the $[N] function as ["string","string"..."string"]. Is there a way to start each $[N] at the position of the last $[N-1] without having to start the parse at the beginning each time? Thanks.

(7.1) By Keith Medcalf (kmedcalf) on 2020-12-27 21:22:24 edited from 7.0 in reply to 6 [link] [source]

Why not just use the SQL "select value from json_each(?)"?

Then you only need to run the statement once to get the equivalent result?

int json_decode_array_text(sqlite3_stmt *stmt_json, char *j, char *list[], int *n)
{
    int rc, k = 0;

    sqlite3_bind_text(stmt_json, 1, j, -1, SQLITE_STATIC);
    while ((rc = sqlite3_step(stmt_json)) == SQLITE_ROW)
    {
        sprintf(list[k], "%s", sqlite3_column_text(stmt_json, 0));
        if (++k >= *n)
            break;
    }
    sqlite3_reset(stmt_json);
    return rc;
}

(10) By Gary (1codedebugger) on 2020-12-27 22:23:57 in reply to 7.1 [link] [source]

Thank you very much for this. I've been messing around with json_each and json_tree on the command line for the past half hour and, at least for my needs, where the strings are not overly complex, it appears that one can parse the entire string with ease.

When json_extract is used to get a specific value, does SQLite have to first run a json_tree and then select the correct row based on the fullkey?

This function that fills the array list was to be used in carray that you helped me with the past few days. If I just parse the JSON upfront with json_tree in a temporary table, I could get the same result without carray by selecting from that table and conditioning on path.

My C application keeps shrinking as the fog around C, the command line, and SQLite thins. Thanks.

(8.2) By Keith Medcalf (kmedcalf) on 2020-12-27 22:13:35 edited from 8.1 in reply to 6 [link] [source]

Of course, the "safe and secure" way to do this is:

int json_decode_array_text(sqlite3_stmt *stmt_json, char *j, char *list[], int n)
{
    int rc, k;

    for (k=0; k<n; k++)
    {
        if (list[k])
            free(list[k]);
        list[k] = NULL;
    }

    sqlite3_bind_text(stmt_json, 1, j, -1, SQLITE_STATIC);
    for (k=0; (((rc = sqlite3_step(stmt_json)) == SQLITE_ROW) && (k < n)); k++)
    {
        const char *value = (const char *)sqlite3_column_text(stmt_json, 0);
        int length = sqlite3_column_bytes(stmt_json, 0);

        list[k] = calloc(length + 1, 1);
        memcpy(list[k], value, length);
    }
    sqlite3_reset(stmt_json);
    return rc;
}

because it does not make any assumption that the space pointed to by list[N] is sufficient, but rather allocates the needed space dynamically; the parameter n is merely the number of elements (char *) in list.

The returned result will be either SQLITE_DONE if all the values could fit in the list, or SQLITE_ROW if there were more elements than list entries. (or possibly some other error condition detected during the execution of the SQL statement).

Unused entries in the list would be NULL pointers.

The underlying assumption would require that the list and its number of entries, as well as the json text string j, are constant during the execution of this routine (that is, that the same "list" and "j" are not changed or deallocated on another thread).

I don't think I missed any unsafenesses, but no warranty is implied.

(9) By Keith Medcalf (kmedcalf) on 2020-12-27 22:20:47 in reply to 8.2 [source]

If you really need the list size to be int* rather than a plain int, then simply replace all references to n with *n and change the function signature. In this case the underlying assumptions also apply to n ...

(11) By Keith Medcalf (kmedcalf) on 2020-12-27 22:29:28 in reply to 8.2 [link] [source]

You could accomplish the same thing by checking that the size of the allocated list[] entry is sufficient and only re-allocating if it is not, and only freeing and NULL-ing the unused list[] entries. This would reduce the number of calls to the memory allocator. You could use malloc rather than calloc if you also copied length+1 bytes rather than length bytes. Whether this makes a difference depends on whether the underlying implementation of the memory manager does something different for calloc vs malloc, which is platform dependent.

(12) By Gary (1codedebugger) on 2020-12-27 22:34:09 in reply to 8.2 [link] [source]

Thank you for taking the time to show me how to do this properly. I'll study it.

(13.1) By Keith Medcalf (kmedcalf) on 2020-12-28 00:26:37 edited from 13.0 in reply to 12 [link] [source]

So, presumably you have a json array that you want to process by "converting" it into a carray. Why not just re-write the SQL slightly and use the json string and json_each directly?

select ...
  from ...
  join carray(?)
...

where the parameter is the json string converted into a C array, why not just use:

select ...
  from ...
  join json_each(?)
...

which will avoid having to process the json string into a C-array at all? If you need to know the number of elements in the json list, simply code the subquery (select count(*) from json_each(?)) which will give you the number of elements in the array? (Warning: there may be a more efficient method to do this, I don't know)

Presumably this is for the other query you mentioned elsewhere, which could now be something like this:

select id, name, path
  from media
 where id in (
                select media_id
                  from media_tags
                 where tag_id in (
                                  select distinct id
                                    from tags
                                   where name in json_each(?1) collate nocase
                                 )
              group by media_id
                having count(*) = (
                                   select count(distinct value collate nocase)
                                     from json_each(?1)
                                  )
             )

where you would only need to pass the json string as bind parameter 1 and let the database engine do all the work?

** Edited to make the comparisons case insensitive **

(14) By Gary (1codedebugger) on 2020-12-28 00:58:37 in reply to 13.1 [link] [source]

Thank you very much. I appreciate you providing the example query and explanation, and pointing me in the right direction in all these areas.

And, if a built-in aggregate function is used with your query above, such as below(shown to me in another post), then the SQL can even build the full JSON response to return to the requesting application.

select '{"s":0,"c":"' || ? || '","d":' || json_group_array(json_object('k', id, 'p', path, 'n', name)) || '}'

It's all getting quite interesting.

(15.1) By Gary (1codedebugger) on 2020-12-30 08:48:04 edited from 15.0 in reply to 13.1 [link] [source]

I appreciate how helpful and patient you've been with my lack of skill in both C and SQL. May I ask an additional question concerning this query please?

After you pointed out json_each to me, I started reading more about json_tree also; it seems to be great for the scenario of receiving all incoming requests as JSON strings. Would you please tell me if this is the right approach? It sure beats the heck out of looping through a JSON array and rebinding values. Using json_each was a huge improvement, and json_tree appears to be even better: no C code is required to extract the text array to provide to json_each, since json_tree parses it all into a CTE and the JSON string can be bound as the only parameter in the prepared statement.

But, of course, I might be overlooking something very important. Thank you.

I should've added that, because my use case is fairly small, I tested it with 10,000 records in the media table, 100,000 in media_tags, and 115 tags. Each media file in the test data has a random set of 10 tags from the 115. Most of the queries complete in 5 to 16 milliseconds, not counting preparing the statement.

with j ( value, fullkey, path ) as
  (
   select value, fullkey, path
     from json_tree( ? )
    where path = '$.name'
       or fullkey in ( '$.n', '$.c' )
  )
select '{"s":0,"c":"'
       || ( select substr(value,2,1) from j where fullkey = '$.c' )
       || '","d":'
       || json_group_array(json_object('k', id, 'p', path, 'n', name))
       || '}'
  from media
 where id in (
               select media_id
                 from media_tags
                where tag_id in (
                                 select id
                                   from tags
                                  where name in ( select value from j where path = '$.name' )
                                )
             group by media_id
               having count( media_id ) = ( select value from j where fullkey = '$.n' )
             )