multiple queries
(1) By anonymous on 2020-12-11 17:38:04 [link] [source]
The CLI evaluates multiple selects e.g. select * from table1;select * from table2;
sqlite3_prepare does not complain when the sql statement contains multiple selects.
When I invoke sqlite3_step, it has the first of the multiple select statements.
How do I move to the second & subsequent statements? i.e What API do I call?
(2) By Richard Hipp (drh) on 2020-12-11 18:56:45 in reply to 1 [link] [source]
That's what the pzTail parameter to sqlite3_prepare() is for - it returns a pointer to the remaining text in the input that has not yet been prepared.
So you do a loop. You run "sqlite3_prepare()" on the whole SQL, but you remember the pzTail. Then you run sqlite3_step() on the prepared statement until it finishes. Then you run sqlite3_finalize() on the prepared statement. Then as long as you have more SQL text to process, you do the whole thing again.
(3) By anonymous on 2020-12-11 20:51:06 in reply to 2 [link] [source]
Thank you for explaining with exceptional clarity (as usual).
(4) By anonymous on 2020-12-12 13:15:44 in reply to 2 [link] [source]
int sqlite3_prepare_v2(
sqlite3 *db, /* Database handle */
const char *zSql, /* SQL statement, UTF-8 encoded */
int nByte, /* Maximum length of zSql in bytes. */
sqlite3_stmt **ppStmt, /* OUT: Statement handle */
const char **pzTail /* OUT: Pointer to unused portion of zSql */
);
I can retrieve the SQL statement that is executed (in the first pass of the prepare ... step ... finalize loop) from **pmtStmt; using the same code with argument **pzTail returns null or "". (I've tried several other C# expressions without success).
Is there an SQLite3 function to retrieve the string from the pointer **pzTail?
If there isn't, any clues on how to retrieve the string from the pointer **pzTail with C#?
(5) By Larry Brasfield (LarryBrasfield) on 2020-12-12 16:22:35 in reply to 4 [link] [source]
This is more of a C# question than a SQLite API question.
That said, there is not going to be a good way to use that pzTail out pointer from the C# calling context. At the C API level, it will be pointing within the range of chars referenced by zSql. But at the C# calling level, the string parameter passed whose content ultimately becomes something referenced by a zSql is likely to be stored in a temporary whose lifetime expires before or during the return to the C# calling context. Hence, the pzTail value coming out of the C-level call will be referencing memory that likely will not be allocated to hold zSql content when the C# calling code regains control.
If you have control of the adapter layer between the C# interface and the SQLite C library, you could create there a new C# string reflecting the content portion referenced by pzTail, and make that an out parameter of the C# interface. The means of copying C string content to C# string objects should be easily found (and they are off-topic here.)
Is there an SQLite3 function to retrieve the string from the pointer **pzTail?
No. That would be a simple C expression too trivial to merit an API entry.
(6) By anonymous on 2020-12-12 16:50:27 in reply to 5 [link] [source]
whose lifetime expires before or during the return to the C# calling context.
Thanks for the insight; that is what seems to be happening although the pzTail pointer remains non-zero on return to C#.
To me that suggests that the pointer is still alive in the DLL. If that is the case, then an API to return the remaining portion of the SQL would be handy.
Unless I find a way (still trying to find one) to get the pzTail string value, I have a (I think neat) workaround, as below in pseudo code:
while (sql.Length!=0) /* sql = multiple queries separated by ;
{
prepare ...
step ...
finalize ...
executedSQL = the SQL statement that ppStmt executed /*I can get this */
sql = sql.Replace(executedSQL,"");
}
(7) By Larry Brasfield (LarryBrasfield) on 2020-12-12 18:44:14 in reply to 6 [link] [source]
To me that suggests that the pointer is still alive in the DLL. If that is the case, then an API to return the remaining portion of the SQL would be handy.
The pointer is nothing more than a single value, easily passed by value. The issue is whether that pointer points to something that can be referenced. My point is that is probably does not upon return to the C# call site. There is no way for a heretofore nonexistent SQLite API to later "return the remaining portion of the SQL" unless the connection were to store it away for future reference, which would also require a way to cease storing it. I would bet long odds against that happening, particularly because the existing API already permits the operations you would like to perform. (It does not support them in quite the manner you are thinking, but it does support the work-around that I suggested earlier.)
Your possibly neat workaround could be made to work. With very similar memory allocation, you could just extract the single SQL statements from the statement glom and do the prepare/step/finalize on each one. That would be more straightforward, IMO.
(8) By Keith Medcalf (kmedcalf) on 2020-12-12 18:57:56 in reply to 7 [link] [source]
With very similar memory allocation, you could just extract the single SQL statements from the statement glom and do the prepare/step/finalize on each one. That would be more straightforward, IMO.
Though of course you would have to parse the statement to find the endings if the statement contained any quotes (quote, double-quote, square-brackets or backticks) and make sure they were nested/unnested properly in order to find the end of a statement.
Alternatively I suppose you could split on a semicolon and re-assemble complete statements using the sqlite3_complete
API to fixup improper divisions...
(9) By Larry Brasfield (LarryBrasfield) on 2020-12-12 19:13:23 in reply to 8 [link] [source]
The OP had claimed "/*I can get this */
", which I elected to bypass because I suspect it is the same glutton for tedious work responsible for several other threads here lately. Using sqlite3_complete() would certainly work, albeit without the clarity one might like. Not glomming statements together from the outset would be clearer yet. Or, if the OP in fact is creating the C#/SQLite-C adapter layer, it would be very simple for it to have an out integer parameter which returns the number of characters consumed. Given that the zSql content is utf-8, computing that may take a bit more than pointer differencing, but it would be at least clean.
(10) By anonymous on 2020-12-12 21:49:36 in reply to 9 [link] [source]
it would be very simple for it to have an out integer parameter which returns the number of characters consumed.
I am unclear as to where (prepare or step or finalize or else where .. by reference to the pseudo code above) you would specify this?
(12) By Larry Brasfield (LarryBrasfield) on 2020-12-13 19:44:58 in reply to 10 [link] [source]
public int RunOneStatement( SQLiteConnection db, string sqlGlom, out int charsUsed, ... )
{
char * zSql = sqlGlom.?;
rc = sqlite3_prepare_v2(db.?, zSql, sqlGlom.bytelength, & pStmt, & pzTail);
step ...;
finalize ...;
charsUsed = 0;
while (*zSql && zSql < pzTail){
// Advance zSql by one utf-8 code.
++charsUsed;
}
// Free zSql if necessary.
}
The '...' in the signature would likely be a delegate to handle per-step actions that are needed. The '.?' methods are whatever it takes to get representations usable in native (or C) code.
This would enable the same sort of loop, consuming a single SQL statement per iteration, that you envisioned when you asked about getting/using pzTail. The difference here is that pzTail is still a valid pointer where used. At the C# level calling the above function, just lop off as many character codes as charsUsed indicates, or terminate the loop when it equals zero.
(13) By anonymous on 2020-12-14 14:45:57 in reply to 12 [link] [source]
The value for ++charsUsed I get is none of these:
- The length of the original SQL statement
- The length of the executed SQL (in the current iteration)
- The length of the remaining SQL (for the subsequent iterations)
Possibly coding error on my part but the fact that it is a non-zero value is intriguing.
I can workaround by simply replacing the executed SQL from the original SQL in every iteration.
(14) By Keith Medcalf (kmedcalf) on 2020-12-14 15:15:54 in reply to 13 [link] [source]
zTail needs to be a char*
eg:
char *zSql = <the SQL CString>
char *zTail = 0;
int bytesused = -1;
sqlite3_prepare(... zSql ... &zTail);
if (zTail) bytesused = (int)((IntPtr)zTail-(IntPtr)zSql);
...
The size of zTail and zSql are the size of a pointer (32-bit for 32-bit or 64-bit for 64-bit). bytesused will contain the number of bytes used, or -1 if all of them was used.
NB: I don't know C# -- as a Microsoft language I assume it is completely broken for all purposes so I have no idea how you interface C# with a actual computer code
(15) By Larry Brasfield (LarryBrasfield) on 2020-12-14 15:55:27 in reply to 14 [link] [source]
I believe that zSql, and hence zTail also, are pointers to a UTF-8 code sequence. [a] The reason that I did not advise the pointer arithmetic you suggest is because it does not yield the number of characters consumed unless they happened to be restricted to the ASCII subset of UTF-8 code points.
[a. Per the sqlite3_prepare doc, zSql is "/* SQL statement, UTF-8 encoded */". ]
(16) By Keith Medcalf (kmedcalf) on 2020-12-14 16:11:11 in reply to 15 [link] [source]
But C# strings, like everything Microsoft, are UTF-16. So you have to convert it to some kind of "array of bytes" that are UTF-8 encoded. So since you are fiddling with an array of bytes then simply knowing the difference in bytes is meaningful.
Knowing the number of "UTF-8" (Unicode) codepoints does not help with handling UTF-16 character-points. Unless you are expecting to be lucky.
(17) By Larry Brasfield (LarryBrasfield) on 2020-12-14 16:34:46 in reply to 16 [link] [source]
I have no expectation of luck or desire to rely on it.
At the point I suggested, to the OP, a way of knowing how much of a multi-statement string was accepted by sqlite3_prepare, the "strings" are simple char* but known to be referencing UTF-8 code sequences. Computing the accepted number of "characters" (or UTF-8 code points) has nothing to do with C# at that level, with this tiny proviso: Once control gets back to the C# domain, where we can presume the multi-statement string appears as a CLR String type, it is quite easy to lop off the accepted portion using the .Substring(int startIndex) method, where that index is not a count of UTF-16 words but is a zero-based "character position". That .Substring() method can be safely used without having to anticipate that the UTF-16 encoding used for CLR String objects will result in some other number of characters being lopped off or that a UTF-16 code point representation will be sliced into pieces.
It is because zSql (and zTail) point to possibly multi-byte character representations that my suggested code did not do simple pointer arithmetic. Doing that might have worked, but only by luck. As we know, having something work by luck is often a form of bad luck.
(11) By anonymous on 2020-12-12 21:57:38 in reply to 8 [link] [source]
glom
Never seen this word being used; I had to look it up!
Alternatively I suppose you could split on a semicolon
I did consider this but getting the executed statement from ppStmt seems easier.
(18) By anonymous on 2020-12-18 07:07:13 in reply to 2 [link] [source]
I found further sound advice here.
(22) By David Jones (vman59) on 2020-12-19 21:23:23 in reply to 18 [link] [source]
I wrote my own support library that lets my applications deal with SQLite in a style in between the
simplicity of sqlite3_exec() and tediousness of the standard prepare,bind,step,column_xxx loop.
The basic pattern for this library is:
#include "statement_store.h"
sps_store sps;
struct sps_context *ctx;
int emp_num, dept_num;
char *first, *last;
sps_define1 (sps, "emp-by-last",
"SELECT * FROM employee WHERE last LIKE ?1", "s");
ctx = sps_execute2 (sps, "emp-by-last","%son");
while (sps_next_row3(ctx,"itti",&emp_num,&first,&last,&dept_num))
{
printf("%8d %-12s %-15s %5d\n", emp_num, first, last, dept_num);
free ( first );
free ( last );
}
if (ctx->rc != SQLITE_DONE) printf ("Error retrieving data");
printf ("Rows retrieved: %d\n", ctx->count);
rc = sps_rundown4(ctx);
Notes:
- Pepares the SQL statement and saves the resulting sqlite3_stmt object in
the sps_store object,
associating it with a tag ("emp-by-last"). - Lookup the statement object and bind the callers arguments to the statement's
parameters. The
function uses the bind map passed to sps_define() ("s") to determine the number and data types
of these arguments. - Retrieve the next row in the result set and convert the column values to
the caller's arguments
based on the conversion codes specified in the second argument ("itti" -> int, text, text, int). - Reset the statement and free the sps_context object. The statement is not
finalized and may be
reused by subsequent calls to sps_execute.
(23) By anonymous on 2020-12-19 22:31:27 in reply to 22 [link] [source]
I'm having a hard time following your code since I have virtually nil experience of C/C++. Nonetheless I think I get the idea.
Is your approach executing each SQL statement ('fully') twice thus doubling runtime? ('fully' i.e. without LIMITing the number of records returned).
This grabbed my attention:
printf ("Rows retrieved: %dn", ctx->count);
Is the value of ctx on reaching this statement always 1 or is it the number of rows returned by the SQL statement? I didn't think there was a way of retrieving the number of rows in a result except by reiteration. Looks like
ctx = sps_execute2 (sps, "emp-by-last","%son");
is returning the row count with sps_execute2 reiterating the result. Is that correct?
(24) By David Jones (vman59) on 2020-12-20 10:57:04 in reply to 23 [link] [source]
sps_execute() binds the parameter values and returns a context for retrieving the result rows of a query, no rows have been returned yet by the SQLite VDBE. The sps_next_row() loop retrieves at most one row each call and increments ctx->count if successful.
My library has another function, sps_execute_rowless(), which does run the statement 'fully' before returning, but that's used where you expect sqlite3_step() to immediately return SQLITE_DONE rather than one or more SQLITE_ROW return values first (e.g. "COMMIT;").
(19) By anonymous on 2020-12-19 18:19:54 in reply to 2 [source]
I've sorted the loop using pzTail.
SQL is: select * from highscores;select * from ajay;select * from highscores where 0 = 1;select date("now");
SQL executed is: select * from highscores;
SQL remaining is: select * from ajay;select * from highscores where 0 = 1;select date("now");
SQL executed is: select * from ajay;
SQL remaining is: select * from highscores where 0 = 1;select date("now");
SQL executed is: select * from highscores where 0 = 1;
SQL remaining is: select date("now");
SQL executed is: select date("now");
SQL remaining is:
Having started with multiple statements, every iteration progresses to the next SQL statement.
However, if a statement encounters an error, the SQL in that iteration and those following is wrong. I introduced a deliberate error in the second SQL statement ... selectx instead of select as the first time round:
SQL is: select * from highscores;selectx* from ajay;select * from highscores where 0 = 1;select date("now");
SQL executed is: select * from highscores;
SQL remaining is: selectx* from ajay;select * from highscores where 0 = 1;select date("now");
SQL executed is: /* goes wrong hereon */
SQL remaining is: * from ajay;select * from highscores where 0 = 1;select date("now");
SQL executed is:
SQL remaining is: from ajay;select * from highscores where 0 = 1;select date("now");
SQL executed is:
SQL remaining is: ajay;select * from highscores where 0 = 1;select date("now");
SQL executed is:
SQL remaining is: ;select * from highscores where 0 = 1;select date("now");
SQL executed is: ;select * from highscores where 0 = 1;
SQL remaining is: select date("now");
SQL executed is: select date("now");
SQL remaining is:
Any guidance on overcoming invalid SQL statements when in a prepare/step/finalize loop?
(20) By Larry Brasfield (LarryBrasfield) on 2020-12-19 20:14:33 in reply to 19 [link] [source]
You might do a search on "parser" and "error recovery" (together.) As you will see, it is not a trivial task. I am not surprised that the SQLite parser has not been made to somehow figure out what should be consumed as erroneous while leaving what is maybe not.
If you insist on solving the problem, finding a statement separator not embedded in quoting delimiters is likely your best bet. You will need to replicate the SQLite scanner (or "lexical analyzer") for that.
I am curious as to what the application is that makes solving your stated problem preferable to just complaining about the whole conglomerated statement sequence.
(21) By anonymous on 2020-12-19 21:22:05 in reply to 20 [link] [source]
Given multiple statements, the CLI either completes or stops at the first error. (I'd use the CLI for building/testing scripts).
Programmatically, stopping at the first error is an option (examine the return code from sqlite3_step and stop if not zero). However, this makes the user experience terse/repetitive in that subsequent statements in the multiple queries statement may also be 'incorrect'.
My point in asking was to see if there was some way of resuming with the next statement on encountering an error. From what you say, there isn't. Fair enough!
To validate each statement in a multiple query statement, I am contemplating splitting at ; and executing each statement and reporting failures in the format "statements nx, ny failed" thereby allowing the user to revisit each failing statement in one go. Of course there is no guarantee that the revision will correct every invalid statement successfully; given this, stopping on the first error is a very reasonable option.
(25) By Larry Brasfield (LarryBrasfield) on 2020-12-20 12:20:16 in reply to 21 [link] [source]
If you merely split on semicolons, consider how that scheme will treat this:
select "silly;column;name" from "silly;table;name"
where "silly;column;name" not like '%;%'
(26) By anonymous on 2020-12-20 12:55:51 in reply to 25 [link] [source]
I know; thanks for making it obvious. Splitting at ; is not a straightforward solution since
- there can be a literal value with semi-colon(s) embedded
- there can be a literal value with semi-colon AND the literal value can be badly specified i.e. quotation marks are unbalanced.
Looks like aborting the loop might be the only practical option when sqlite3_step does not return an expected return code
On the back burner for now!
(27) By Richard Hipp (drh) on 2020-12-20 13:22:36 in reply to 21 [link] [source]
You can search ahead for either end-of-string or a semicolon. If you find a semicolon, then you also need to check that you have a complete SQL statement using the sqlite3_complete() interface, because otherwise the semicolon you found might be in the middle of a string literal or a trigger.
(28) By Ryan Smith (cuz) on 2020-12-20 13:46:34 in reply to 27 [link] [source]
... the semicolon you found might be in the middle of a string literal or a trigger.
or indeed inside an object name or inside a comment.
I had to make an SQLite-like SQL parser for the SQLiteSpeed project long ago and the semi-colon test-case was very similar to Larry's example, only also inside a trigger definition with commented out semi-colons both EOL comments (-- ..;. EOL) and in-line comments (/* ..;. */) and so on.
It seems such a simple rule, but parsing it can get hairy if tried straight from the text.
(29) By Richard Hipp (drh) on 2020-12-20 14:07:51 in reply to 28 [link] [source]
parsing it can get hairy
That's why there is the sqlite3_complete() interface! The sqlite3_complete() function takes care of all the messy details for you and lets you know, whether or not the semicolon you found is the end of an SQL statement, or if it is embedded in the middle of an identifier or string literal or trigger.
(30) By Ryan Smith (cuz) on 2020-12-20 14:20:20 in reply to 29 [link] [source]
Well exactly - and a magnificent API it is.
Perhaps my intent wasn't clear - I posted that specifically to discourage self-parsing of the SQL when such marvelous API's exist.
As an aside...
The reason why I know the trouble of doing it the other way, is that when I started doing work on said project, that was now more than 10 years ago (how time flies!) and at the time, this API was very much not available yet (at least, I believe it wasn't, right? - else I'm just a masochist). :)
(32) By Richard Hipp (drh) on 2020-12-20 15:05:28 in reply to 30 [link] [source]
this API was very much not available yet
The sqlite3_complete() interface (or its pre-version-3 incarnation of "sqlite_complete()") has been available since the very first check-in of SQLite code on 2000-05-29. It's one of the first things I wrote, as it is important for the operation of the CLI.
Perhaps it has not been sufficiently publicized...
(31) By anonymous on 2020-12-20 14:46:27 in reply to 29 [link] [source]
Time to take a look at sqlite3_complete.
Question: sqlite3_step appears not to use sqlite3_complete. It appears to parse using each discreet word in the SQL string incrementally until it either fails to parse or hits a semi-colon. Is this correct?
(33) By Keith Medcalf (kmedcalf) on 2020-12-20 18:23:09 in reply to 31 [link] [source]
sqlite3_step does not use sqlite3_complete nor does it parse anything.
sqlite3_step causes the VDBE program created by sqlite3_prepare* to execute to the point at which the next row of output is available, or the execution of the program completes -- or perhaps returns an error if the program execution detects an error condition.
sqlite3_prepare_v2 is the thing which parses SQL statements and outputs VDBE programs to be executed by sqlite3_step. It returns it own error returns indicating if a problem was detected in preparation of the VDBE program.
They are quite separate and distinct things. If sqlite3_prepare_v2 returns an error, then you have nought to be executing so the question is like saying "so if the house falls down then the store was out of green paint". It rests on a false premise.
It is hard to tell if your real underlying problem is merely a failure to check error returns because you appear to be getting to somewhere that should be impossible for a rational person to get to -- or you are continually mistaking sqlite3_step for sqlite3_prepare_v2 -- and it is very difficult to tell what your problem is because of that.
(34) By anonymous on 2020-12-20 23:26:00 in reply to 33 [link] [source]
or you are continually mistaking sqlite3_step for sqlite3_prepare_v2
True - my mistake. Now sorted.
(35) By Keith Medcalf (kmedcalf) on 2020-12-21 07:27:46 in reply to 34 [link] [source]
You also suffer from a false premise in that you somehow assume the multiple statements executed in a batch are independent and that a failure in one of them means that you should just carry on with the rest of the batch.
This is foolish and likely to lead to significant problems. When a series of statements are submitted to be executed "side by each" as the newfy's say, then when the second one of the sequence fails then those following after that are obviously not going to work as intended?
So if a "batch of statements" are submitted is not the fact that a syntax error is detected somewhere in the batch properly an indication that the batch is in error? Why on earth would you want to ignore errors in a non-interactive program?
Should not the executor of the batch of commands say "Hold on there matey, the engine did not start so the rest of your instructions cannot be executed until you fix that error condition"? There is no point starting the engine and steering to the open sea if the docking lines could not be untied.
There may be exceptional situations in which someone batches together a bunch of non-dependent commands in sequence, but that is probably a very rare exception and will never be the general rule.
(36) By anonymous on 2020-12-21 08:04:55 in reply to 35 [link] [source]
Your assumptions & you are entitled.
Now imagine this: the prepare/step/finalize loop reiterates through 10 queries.
Assume that queries 2 5 7 9 are faulty i.e. cannot be prepared.
My view is this:
1.logging queries 2 5 7 9 as faulty up front & in one go is a better response than logging query 2 is faulty, then query 5 is faulty, ... etc.
2.reiterating through all queries & executing ones that are valid means that 6 queries (1, 3, 4 6, 8 & 10 ) are processed. Stopping at 2 means that just 1 query is processed.
However, I concede that the merits of the 2 approaches is debatable & dependent on the context ... for practical purposes, the SQL scripts will have been tested/proven so the likelihood of failure because of syntax is minimal but runtime failure because of data remains.
If the task were to collate daily sales data from 10 community pharmacies stopping at 2 will discard data from 9 of them, completing the loop means that data from only 4 of them is discarded.
I think that it is the business logic tier rather than the data tier that has the final say.
(37) By Ryan Smith (cuz) on 2020-12-21 10:23:47 in reply to 36 [link] [source]
Again, you may be able to craft a set of queries that are independent, but that cannot shape engine behaviour.
1.logging queries 2 5 7 9 as faulty up front & in one go is a better response than logging query 2 is faulty, then query 5 is faulty, ... etc.
Imagine that query 1 in the list created a table and query 2 then needs to insert some values into that table - actually this is a very common way to start scripts on this forum intended to explain some bug report.
Your idea that we can "check" if query 2 will fail regardless of whether query 1 failed or not becomes very obviously flawed - The Insert would fail horribly, yet if query 1 did actually succeed, query two would be perfectly executable.
This is and always has been the premise of "ERROR" generation in all software I think, stop the moment we reach something that cannot be done (and perhaps cannot be handled by any of the provided levels of error-handling).
Even your example sends chills down my spine:
If the task were to collate daily sales data from 10 community pharmacies stopping at 2 will discard data from 9 of them, completing the loop means that data from only 4 of them is discarded.
What financial manger would OK this behaviour? Seeing a report that only includes financials from those business units where "the query succeeded for"...
That's Enron-level math.
Fix the whole thing first, and only then use it.
(38) By anonymous on 2020-12-21 12:48:15 in reply to 37 [link] [source]
Fix the whole thing first, and only then use it.
There is nothing to fix. The whole thing works to start with. And there is no practical way to envisage every single scenario of data that can ensue in real life.
I think that it is the business logic tier rather than the data tier that has the final say.