SQLite User Forum

Possible bug about changes() behavior after a CTAS
Login

Possible bug about changes() behavior after a CTAS

(1) By Steve Estes (Denzera) on 2024-11-11 16:29:06 [source]

Background: So I'm trying to write a SQL script I can execute through Python's interface that is UPDATE- and INSERT-heavy, and I'd like to get a log of the changes made each step of the way, for auditing purposes. Through the CLI, the behavior of ".changes on" is perfectly satisfactory for this purpose, because if I then do .read script.sql, every SQL command's result, in terms of rows affected, is sent to stdout. Likewise, if I run a SQL script that starts with .changes on through the command line, i.e. $ sqlite mydb.db < script.sql, it works and echoes each statement's result to stdout. But in Python, cursor.executescript() requires all statements be SQL statements (dot-commands will error), and it has no means to pass any sort of output-control dot-command to affect its connection. OK, arguably a Python deficiency, but SQLite offers a workaround! I can just use the built-in changes() function to echo the updated or inserted rows after each statement, e.g.:

SELECT concat('Updated rows: ', changes());

This mostly works, except...

Expected behavior:

Per the documentation, changes() should return "the number of database rows that were changed or inserted or deleted by the most recently completed INSERT, DELETE, or UPDATE statement".

If I run a CREATE TABLE AS SELECT (...) statement, it is functionally equivalent to a CREATE TABLE followed by an INSERT. I would expect changes() to return the number of rows inserted, which is what the equivalents in e.g. PostgreSQL do (though in some implementations it's considered a SELECT result, despite the INSERT behavior / effects).

Actual behavior:

After a CREATE TABLE AS SELECT, the changes() function returns 0.

I suggest that this is an easily-correctable oversight - the changes() function should return the number of rows inserted by the CTAS statement, if called immediately afterwards.

Please advise if I'm misunderstanding something, but if not, this might be a bug.

Thanks much!

(2) By Aask (AAsk1902) on 2024-11-11 20:59:28 in reply to 1 [link] [source]

"the number of database rows that were changed or inserted or deleted by the most recently completed INSERT, DELETE, or UPDATE statement".

Your CTAS is not completed if/when you invoke changes() within the same query/sql.

If you invoke changes() immediately after your CTAS completes, it will return the number of rows inserted; see example below:

sqlite> drop table if exists tbltry;
sqlite> create table tbltry (name, value);
sqlite> /* Insert 3 rows */
sqlite> insert into tbltry values('Jan',1),('Feb',2),('Mar',3);
sqlite> /* Update 1 row */
sqlite> update tbltry set value = value * 100 where name = 'Feb' returning changes();
changes()
---------
3
sqlite> /* Expected 1 (only one row got updated) but got 3 as count of changes ... relates to the completed statement that made 3 inserts*/
sqlite> /* need to re-query changes() */
sqlite> select changes();
changes()
---------
1
sqlite>

(3) By Richard Hipp (drh) on 2024-11-11 21:19:53 in reply to 1 [link] [source]

The documentation says:

"... the number of rows modified by the most recent INSERT, UPDATE or DELETE ...."

And

"Only changes made directly by the INSERT, UPDATE or DELETE statement are considered...."

Your CREATE TABLE AS SELECT statement is not an INSERT, UPDATE, nor DELETE statement. It is a CREATE statement. And hence a strict reading of the documentation is that the rows added to the new table should not be counted.

I considered modifying the behavior to also count rows added by CREATE TABLE AS SELECT (see the branch chngcnt-create-as) but I then thought better of it. The documentation is clear about the current behaviro and the current behavior has been in the code for multiple decades. Who knows how many legacy applications will break if I change it, even if the change is considered to be an improvement. So, instead, I clarified the behavior with a new sentence in the documentation that will appear in the documentation at the next release.

(5) By Steve Estes (Denzera) on 2024-11-13 14:56:07 in reply to 3 [link] [source]

Thanks Richard. Yes, your latter post is a much more succinct statement / summary of the problem. Sorry to have been so wordy.

I respect your reasoning on it. The reason I thought it to be more of a bug than a documentation oversight is that in the current behavior, there is no way of getting a report back for how many rows were inserted by a CTAS statement. At least not from among the built-in functions. If changes() doesn't tell you, and there's no automatic console echo the way there is on (e.g.) Postgres, then it's essentially a silent operation, and only a post-hoc SELECT count(*) ... can really answer such a question (which is likely far more computationally intensive than what I presume would be a mere read-out from the INSERT portion of the CTAS). I understand the intent of the changes() function to be to enumerate the rows affected by the previous operation, if that operation wrote data in any way. Analogous to the "affected rows" functions in other RDBMSes. Which I why I felt it was incomplete in this respect.

You know far, far better than I do how many applications might be using this function in varying ways, but I would offer: because the current functionality is essentially non-functionality on the part of the changes() function, because it provides no information in that usage, there is probably no conceivable use case where someone is relying on the output of that function being and remaining 0 when called after a CTAS statement. So I would say not only would the change be an improvement, but that also, nobody out there in the vast universe of SQLite users would disagree and say it is a negative. You rightly give great weight to the latter consideration, but in this case it might yet clear that bar.

Hope that's worth chewing on. Thanks again!

(4) By Richard Hipp (drh) on 2024-11-11 21:22:19 in reply to 1 [link] [source]

The original post did not clearly state the problem. I interpreted the complaint in the original post to be that the SELECT statement in the following code returns 0 instead of 3:

CREATE TABLE t1 AS SELECT 1 AS x UNION ALL SELECT 2 UNION ALL SELECT 3;
SELECT changes();