How should RETURNING work with a cascading delete?

(1) By Richard Hipp (drh) on 2021-01-31 18:22:00 [link]

Consider the following hypothetical SQL:

>
~~~
PRAGMA foreign_keys=on;
CREATE TABLE t1(
  a INTEGER PRIMARY KEY,
  b INTEGER REFERENCES t1 ON DELETE CASCADE
);
INSERT INTO t1(a,b) VALUES(0,null),(1,0),(2,1),(3,2),(4,null);
DELETE FROM t1 WHERE a=0 RETURNING a;
~~~

The final DELETE should delete rows 0, 1, 2, and 3 due to the
ON DELETE CASCADE clause, and indeed it does.  

## Question:

Should the RETURNING clause return all four deleted rows, or
should it only return the specified row 0?

## Reasons to return all four rows:

  1.  If you have a RETURNING clause, presumably you want to see every
      row that has been deleted, regardless of whether or not that row
      was deleted directly or via a cascading delete.

  2.  If you don't return all rows, then the result of RETURNING can
      vary according to the order of the delete.  If the final statement
      is "DELETE FROM t1 WHERE a IN (0,1,2,3) RETURNING a;" then you
      might get "0 1 2 3" or you might get "0" or you might get something
      in between depending on the delete order and the timing of the
      cascading delete.

  3.  Triggers and other side effects fire on cascading deletes.
      Why should RETURNING be any different?

## Reasons to return just the one row 0:

  1.  The RETURNING clause cannot report on cascading deletes in
      other tables, since the argument to RETURNING must refer to columns
      in the primary table.  So if cascading deletes cannot reported in
      all tables, why report them in the primary table?  Shouldn't all
      tables be treated the same?

  2.  PostgreSQL 9.6 works this way.

(2) By Larry Brasfield (LarryBrasfield) on 2021-01-31 20:44:04 in reply to 1 [link]

(Speaking for myself only:)

The cascading delete case you present is an outlier. In the usual case, cascaded deletions will happen in an arbitrary set of tables, which only by coincidence includes the origination of the cascade. I see no reason that it should be made easier to get the cascaded results only for that one somewhat special case when so many other cascaded results are unavailable that way. Most foreign keys reference **other** tables. (I suspect that is why they are named "FOREIGN".)  Of course, this is just reiterating your point, "Shouldn't all tables be treated the same?" Yes, it is best that they are, to ease human conceptualization.

The consequence, "you might get something in between depending on the delete order and the timing", gives me a shuddering dread. SQL is best considered as a way of specifying set operations.  That "delete order" and "timing" dependence is a monkey wrench thrown into that eminently useful conceptualization.

(3) By Ryan Smith (cuz) on 2021-02-01 00:16:55 in reply to 1 [link]

Definitely the latter.

The main reason a programmer needs RETURNING is that he/she can ascertain which rows he/she caused to be added/deleted/touched directly and not as a consequence of a FK or TRIGGER or other Schemata, that is for the DB Schema's concern, and if perchance a programmer IS interested in all effects, that is the complicated case and then more queries are warranted.

The latter case is how it works in *most* engines and is expected to work *in the general case* (I can't speak for ALL engines or ALL expectations of course).



PS: It's very exciting seeing the question asked. :)

(4) By anonymous on 2021-02-01 07:11:25 in reply to 1 [link]

Prefer

>If you have a RETURNING clause, presumably you want to see every row that has been deleted, regardless of whether or not that row was deleted directly or via a cascading delete.

This gives me the option to <i>undo</i>.

(5) By anonymous on 2021-02-01 07:48:47 in reply to 1 [link]

Maybe:

> PRAGMA returning_indirectly_touched_rows=yes/no

?

(7) By Richard Hipp (drh) on 2021-02-01 12:52:15 in reply to 5 [link]

I was thinking something like:

>      ... RETURNING *; -- Postgres behavior

Versus:

>      ... RETURNING RECURSIVE *;  -- Also reports cascading changes

In other words, add the RECURSIVE keyword after RETURNING to cause
foreign-key updates to the same table to be reported, and use the 
traditional syntax without RECURSIVE to report only the rows that
are directly modified and omit changed by foreign key
constraints or auxiliary triggers.

This remains an option for the future.  For now, at least, the behavior
of SQLite has been shifted to align with Postgres.  Such was the
recommendation of the Postgres devs, in addition to the majority
recommendation of the responses on this forum thread.

(9) By ddevienne on 2021-02-01 13:09:17 in reply to 7

An opt-in way to get the more deterministic but less compatible *full side-effets* (in the same table) sounds great, as a future extension. The best of both worlds.

(6) By Holger J (holgerj) on 2021-02-01 12:14:30 in reply to 1 [link]

Do it the same way PostgreSQL does.

(8) By ddevienne on 2021-02-01 13:06:52 in reply to 1 [link]

I've wished for a way to find out the effects of CASCADE many times,  
but across the board, in any table, not just for self-referential FKs.

So I agree with others, that **direct effets** should be returned only,  
and that *same-table CASCASE side-effets* are out-of-bounds,  
for compatibility with PostgreSQL (and likely other RDBMSs).

Regarding your point 1.2, I guess that implies that the CASCADE is applied  
*depth-first* per-deleted-row, rather than accumulated for *post-processing*  
(kinda like a *breath-first delete*). That's unfortunate, and of course an  
implementation detail I guess. Could be a nasty *gotcha* indeed.

Still, compatibility matters more IMHO.  
The doc could just mention that pathological case perhaps.

BTW, option #1 would violate what [changes][1] returns, no?  
So that clearly points to option#2 IMHO, for internal SQLite consistency.

Also, what would be allowed after `RETURNING`? Any select-clause *expr*?  
Can I `RETURNING COUNT(*)` to *emulate* `sqlite3_changes()` for example?

[1]: https://sqlite.org/c3ref/changes.html

(10) By anonymous on 2021-02-01 15:43:35 in reply to 8 [link]

> I've wished for a way to find out the effects of CASCADE many times,
> but across the board, in any table, not just for self-referential FKs.

Perhaps [the session extension](https://sqlite.org/sessionintro.html) or [the pre-update hook](https://sqlite.org/c3ref/preupdate_count.html) could be useful?