Inconsistent values in trigger from UPSERT
(1) By anonymous on 2024-04-11 20:39:45 [source]
Let's say I have a table of some redundant BLOBs and now I want to deduplicate them. I create a new table t2
with a unique hash field and a trigger to detect hash collisions, however unlikely they may be:
CREATE TABLE t1 (id INTEGER PRIMARY KEY, hash BLOB NOT NULL, data BLOB NOT NULL);
INSERT INTO t1 VALUES(1,unhex(format('31%.62c', '0')),format('%.36760c', 'a'));
INSERT INTO t1 VALUES(2,unhex(format('31%.62c', '0')),format('%.36760c', 'a'));
INSERT INTO t1 VALUES(3,unhex(format('32%.62c', '0')),format('%.36760c', 'b'));
INSERT INTO t1 VALUES(4,unhex(format('32%.62c', '0')),format('%.36760c', 'b'));
CREATE TABLE t2 (id INTEGER PRIMARY KEY, hash BLOB UNIQUE NOT NULL, data BLOB NOT NULL);
CREATE TABLE collisions (oldhash, newhash, old, new);
CREATE TRIGGER t2_collision BEFORE UPDATE OF data ON t2
WHEN old.hash = new.hash AND old.data != new.data
BEGIN
INSERT INTO collisions (oldhash, newhash, old, new) VALUES (old.hash, new.hash, old.data, new.data);
END;
Now I try to move the data to the new table:
INSERT INTO t2 (hash, data) SELECT hash, data FROM t1 WHERE true ON CONFLICT (hash) DO UPDATE SET data = excluded.data;
SELECT * FROM collisions;
Surprisingly, I notice that while the data has been moved, there is a single collision: the old
field consists of 4008 a
characters followed by 32752 b
characters (no such value is present in the original t1
table), while the new
field is 36760 b
characters, and both hash fields contain hashes for the new
value.
I've tested this on fiddle (3.46.0 2024-04-08 11:50:07) and my local build (3.45.1). The counts are slightly different on my local build (there's 451 b
s instead of 4008) but the behavior is otherwise consistent.
I'm not sure why this is happening. Is it expected behavior? Am I misunderstanding something?
(2.1) By Richard Hipp (drh) on 2024-04-11 23:42:19 edited from 2.0 in reply to 1 [link] [source]
UPSERT is implements as a special kind of transient TRIGGER. So in order
for an UPSERT to fire a TRIGGER, you have to say:
PRAGMA recursive_triggers=ON;
Never mind the above. I was doing something wrong, apparently...
(3) By anonymous on 2024-04-11 23:56:28 in reply to 2.1 [link] [source]
Just in case my initial post wasn't clear: there are no collisions in the test data (t1
) so the trigger should never fire, but according to the content of the collisions
table it did fire once with the old.data
field set to a value that wasn't present in any of the tables.
(4) By Richard Hipp (drh) on 2024-04-12 01:39:13 in reply to 3 [link] [source]
This seems to be a malfunction in UPSERT. The error goes back to version 3.24.0 (2018-06-04) when UPSERT was first added. I'm trying to isolate the cause of the problem now. Probably it will be fixed (on trunk at least) within a day or two.
(5) By Richard Hipp (drh) on 2024-04-12 15:53:16 in reply to 1 [link] [source]
This problem should now be resolved on the latest trunk check-ins and on the latest check-ins for branch-3.45.
(6) By anonymous on 2024-04-12 16:26:57 in reply to 5 [link] [source]
Thank you for the quick fix. I can confirm the issue no longer occurs on the test data and my actual DB.