Trigger order help
(1) By pcreso on 2022-07-15 02:43:42
Hi,
I have a situation requiring multiple insert triggers on the same table.
The different triggers are required as each has a different "when" clause that determines whether it fires at runtime.
I have some triggers which address possible null values in an insert into a table. These will apply business rules to populate these null fields with values derived from previous records.
Once these rules have been applied, a trigger will act on the now completed record, to insert or update records in another table.
All these triggers are after insert triggers on the same table.
How can I ensure that the triggers to address nulls are run BEFORE the trigger to populate the other table?
Does the order they are created imply a consistent (robust) order to their execution?
I have done some testing, and it seems the last trigger created is run first (LIFO order) but can I be assured this will always be the case, or not?
If not, I can try to implement a single trigger with all the possible operations, as I am assuming these will be carried out in the order written in the trigger.
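The observed order can be checked with a quick experiment (a sketch using Python's sqlite3 module and hypothetical table/trigger names): two AFTER INSERT triggers each log their own name, and the log shows which fired first. Whatever it prints is an implementation detail, not a guarantee.

```python
import sqlite3

# Two AFTER INSERT triggers on the same table, each recording its own
# name. SQLite does not document the firing order, so the printed order
# should be treated as an observation, not a contract.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE t(x);
CREATE TABLE firing_log(trigger_name TEXT);
CREATE TRIGGER trg_a AFTER INSERT ON t BEGIN
  INSERT INTO firing_log VALUES ('trg_a');
END;
CREATE TRIGGER trg_b AFTER INSERT ON t BEGIN
  INSERT INTO firing_log VALUES ('trg_b');
END;
INSERT INTO t VALUES (1);
""")
order = [row[0] for row in con.execute("SELECT trigger_name FROM firing_log")]
print(order)  # both triggers fire exactly once; the order is unspecified
```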
Thanks,
Brent Wood
(2) By anonymous on 2022-07-15 09:20:36 in reply to 1
The trigger to populate other tables can have a condition that checks that the record has already been properly transformed to the desired state by the previous triggers before it fires, e.g. check that certain values are no longer NULL.
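A minimal sketch of what such a gated trigger might look like (hypothetical table and column names). The WHEN clause re-reads the row from the table rather than relying on NEW, so it sees any backfilling an earlier trigger has already done; if the value is still NULL, the trigger simply does not fire.

```python
import sqlite3

# Gate the second trigger on the row's current (stored) state, not on
# the values as originally inserted.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE gps_point(id INTEGER PRIMARY KEY, station_no INTEGER);
CREATE TABLE station(station_no INTEGER);

CREATE TRIGGER copy_station AFTER INSERT ON gps_point
WHEN (SELECT station_no FROM gps_point WHERE id = new.id) IS NOT NULL
BEGIN
  INSERT INTO station
    SELECT station_no FROM gps_point WHERE id = new.id;
END;
""")
con.execute("INSERT INTO gps_point VALUES (1, 42)")    # condition true: fires
con.execute("INSERT INTO gps_point VALUES (2, NULL)")  # condition false: skipped
rows = con.execute("SELECT station_no FROM station").fetchall()
print(rows)  # [(42,)]
```

As the next reply notes, the catch is that when the condition is false the trigger does nothing at all, rather than waiting for the value to arrive.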
(3) By Simon Slavin (slavin) on 2022-07-15 09:59:09 in reply to 1
As you suspect, it's not possible to figure out which order TRIGGERs will be triggered in. Also, as it's undocumented, it can change between different versions of SQLite.
How do you expect to be able to look up "values derived from previous records"? Each trigger can contain only INSERT/UPDATE/DELETE/SELECT commands. You can't pass a value between commands.
Perhaps you might provide an example.
(4) By pcreso on 2022-07-15 20:17:40 in reply to 2
Yep, but that just means the trigger will not fire, which is not what we require.
(5) By pcreso on 2022-07-15 21:08:28 in reply to 3
Thanks Simon, I'll explain the use case (there is a bit of detail, but you did ask :-)

We are viewing a live underwater camera at sea. QGIS is used to save GPS point & time to a spatialite table (called gps_points), also recording other fields, including transect no & point_type, and data about what can be seen on the video. The point_type values available include an "s", identifying the point at the start of the transect, and an "f", defining the point at the end of the transect; there are a few others.

An after insert trigger on the gps_points table tests the point_type (when ...); if "s", it inserts a new record in the main station (transect) table, populating the start fields. Another after insert trigger tests for an "f" record, and if so it updates the station table, populating the finish fields. This works well, and the order the triggers fire is immaterial: the user will always enter an "s" record before an "f" one, which takes care of the order in which they will actually do anything.

To simplify the workflow, we have been asked whether the system can auto-populate some of the fields in the gps_points table. For example, if the scallop_count field is null, populate it with 0 (done with a default in the table create statement). Also, to avoid having to repetitively enter fields such as the station_no every time, auto-populate this with the last recorded station_no. This can be done as a simple update sql in an after insert trigger (the example you asked about):

  update gps_point
  set station_no = (select station_no from gps_point
                    where station_no not null
                    order by datetime(gps_time) desc limit 1)
  where station_no is null
    and datetime(gps_time) = datetime(new.gps_time);

The where clause ensures that only the current record will be modified, and only when the appropriate field is null.
This approach can apply to several fields, and will avoid lots of typing, hence potential typos, at a time when the user is already busy; it will really help simplify the workflow and improve data quality. An after insert trigger can do this very well (I don't think a before insert trigger can be used to set the values to insert where they are null, but I could be wrong?), but it needs to fire before the trigger that writes to the station table, so the station table gets the filled-in values from the gps_point table, not the record with nulls that will be filled in afterwards if the triggers fire in the wrong order.

I can work around this by having one trigger populate the nulls in the gps_point table, then have case statements in the triggers that populate the station table that also populate the nulls, so they will fix the data when they fire if the other trigger has not already done so. But I'm hoping (not expecting!) a more elegant approach with sequential triggers is possible. Your reply suggests this is not.

Does this clarify the situation? We are developing a few sqlite3 (spatialite) databases using triggers to improve field data capture systems. It has proven very successful to date.

Thanks for your advice, it is appreciated!!

Brent Wood
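The backfill trigger described above can be sketched end to end (using Python's sqlite3 module; the trigger name and the minimal two-column schema are illustrative): when a row arrives with a NULL station_no, the trigger copies in the most recent non-NULL one.

```python
import sqlite3

# Backfill a NULL station_no from the latest record that has one,
# using the UPDATE from the post wrapped in an AFTER INSERT trigger.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE gps_point(gps_time TEXT, station_no INTEGER);

CREATE TRIGGER fill_station AFTER INSERT ON gps_point
WHEN new.station_no IS NULL
BEGIN
  UPDATE gps_point
  SET station_no = (SELECT station_no FROM gps_point
                    WHERE station_no NOT NULL
                    ORDER BY datetime(gps_time) DESC LIMIT 1)
  WHERE station_no IS NULL
    AND datetime(gps_time) = datetime(new.gps_time);
END;
""")
con.execute("INSERT INTO gps_point VALUES ('2022-07-15 02:00:00', 7)")
con.execute("INSERT INTO gps_point VALUES ('2022-07-15 02:05:00', NULL)")
rows = con.execute(
    "SELECT station_no FROM gps_point ORDER BY gps_time").fetchall()
print(rows)  # [(7,), (7,)] -- the NULL was filled from the previous record
```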
(6) By Simon Slavin (slavin) on 2022-07-16 02:09:47 in reply to 5
A wonderfully clear explanation.
So your problem is this. A new entry in the gps_points table may or may not create a new entry in the transect table. But you may then want to go back and modify the gps_points table from values in the transect table. And you can't do that if you can't depend on the entry in the transect table having already been created.
I don't think you can usefully do this with separate TRIGGERs. So I see two solutions.
Instead of multiple TRIGGERs, make one TRIGGER which has multiple SQL commands inside the BEGIN / END structure. Those will always be executed in the order you specified. Clever use of an appropriate primary key and INSERT OR IGNORE should allow you to make a new row only when you want to.
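A sketch of this single-trigger approach (hypothetical, cut-down schema): the backfill UPDATE comes first in the BEGIN/END block, then the station table is touched, and INSERT OR IGNORE against the primary key makes the station insert a no-op when the row already exists.

```python
import sqlite3

# One trigger, statements executed in the order written: backfill
# first, station table second.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE gps_point(id INTEGER PRIMARY KEY, point_type TEXT,
                       station_no INTEGER, scallop_count INTEGER);
CREATE TABLE station(station_no INTEGER PRIMARY KEY, start_id INTEGER);

CREATE TRIGGER on_gps_insert AFTER INSERT ON gps_point BEGIN
  -- Step 1: fill NULLs in the row just inserted.
  UPDATE gps_point SET scallop_count = 0
  WHERE id = new.id AND scallop_count IS NULL;
  -- Step 2: only now write the station row; OR IGNORE means an
  -- existing station_no (the primary key) leaves the table untouched.
  INSERT OR IGNORE INTO station
    SELECT station_no, id FROM gps_point
    WHERE id = new.id AND point_type = 's';
END;
""")
con.execute("INSERT INTO gps_point VALUES (1, 's', 10, NULL)")
counts = con.execute("SELECT scallop_count FROM gps_point").fetchall()
stations = con.execute("SELECT * FROM station").fetchall()
print(counts)    # [(0,)]  -- NULL was backfilled before step 2 ran
print(stations)  # [(10, 1)]
```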
Alternatively, don't try to do this with TRIGGERs. Instead, create a small program your users can use to process the data you collected. This program creates the tables that they will actually read. One of the tables might look like a copy of the gps_points table, but have extra data in. You can, of course, write whatever logic you want in that code.
One approach frees your users from having to run another (small, fast) program to process your data.
The other approach can do arbitrarily complicated things with your data, and can easily be changed at a later date, after the data was collected. In other words, if your users decide they want more/different processing, or if you notice a bug, you can run the corrected logic on all your old datasets instead of having to figure out how to trigger the TRIGGERs again.
Since I'm thinking about your problem anyway, I'll add a recommendation which applies however you solve the above problem. Leave your gps_points table with just whatever data was collected. Don't mess with it in any TRIGGERs or any program used afterwards.
When you do the processing your users want, do it (using TRIGGERs or another program) by creating new rows in one or more other tables. Copy across the values which were collected, and also set new values in new columns. Not only will this make debugging easier, but it will allow for a clear understanding of which data were automatically collected, and which were calculated from them. This simplifies your documentation, and also simplifies the writeup the (presumably) scientists have to make about how to reproduce their procedures. The drawback is that the database is bigger.
Hope that makes it worthwhile figuring out the long text you posted. Good luck with the work.
(7) By pcreso on 2022-07-16 05:01:02 in reply to 6
Thanks Simon, I have it working nicely with multiple triggers. The (commented) sql file to create the tables & triggers is about 350 lines long... got a bit interesting.

The approach I have taken is to have one trigger updating the missing values in the gps_points table. This is straightforward. The other triggers, which insert & update records in the station table with data from the gps_points table, have a case statement in the insert/update for each column which may be null. If not null, just use the new value - in this case it has probably already been updated by the other trigger. If null, the case statement uses the same sql as in the other trigger to replace the null with the correct value to write to the second table; this allows for the situation where the other trigger has not yet been invoked. So both tables get the correct result irrespective of the order the triggers are fired. A bit of redundancy, but a robust solution. Performance is near enough to instantaneous from a user perspective. QGIS works brilliantly as the UI, spatialite as a pretty intelligent back end with triggers implementing the business rules.

One additional bit of complexity that works really well... There is a trigger_config table with one record per trigger, which stores a value that a trigger can use, as well as a flag determining whether the particular trigger is enabled. So the one database can be set up with different triggers, which can be turned off or on as each particular use case requires. Instead of maintaining a dozen custom databases, each for a different type of survey, one (more complex but more flexible) database can be set up to support all use cases.

One example - a given survey may have a transect 0.2nm long. A trigger can generate a 0.2nm circle (buffer) around the start point. QGIS automatically refreshes this table in the map, so the user can see the vessel position (GPS) on the map, and simply by saving a start point, they have a range ring around that position to show when the required distance has been traversed. Different surveys can store a different distance in the config table, and disable the trigger if it is not relevant in a particular survey. The config table can be opened and edited in QGIS to control the system as required...

The combination of QGIS with a USB gps & spatialite with triggers is pretty powerful. Thanks for your insights, they helped me decide how best to implement this functionality.

Brent
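The trigger_config idea can be sketched like this (hypothetical names throughout; the real trigger would build a spatial buffer, which is simplified here to recording a radius): each trigger's WHEN clause reads its own enabled flag, so one schema serves surveys that want different behaviour.

```python
import sqlite3

# A config table gates a trigger: flip 'enabled' to turn the
# behaviour on or off without touching the trigger definition.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE trigger_config(trigger_name TEXT PRIMARY KEY,
                            enabled INTEGER, value REAL);
INSERT INTO trigger_config VALUES ('make_range_ring', 0, 0.2);

CREATE TABLE gps_point(id INTEGER PRIMARY KEY);
CREATE TABLE range_ring(gps_id INTEGER, radius_nm REAL);

CREATE TRIGGER make_range_ring AFTER INSERT ON gps_point
WHEN (SELECT enabled FROM trigger_config
      WHERE trigger_name = 'make_range_ring') = 1
BEGIN
  INSERT INTO range_ring
    SELECT new.id, value FROM trigger_config
    WHERE trigger_name = 'make_range_ring';
END;
""")
con.execute("INSERT INTO gps_point VALUES (1)")      # disabled: no ring
con.execute("UPDATE trigger_config SET enabled = 1")
con.execute("INSERT INTO gps_point VALUES (2)")      # enabled: ring created
rings = con.execute("SELECT * FROM range_ring").fetchall()
print(rings)  # [(2, 0.2)]
```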