Conditions/vars in triggers possibility or stored procedures, on trees example

(1) By Vadim Goncharov (nuclight) on 2022-05-08 17:40:44 [source]

Lazily investigating alternative approach of storing trees to my prior reflections, I've found an article with combined scheme to storing trees https://www.depesz.com/2008/04/11/my-take-on-trees-in-sql/ which tries to simplify things by avoiding CTEs, and it uses PostgreSQL triggers to build helper data. Not that I'm quite interested in exactly this way of storing tree in SQL, but it arises interesting example of what trigger capabilities are.

Some of these triggers could probably be repeated in SQLite by just splitting one trigger to several triggers by using WHEN clause. But others has more complex conditions, even with variables - I don't have any clue is that doable in SQLite at all.

I know that SQLite does not have stored procedures because, as it is embedded database, application can register functions in application's language to run in same process with DB. But for triggers (and may be other standalone functions as well) here recursion problem arises: application calls SQL which calls application function which, in turn, call SQL again - is it possible? Won't we'll run into a problems with transactions, isolation, etc?¹

If it's possible, please show how would this triggers will look in SQLite with app's functions.

If it is not possible without stored procedures, then I vote up for TH1 suggestion as a language for stored procedures - because it is already written and tested in Fossil. And, much more importantly, because this is a PR/advocacy for Tcl which deserve to be more known, and SQLite popularity may do good here.

^{^} BTW have the same question for possible FTS compress() function implementation - is it possible to do SQL query inside such function to get dictionary from same DB?

(2) By M. Roller (MRoller) on 2022-05-08 22:22:52 in reply to 1 [link] [source]

You can implement trees in the style of depez in sqlite using triggers. He defines a second table containing all nodes related as ancestor and descendants, which is usually called the closure of the adjacency relation. This table can be managed by triggers listening to updates on the adjacency table.

Here is a sketch.

drop table if exists Tree;
create table Tree 
(
  Child not null
  , Parent 
  , constraint PK primary key (Child)
);


drop table if exists TreeClosure;
create table TreeClosure
(
  Descendant 
  , Ancestor 
  , Length
  , constraint PK primary key(Descendant, Ancestor)
);


-- Insert
drop trigger if exists TreeClosure_Insert_Leaf;
create trigger TreeClosure_Insert_Leaf
before insert on Tree
begin 

  select raise(FAIL, 'New parent must be a member of the tree.')
  where new.Parent is not null
  and not exists 
  (
    select * from Tree
    where Child = new.Parent
  );
  
  select raise(FAIL, 'Tree can have at most one root.')
  where new.Parent is null 
  and exists
  (
    select * from Tree where Parent is null
  );

  select raise(FAIL, 'New child must not be NULL.')
  where new.Child is null;
  
  select raise(FAIL, 'New child must not be a member of the tree.')  
  where exists 
  (
    select * from Tree
    where Child = new.Child
  );  

  insert into TreeClosure (Descendant , Ancestor , Length)
  values (new.Child , new.Child, 0)
  union all
  select new.Child , TC.Ancestor , TC.Length + 1
  from TreeClosure TC 
  where TC.Descendant = new.Parent;
  
end;  


-- Update
drop trigger if exists TreeClosure_Update_Parent;
create trigger TreeClosure_Update_Parent
before Update on Tree
begin 
  
  select raise(FAIL, 'Child can''t be updated.')
  where old.Child <> new.Child ;
 
  select raise(FAIL, 'Root can''t be updated.')
  where old.Parent is null;  
  
  select raise(FAIL, 'New parent must be a node.')
  where not exists 
  (
    select * from Tree
    where child = new.Parent 
  );
  
  select raise(FAIL, 'New parent can''t be a descendant of old child.')
  where exists 
  (
    select * from TreeClosure
    where Descendant = new.Parent and Ancestor = new.Child
  );    
  
  -- delete all paths containing the old edge
  delete from TreeClosure 
  where Descendant in 
  (
    select Descendant from TreeClosure where Ancestor = old.Child
  )  
  and Ancestor in
  (
    select Ancestor from TreeClosure where Descendant = old.Parent
  )
  ;

  -- insert all paths containing the new edge
  insert into TreeClosure(Descendant, Ancestor, Length) 
  select TC1.Descendant, TC2.Ancestor, TC1.Length + 1 + TC2.Length
  from TreeClosure TC1, TreeClosure TC2
  where TC1.Ancestor = new.Child
    and TC2.Descendant = new.Parent
  ;

end;


-- Delete
drop trigger if exists TreeClosure_Delete_Leaf;
create trigger TreeClosure_Delete_Leaf
before delete on Tree
begin 

  select raise(FAIL, 'Deleted edge must be a leaf.')
  where exists 
  (
    select * from Tree
    where Parent = old.Child
  );
  
  delete from TreeClosure 
  where  Descendant = old.Child;
  
end;

As far as I know the second argument of raise must be string literal, otherwise the error messages could be more specific.

(6) By anonymous on 2022-05-20 17:33:30 in reply to 2 [link] [source]

I've tried your sketch and it seems it is nearly not the same as depesz's one. First of all, there is no id in Tree table and Child is not clear how it should be used. Just a fresh (I've added one column to Tree)

INSERT INTO Tree (child, parent, codename) VALUES (1, 1, 'a');

immediately failed with

New parent must be a member of the tree

so it is not clear how to even begin to populate this tree. Also, depesz's article contains variants like "Moving some child object to become top-level object" and from text of triggers it seems multiple roots are not supported.

(7.1) By M. Roller (MRoller) on 2022-05-20 20:27:08 edited from 7.0 in reply to 6 [link] [source]

Yes, you can have more columns for each node. For the purpose of the sketch you could also use codenames as ID. The table Tree in the sketch just models the adjacency relation and nothing else.

The triggers will guarantee that you have a tree at all times. So the only node you can insert initially is the root.

insert into Tree(Child,Parent) values ('Root', null);

Once you have that, you can insert more children:

insert into Tree(Child,Parent) values ('1', 'Root');
insert into Tree(Child,Parent) values ('2', 'Root');
insert into Tree(Child,Parent) values ('1.1', '1');

The INSERT-trigger basically says: a new node must be attached to the existing tree, so the tree stays connected and has no circles.

Multiple roots are not allowed, because a tree is connected. You could also model a forrest, but then you need different triggers.

The UPDATE-trigger only allows to detach a child and attach it to another node. If you want to do things like add a new root as parent of root, you need to do two updates, and the adjacency half way between the two updates would no longer be a tree. So I don't think you could handle that with triggers.

EDIT (after a little bit more thought): Remove the check 'Tree can have at most one root.' from the INSERT-Trigger, and the check 'Root can''t be updated.' from the UPDATE-trigger. Now the adjacency relation models a forest, and you can attach the root of one tree to a node of another.

(3) By MBL (UserMBL) on 2022-05-09 07:48:55 in reply to 1 [link] [source]

I know that SQLite does not have stored procedures because, as it is embedded database, application can register functions in application's language to run in same process with DB. But for triggers (and may be other standalone functions as well) here recursion problem arises: application calls SQL which calls application function which, in turn, call SQL again - is it possible? Won't we'll run into a problems with transactions, isolation, etc?

is it possible to do SQL query inside such function to get dictionary from same DB?

Check C function context_db_handle how to use the db handle from inside of an extension function call's body.

I used the db from the extension function call's context to do several SQL statements and this works within the same transaction. My answer to both questions is YES, this is possible.

See my forum post in full length. The function 'call' takes the parameters and stores them with an 'insert' into the 'stack' table, which causes the triggers to be fired. The 'result', 'argc' and 'argv' functions are just helper functions to get the values out of the 'par#' columns of the record which is on top of the stack (the last inserted one) respective to fill the 'result' column of the record on top of the stack.

In my application the stack itself is part of my whole implementation - nowadays I probably would implement this as an attached in-memory database.

In case I gain some free time and have nothing else to do I may work on a table valued implementation using Lua... My experience with Lua is already very good in using it for simple function calls (with just one return value)...

See also forum post 92fb1fa2e8 from about 2 years ago with a similar question.

(4) By Vadim Goncharov (nuclight) on 2022-05-09 14:39:06 in reply to 3 [link] [source]

(as for your triggers version post above, I'll check that near to weekend)

Check C function context_db_handle how to use the db handle from inside of an extension function call's body.

I used the db from the extension function call's context to do several SQL statements and this works within the same transaction. My answer to both questions is YES, this is possible.

Well, good to see it's possible in C. However,

I will use Perl, and DBD::SQLite module does not seem to export this function :( Wonder if some other scripting language's bindings do.
still, for triggers it tend to be fragile if DB is not opened via native application with it's functions...

See my forum post in full length. The function 'call' takes the parameters and stores them with an 'insert' into the 'stack' table, which causes the triggers to be fired. The 'result', 'argc' and 'argv' functions are just helper functions to get the values out of the 'par#' columns of the record which is on top of the stack (the last inserted one) respective to fill the 'result' column of the record on top of the stack.

Well, I've read all thread there (and replied to some), but did not understood how your method works :/ And being so complex means it's hard to be done properly and fragility increases...

In case I gain some free time and have nothing else to do I may work on a table valued implementation using Lua... My experience with Lua is already very good in using it for simple function calls (with just one return value)...

Given that post's thread, it's unlikely for Lua to be imported to SQLite, thus it's impractical, and I'd vote for TH1, if not full Tcl (which also has more elegant language idea than Lua).

See also forum post 92fb1fa2e8 from about 2 years ago with a similar question.

OK, not quit understood, so asked a question there.

(5) By Donal Fellows (dkfellows) on 2022-05-10 13:44:14 in reply to 4 [link] [source]

I'd vote for TH1, if not full Tcl (which also has more elegant language idea than Lua)

Tcl would be a rather large step forward; its builds are quite a bit bigger than SQLite. OTOH, it would also provide a strong security model that could be used to make enabling things like stored procedures much more practical even in relatively hostile environments. (I'm pretty sure that safe interpreters are not part of the subset used in TH1.)

Side note: Some of Tcl's ideas were actually stolen from Lua. In particular, the non-recursive execution engine (which is what powers the coroutine and tailcall support of Tcl 8.6 onwards) was deeply inspired by Lua.