is it possible to update bunch of records in reverse order to avoid unique constrain violation

(1) By Vitalije (vitalije) on 2021-07-11 17:55:44 [link] [source]

Hello there! I am trying to use SQLite to store data model for an application which is basically a directed acyclic graph. The schema I am using is as follows:

CREATE TABLE outline
  ( ord integer primary key
  , p integer not null unique
  , level integer not null
  , gnx text not null
  , expanded bool not null default 0
  , marked bool not null default 0
  );

CREATE TABLE NDATA
  ( gnx primary key
  , h text not null default 'newHeadline'
  , b text not null default ''
  , ua text not null default '{}'
  );

The ord column should totally cover the subset of integers between 0 and number of nodes in the outline. There should be no gaps. The node whose ord is zero is the only node at level 0. Also the difference in levels between any two consecutive nodes must always be less than or equal 1 (* it could be less than zero, but never grater than one).

Vitalije

The subtree for any given gnx consists of the record with the given gnx followed by all the consecutive records which has level grater than the starting record with the given gnx. All occurrences of any given gnx must have the exact same subtree (except perhaps the differences in the starting level).

Now, I wrote sql commands for deleting any given node and its subtree, while maintaining all of the invariants described above, and I am pretty satisfied how fast they execute.

But I have a problem with the inserting nodes. To insert block of records somewhere inside the outline, first, I have to create a gap in the ord column.

I am trying to use update sql query like the following:

UPDATE outline set ord=ord+$size where ord>=$ord;

This kind of query works perfectly when I am deleting a block of consecutive records and then reducing the ord of the following records by the size of the deleted block. But when I try to use the same query to increase ord fields in order to create a gap, I am getting the error caused by the duplicates in the ord column. I tried to add order by ord desc to the update command, but it didn't solve the issue.

I know that I can increase the ord column by much larger number so that I don't break the unique constraint, and after inserting records, use again an update command to reduce the ord fields and eliminate the gap. But that seems to me like doing extra work, and I prefer if it would be possible to avoid renumerating records twice.

Is this possible? Am I doing something wrong?

Vitalije

(2) By Keith Medcalf (kmedcalf) on 2021-07-11 18:43:31 in reply to 1 [link] [source]

You mean like:

insert into outline (ord)
     select ord
       from outline
      where ord >= :emptyord
   order by ord desc
on conflict (ord) do update
        set ord = excluded.ord + 1;

which will make row with rowid :emptyord empty by incrementing the ord of that row and all rows following by 1.

(3.1) By Vitalije (vitalije) on 2021-07-11 19:14:04 edited from 3.0 in reply to 2 [link] [source]

Sorry, but I don't fully understand your query, but let me try to give you some data example. Let's say I have following values in the outline table:

ord |     p              | level |    gnx              |e|m
  0 |-4718826733835183985|      0|hidden-root-vnode-gnx|0|0
  1 | 1675942981690648659|      1|p001                 |0|0
  2 | 6829706290674704198|      2|p002                 |0|0
  3 |-5145489778970476800|      3|f004.txt             |0|0
  4 |-2019115627057265899|      4|g199                 |0|0
  5 | 2348433534929785603|      5|g197                 |0|0
  6 | 1765716150889595062|      6|g193                 |0|0
  7 |-3808691387856526814|      7|g192                 |0|0
  8 | 6630894544698555974|      8|g187                 |0|0
  9 | 6206133159931849095|      9|g175                 |0|0
 10 | 6441447125954489345|      6|g195                 |0|0
 11 | 4495563729056918305|      7|g190                 |0|0
 12 |-5195855217166601041|      8|g183                 |0|0
 13 | -143853921924943551|      9|g161                 |0|0
 14 |  829245143664686939|      9|g170                 |0|0

And I want to insert a block of records:

level | gnx | e | m
    7 |g187 | 0 | 0
    8 |g175 | 0 | 0

at the ord 11

So, first, I need to update records with ords 11-14 to become 13-16.

UPDATE outline SET ord=ord+2 WHERE ord >= 11 ORDER BY ord DESC;

I would expect after this query the outline table to look like:

ord |     p              | level |    gnx              |e|m
  0 |-4718826733835183985|      0|hidden-root-vnode-gnx|0|0
  1 | 1675942981690648659|      1|p001                 |0|0
  2 | 6829706290674704198|      2|p002                 |0|0
  3 |-5145489778970476800|      3|f004.txt             |0|0
  4 |-2019115627057265899|      4|g199                 |0|0
  5 | 2348433534929785603|      5|g197                 |0|0
  6 | 1765716150889595062|      6|g193                 |0|0
  7 |-3808691387856526814|      7|g192                 |0|0
  8 | 6630894544698555974|      8|g187                 |0|0
  9 | 6206133159931849095|      9|g175                 |0|0
 10 | 6441447125954489345|      6|g195                 |0|0
 13 | 4495563729056918305|      7|g190                 |0|0
 14 |-5195855217166601041|      8|g183                 |0|0
 15 | -143853921924943551|      9|g161                 |0|0
 16 |  829245143664686939|      9|g170                 |0|0

But, when this query run, it tries to change the ord=11 into 13, but there is already a record with the ord=13 in the outline and it violates unique constrain. If it tried to update record with the ord=14 first into 16, then ord=13 into 15, and so on, there will be no violation of the unique constrain. However, I can't find the way to convince SQLite to perform updates in the descending order of ord.

I hope, this example make it a bit clearer. The values in the p column are just random() values. I need them to keep track of specific nodes in the outline, regardless of possible insertions/deletions above/before which would change the ord values.

(4.1) By Keith Medcalf (kmedcalf) on 2021-07-11 19:38:17 edited from 4.0 in reply to 3.1 [link] [source]

insert into outline (ord)
     select ord
       from outline
      where ord >= 7
   order by ord desc
on conflict (ord) do update
        set ord = excluded.ord + 2;

then insert your two new rows. You may have to do:

insert into outline
     select *
       from outline
      where ord >= 7
   order by ord desc
on conflict (ord) do update
        set ord = excluded.ord + 2;

to get around your NOT NULL constraints.

(5) By Keith Medcalf (kmedcalf) on 2021-07-11 19:50:30 in reply to 4.1 [link] [source]

sqlite> CREATE TABLE outline
   ...>   ( ord integer primary key
   ...>   , p integer not null unique
   ...>   , level integer not null
   ...>   , gnx text not null
   ...>   , expanded bool not null default 0
   ...>   , marked bool not null default 0
   ...>   );
sqlite> insert into outline values (1,1,1,'a',1,1);
sqlite> insert into outline values (2,2,2,'a',1,1);
sqlite> select * from outline;
┌─────┬───┬───────┬─────┬──────────┬────────┐
│ ord │ p │ level │ gnx │ expanded │ marked │
├─────┼───┼───────┼─────┼──────────┼────────┤
│ 1   │ 1 │ 1     │ a   │ 1        │ 1      │
│ 2   │ 2 │ 2     │ a   │ 1        │ 1      │
└─────┴───┴───────┴─────┴──────────┴────────┘
sqlite> insert into outline
   ...>      select *
   ...>        from outline
   ...>       where ord >= 2
   ...>    order by ord desc
   ...> on conflict (ord) do update
   ...>         set ord = excluded.ord + 1;
sqlite> select * from outline;
┌─────┬───┬───────┬─────┬──────────┬────────┐
│ ord │ p │ level │ gnx │ expanded │ marked │
├─────┼───┼───────┼─────┼──────────┼────────┤
│ 1   │ 1 │ 1     │ a   │ 1        │ 1      │
│ 3   │ 2 │ 2     │ a   │ 1        │ 1      │
└─────┴───┴───────┴─────┴──────────┴────────┘
sqlite>

(6) By Vitalije (vitalije) on 2021-07-11 20:12:27 in reply to 5 [link] [source]

Thanks, it works perfectly.

(7.4) By Keith Medcalf (kmedcalf) on 2021-07-11 20:53:04 edited from 7.3 in reply to 6 [source]

And here is how you "close the gaps" in outline so that ord starts at 1 and goes up one by each:

begin immediate;

-- renumber rows going up for ord > 0 closing gaps
with r(old, new) as materialized
     (
       select ord as old,
              row_number() over (order by ord) as new
         from outline
        where ord > 0
     )
insert into outline
     select *
       from outline
      where ord in (select old from r where old != new)
   order by ord
on conflict (ord) do update
        set ord = (select new from r where old == excluded.ord)
;

-- renumber rows going down to make space for ord < 1 rows
with r(old, new) as materialized
     (
      select ord as old,
             row_number() over (order by ord) + (select count(*) from outline where ord < 1) as new
        from outline
       where ord > 0
     )
insert into outline
     select *
       from outline
      where ord in (select old from r where old != new)
   order by ord desc
on conflict (ord) do update
        set ord = (select new from r where old == excluded.ord)
;

-- set final row numbers for rowid < 1 rows, renumber going down
with r(old, new) as materialized
     (
       select ord as old,
              row_number() over (order by ord) as new
         from outline
        where ord < 1
     )
insert into outline
     select *
       from outline
      where ord in (select old from r where new != old)
   order by ord desc
on conflict (ord) do update
        set ord = (select new from r where old == excluded.ord)
;

commit;

NB: Fixed some errors.

(8.3) By Keith Medcalf (kmedcalf) on 2021-07-11 22:25:16 edited from 8.2 in reply to 6 [link] [source]

This one uses a temporary without rowid table to speed up the renumbering:

begin immediate;
-- make sure we have the correct temporary table r
drop table if exists temp.r;
create table temp.r (old primary key, new) without rowid;
-- for all _rowid_ > 0 compute new _rowid_ to close gaps
insert into temp.r
     select _rowid_ as old,
            row_number() over (order by _rowid_) as new
       from {table}
      where _rowid_ > 0
   order by _rowid_
;
-- update the _rowid_ in ascending order (move _rowid_ > 0 down to close gaps)
insert into {table}
     select *
       from {table}
      where _rowid_ in (
                        select old
                          from temp.r
                         where old != new
                       )
     order by _rowid_
  on conflict (_rowid_) do update
          set _rowid_ = (
                         select new
                           from temp.r
                          where old == excluded._rowid_
                        )
;
-- clear out temporary table
delete from temp.r;
-- compute the new _rowid_ for _rowid_ > 0 to leave space for _rowid_ < 1 rows
insert into r
     select _rowid_ as old,
            row_number() over (order by _rowid_) + (select count(*) from {table} where _rowid_ < 1) as new
       from {table}
      where _rowid_ > 0
   order by _rowid_
;
-- update the _rowid_ in descending order
insert into {table}
     select *
      from {table}
     where _rowid_ in (
                       select old
                         from temp.r
                        where old != new
                      )
     order by _rowid_ desc
  on conflict (_rowid_) do update
          set _rowid_ = (
                         select new
                           from temp.r
                          where old == excluded._rowid_
                        )
;
-- clear out temporary table
delete from temp.r;
-- compute final _rowid_ for rows with _rowid_ < 1
insert into temp.r
     select _rowid_ as old,
            row_number() over (order by _rowid_) as new
       from {table}
      where _rowid_ < 1
   order by _rowid_
;
-- renumber _rowid_ < 1 to correct location in descending order
insert into {table}
     select *
       from {table}
      where _rowid_ in (
                        select old
                          from temp.r
                         where old != new
                       )
   order by _rowid_ desc
on conflict (_rowid_) do update
        set _rowid_ = (
                       select new
                         from temp.r
                        where old == excluded._rowid_
                      )
;
-- get rid of our temporary table
drop table if exists temp.r;
commit;

replace {table} with the name of the table. It does not care the name of the integer primary key as long as it is accessible with the name _rowid_.

(9) By Keith Medcalf (kmedcalf) on 2021-07-12 22:14:51 in reply to 6 [link] [source]

Sample Python code: OrderedTable.py:


def OpenRowid(db, table, rowid, count=1):
    if count < 1:
        raise ValueError('Count must be >= 1')
    if db.cursor().execute(OpenRowid.test.format(table=table), (rowid, count)).fetchone()[0] != 0:
        db.cursor().execute(OpenRowid.sql.format(table=table), (rowid, count))

def RenumberTable(db, table):
    db.cursor().execute(RenumberTable.sql.format(table=table))


OpenRowid.test = '''
select count(*)
  from {table}
 where _rowid_ >= ?1
   and _rowid_ < ?1 + ?2
;
'''

OpenRowid.sql = '''
insert into {table}
     select *
       from {table}
      where _rowid_ >= ?
   order by _rowid_ desc
on conflict (_rowid_) do update
        set _rowid_ = excluded._rowid_ + ?;
'''

RenumberTable.sql = '''
begin immediate;

-- make sure we have the correct temporary table r

drop table if exists temp.r
;
create temporary table if not exists r
(
   old primary key,
   new
)
without rowid
;

-- for all _rowid_ > 0 compute new _rowid_ to close gaps

insert into temp.r
     select _rowid_ as old,
            row_number() over (order by _rowid_) as new
       from {table}
      where _rowid_ > 0
   order by _rowid_
;

-- apply _rowid_ updates in ascending order of _rowid_ (move _rowid_ down)

insert into {table}
     select *
       from {table}
      where _rowid_ in (
                        select old
                          from temp.r
                         where old != new
                       )
   order by _rowid_
on conflict (_rowid_) do update
        set _rowid_ = (
                       select new
                         from temp.r
                        where old == excluded._rowid_
                      )
;

-- clear temporary table

delete from temp.r;

-- for _rowid_ > 0 compute offset _rowid_ so rows with _rowid_ < 1 can be moved into place

insert into r
     select _rowid_ as old,
            row_number() over (order by _rowid_) + (select count(*) from {table} where _rowid_ < 1) as new
       from {table}
      where _rowid_ > 0
   order by _rowid_
;

-- apply _rowid_ updates in descending order of _rowid_ (move _rowid_ up)

insert into {table}
     select *
      from {table}
     where _rowid_ in (
                       select old
                         from temp.r
                        where old != new
                      )
   order by _rowid_ desc
on conflict (_rowid_) do update
        set _rowid_ = (
                       select new
                         from temp.r
                        where old == excluded._rowid_
                      )
;

-- clear temporary table

delete from temp.r;

-- compute final _rowid_ for rows with _rowid_ < 1

insert into temp.r
     select _rowid_ as old,
            row_number() over (order by _rowid_) as new
       from {table}
      where _rowid_ < 1
   order by _rowid_
;

-- apply _rowid_ updates in descending order of _rowid_ (move _rowid_ up)

insert into {table}
     select *
       from {table}
      where _rowid_ in (
                        select old
                          from temp.r
                         where old != new
                       )
   order by _rowid_ desc
on conflict (_rowid_) do update
        set _rowid_ = (
                       select new
                         from temp.r
                        where old == excluded._rowid_
                      )
;

-- clean up our temporary table

drop table if exists temp.r;

-- commit our transaction

commit;
'''

if __name__ == '__main__':
    import apsw
    db = apsw.Connection()
    db.cursor().execute('create table x(id integer primary key, v integer);')
    db.cursor().execute('insert into x select value*10, value from generate_series where start=1 and stop=10;')
    db.cursor().execute('insert into x select -(value-1)*10, -value from generate_series where start=1 and stop=10;')
    db.cursor().execute('delete from x where id=50;')
    db.cursor().execute('delete from x where id=-30;')
    for row in db.cursor().execute('select * from x;'):
        print(row)
    print()
    RenumberTable(db, 'x')
    for row in db.cursor().execute('select * from x;'):
        print(row)
    print()
    OpenRowid(db, 'x', 3)
    OpenRowid(db, 'x', -4)
    for row in db.cursor().execute('select * from x;'):
        print(row)
    print()
    db.cursor().execute('insert into x values (3, 103);')
    db.cursor().execute('insert into x values (-4, -104);')
    for row in db.cursor().execute('select * from x;'):
        print(row)
    print()
    RenumberTable(db, 'x')
    for row in db.cursor().execute('select * from x;'):
        print(row)
    print()

Produces:

>OrderedTable.py
Row(id=-90, v=-10)
Row(id=-80, v=-9)
Row(id=-70, v=-8)
Row(id=-60, v=-7)
Row(id=-50, v=-6)
Row(id=-40, v=-5)
Row(id=-20, v=-3)
Row(id=-10, v=-2)
Row(id=0, v=-1)
Row(id=10, v=1)
Row(id=20, v=2)
Row(id=30, v=3)
Row(id=40, v=4)
Row(id=60, v=6)
Row(id=70, v=7)
Row(id=80, v=8)
Row(id=90, v=9)
Row(id=100, v=10)

Row(id=1, v=-10)
Row(id=2, v=-9)
Row(id=3, v=-8)
Row(id=4, v=-7)
Row(id=5, v=-6)
Row(id=6, v=-5)
Row(id=7, v=-3)
Row(id=8, v=-2)
Row(id=9, v=-1)
Row(id=10, v=1)
Row(id=11, v=2)
Row(id=12, v=3)
Row(id=13, v=4)
Row(id=14, v=6)
Row(id=15, v=7)
Row(id=16, v=8)
Row(id=17, v=9)
Row(id=18, v=10)

Row(id=1, v=-10)
Row(id=2, v=-9)
Row(id=4, v=-8)
Row(id=5, v=-7)
Row(id=6, v=-6)
Row(id=7, v=-5)
Row(id=8, v=-3)
Row(id=9, v=-2)
Row(id=10, v=-1)
Row(id=11, v=1)
Row(id=12, v=2)
Row(id=13, v=3)
Row(id=14, v=4)
Row(id=15, v=6)
Row(id=16, v=7)
Row(id=17, v=8)
Row(id=18, v=9)
Row(id=19, v=10)

Row(id=-4, v=-104)
Row(id=1, v=-10)
Row(id=2, v=-9)
Row(id=3, v=103)
Row(id=4, v=-8)
Row(id=5, v=-7)
Row(id=6, v=-6)
Row(id=7, v=-5)
Row(id=8, v=-3)
Row(id=9, v=-2)
Row(id=10, v=-1)
Row(id=11, v=1)
Row(id=12, v=2)
Row(id=13, v=3)
Row(id=14, v=4)
Row(id=15, v=6)
Row(id=16, v=7)
Row(id=17, v=8)
Row(id=18, v=9)
Row(id=19, v=10)

Row(id=1, v=-104)
Row(id=2, v=-10)
Row(id=3, v=-9)
Row(id=4, v=103)
Row(id=5, v=-8)
Row(id=6, v=-7)
Row(id=7, v=-6)
Row(id=8, v=-5)
Row(id=9, v=-3)
Row(id=10, v=-2)
Row(id=11, v=-1)
Row(id=12, v=1)
Row(id=13, v=2)
Row(id=14, v=3)
Row(id=15, v=4)
Row(id=16, v=6)
Row(id=17, v=7)
Row(id=18, v=8)
Row(id=19, v=9)
Row(id=20, v=10)