Novice question concerning adding conditional incremental integer counter

(1) By Gary (1codedebugger) on 2021-06-14 02:49:04 [link] [source]

Hello, would you please tell me if it is possible in SQLite to add a new index_no column to this data that counts each row, skipping rows with strongs_no of 'punc2'?
I can add a new column and update it in a Tcl script or C code, but don't know how to start a counter in plain SQL or have it restart at 1 at each change in book, chapter, verse. By that I mean, when book_no=1 and chapter_no=1 and verse_no=3, the new index_no needs to start at 1 again and increment until verse_no 4 is reached.

I don't want to eliminate the 'punc2' rows, but just want another index_no that starts at 1 and increments sequentially ignoring all 'punc2' rows.  The purpose is to attempt to use the combination of book, chapter, verse, index to join with two other sources in place of the desired match column which is Hebrew and which I cannot get to work properly even after unicode normalization on all source tables. 

At this point, I don't need to add a new column to the table, but would like to use a CTE to see if the sources will match properly, but I'm completely stumped as to how to get a value in a select statement that can be incremented.  I've looked at recursion in a with statement but don't see how to restart the counter when move to a new verse.

Thank you for any guidance you may be able to provide.


book_no  chapter_no  verse_no  index_no  strongs_no      desired new index
-------  ----------  --------  --------  ----------      -----------------
1        1           2         1         H776                   1
1        1           2         2         H1961                  2
1        1           2         3         H8414                  3
1        1           2         4         punc2              
1        1           2         5         H922                   4
1        1           2         6         punc2       
1        1           2         7         H2822                  5
1        1           2         8         H5921                  6
1        1           2         9         H6440                  7
1        1           2         10        H8415                  8
1        1           2         11        punc2   
1        1           2         12        H7307                  9
1        1           2         13        H430                  10
1        1           2         14        H7363                 11
1        1           2         15        H5921                 12
1        1           2         16        H6440                 13
1        1           2         17        H4325                 14
1        1           2         18        punc2    
1        1           3          1        H430                   1

(2) By Keith Medcalf (kmedcalf) on 2021-06-14 03:30:04 in reply to 1 [link] [source]

Like this perhaps?

create table x
(
  book_no integer,
  chapter_no integer,
  verse_no integer,
  index_no integer,
  strongs_no text,
  desired_new_index integer
);

insert into x (book_no, chapter_no, verse_no, index_no, strongs_no) values
(1, 1, 2, 1, 'H776'),
(1, 1, 2, 2, 'H1961'),
(1, 1, 2, 3, 'H8414'),
(1, 1, 2, 4, 'punc2'),
(1, 1, 2, 5, 'H922'),
(1, 1, 2, 6, 'punc2'),
(1, 1, 2, 7, 'H2822'),
(1, 1, 2, 8, 'H5921'),
(1, 1, 2, 9, 'H6440'),
(1, 1, 2, 10, 'H8415'),
(1, 1, 2, 11, 'punc2'),
(1, 1, 2, 12, 'H7307'),
(1, 1, 2, 13, 'H430'),
(1, 1, 2, 14, 'H7363'),
(1, 1, 2, 15, 'H5921'),
(1, 1, 2, 16, 'H6440'),
(1, 1, 2, 17, 'H4325'),
(1, 1, 2, 18, 'punc2'),
(1, 1, 3, 1, 'H430');


update x as dst
   set desired_new_index = ni
  from (
        select _rowid_,
               row_number() over (partition by book_no, chapter_no, verse_no order by index_no) as ni
          from x
         where strongs_no != 'punc2'
       ) as src
 where src._rowid_ == dst._rowid_
;

select * from x;
┌─────────┬────────────┬──────────┬──────────┬────────────┬───────────────────┐
│ book_no │ chapter_no │ verse_no │ index_no │ strongs_no │ desired_new_index │
├─────────┼────────────┼──────────┼──────────┼────────────┼───────────────────┤
│ 1       │ 1          │ 2        │ 1        │ H776       │ 1                 │
│ 1       │ 1          │ 2        │ 2        │ H1961      │ 2                 │
│ 1       │ 1          │ 2        │ 3        │ H8414      │ 3                 │
│ 1       │ 1          │ 2        │ 4        │ punc2      │                   │
│ 1       │ 1          │ 2        │ 5        │ H922       │ 4                 │
│ 1       │ 1          │ 2        │ 6        │ punc2      │                   │
│ 1       │ 1          │ 2        │ 7        │ H2822      │ 5                 │
│ 1       │ 1          │ 2        │ 8        │ H5921      │ 6                 │
│ 1       │ 1          │ 2        │ 9        │ H6440      │ 7                 │
│ 1       │ 1          │ 2        │ 10       │ H8415      │ 8                 │
│ 1       │ 1          │ 2        │ 11       │ punc2      │                   │
│ 1       │ 1          │ 2        │ 12       │ H7307      │ 9                 │
│ 1       │ 1          │ 2        │ 13       │ H430       │ 10                │
│ 1       │ 1          │ 2        │ 14       │ H7363      │ 11                │
│ 1       │ 1          │ 2        │ 15       │ H5921      │ 12                │
│ 1       │ 1          │ 2        │ 16       │ H6440      │ 13                │
│ 1       │ 1          │ 2        │ 17       │ H4325      │ 14                │
│ 1       │ 1          │ 2        │ 18       │ punc2      │                   │
│ 1       │ 1          │ 3        │ 1        │ H430       │ 1                 │
└─────────┴────────────┴──────────┴──────────┴────────────┴───────────────────┘

(3) By Keith Medcalf (kmedcalf) on 2021-06-14 03:35:58 in reply to 2 [link] [source]

See https://sqlite.org/windowfunctions.html for details of the Window Functions.

(4.1) By Keith Medcalf (kmedcalf) on 2021-06-14 04:23:59 edited from 4.0 in reply to 2 [link] [source]

If you want this to be able to be run multiple times when the data changes, then use:

update x as dst
   set desired_new_index = ni
  from (
        select _rowid_,
               row_number() over (partition by book_no, chapter_no, verse_no order by index_no) as ni
          from x
         where strongs_no != 'punc2'
       union all
        select _rowid_,
               null as ni
         from x
        where strongs_no == 'punc2'
       ) as src
 where src._rowid_ == dst._rowid_
;

** No need to sort the update table, so it can use a UNION ALL **

(6) By Gary (1codedebugger) on 2021-06-14 04:07:07 in reply to 4.0 [link] [source]

Thank you. At first, I didn't understand this but see that, if rows change strongs_no values from not 'punc2' to 'punc2', they would not get set to null on a new update run without this union code. That is something I would have definitely missed, especially since have never used a window function before.

(8) By Keith Medcalf (kmedcalf) on 2021-06-14 04:30:31 in reply to 6 [link] [source]

The Window Query that generates the new row_number excludes rows with strongs_no == 'punc2' from the Window Function. You could use a FILTER clause but that does not work with "built-in" window functions, only aggregate window functions, so since those rows are excluded, you have to add them back in with a union.

The union can be a union all because the rowid in the update table (src) is used to lookup the row to be updated in dst, so they do not need to be in order.

(5) By Gary (1codedebugger) on 2021-06-14 03:53:30 in reply to 2 [source]

Thank you very much.  I came across this type of method in searching but didn't realize that row_number() was a window function and thought it was available only in sql server. I should've been smart enough to have just typed row_number in the SQLite search box.

The best I came up with was to count the number of 'punc2' rows in a subquery. It appears to work but I assume is not very efficient.

  select book_no, chapter_no, verse_no, strongs_no, index_no,
    (case when strongs_no = 'punc2' then null
          else
               index_no - (select count(*)
                           from bh_interlinear
                           where book_no=b.book_no
                           and chapter_no=b.chapter_no
                           and verse_no=b.verse_no
                           and strongs_no='punc2'
                           and index_no <= b.index_no)
     end ) as new_index
  from bh_interlinear b
  where b.book_no=1
    and b.chapter_no=1
    and b.verse_no < 4


┌─────────┬────────────┬──────────┬────────────┬──────────┬───────────┐
│ book_no │ chapter_no │ verse_no │ strongs_no │ index_no │ new_index │
├─────────┼────────────┼──────────┼────────────┼──────────┼───────────┤
│ 1       │ 1          │ 1        │ H7225      │ 1        │ 1         │
│ 1       │ 1          │ 1        │ H1254      │ 2        │ 2         │
│ 1       │ 1          │ 1        │ H430       │ 3        │ 3         │
│ 1       │ 1          │ 1        │ H853       │ 4        │ 4         │
│ 1       │ 1          │ 1        │ H8064      │ 5        │ 5         │
│ 1       │ 1          │ 1        │ H853       │ 6        │ 6         │
│ 1       │ 1          │ 1        │ H776       │ 7        │ 7         │
│ 1       │ 1          │ 1        │ punc2      │ 8        │           │
│ 1       │ 1          │ 2        │ H776       │ 1        │ 1         │
│ 1       │ 1          │ 2        │ H1961      │ 2        │ 2         │
│ 1       │ 1          │ 2        │ H8414      │ 3        │ 3         │
│ 1       │ 1          │ 2        │ punc2      │ 4        │           │
│ 1       │ 1          │ 2        │ H922       │ 5        │ 4         │
│ 1       │ 1          │ 2        │ punc2      │ 6        │           │
│ 1       │ 1          │ 2        │ H2822      │ 7        │ 5         │
│ 1       │ 1          │ 2        │ H5921      │ 8        │ 6         │
│ 1       │ 1          │ 2        │ H6440      │ 9        │ 7         │
│ 1       │ 1          │ 2        │ H8415      │ 10       │ 8         │
│ 1       │ 1          │ 2        │ punc2      │ 11       │           │
│ 1       │ 1          │ 2        │ H7307      │ 12       │ 9         │
│ 1       │ 1          │ 2        │ H430       │ 13       │ 10        │
│ 1       │ 1          │ 2        │ H7363      │ 14       │ 11        │
│ 1       │ 1          │ 2        │ H5921      │ 15       │ 12        │
│ 1       │ 1          │ 2        │ H6440      │ 16       │ 13        │
│ 1       │ 1          │ 2        │ H4325      │ 17       │ 14        │
│ 1       │ 1          │ 2        │ punc2      │ 18       │           │
│ 1       │ 1          │ 3        │ H559       │ 1        │ 1         │
│ 1       │ 1          │ 3        │ H430       │ 2        │ 2         │
│ 1       │ 1          │ 3        │ punc2      │ 3        │           │
│ 1       │ 1          │ 3        │ H1961      │ 4        │ 3         │
│ 1       │ 1          │ 3        │ H216       │ 5        │ 4         │
│ 1       │ 1          │ 3        │ punc2      │ 6        │           │
│ 1       │ 1          │ 3        │ H1961      │ 7        │ 5         │
│ 1       │ 1          │ 3        │ H216       │ 8        │ 6         │
│ 1       │ 1          │ 3        │ punc2      │ 9        │           │
└─────────┴────────────┴──────────┴────────────┴──────────┴───────────┘

(7) By Keith Medcalf (kmedcalf) on 2021-06-14 04:13:05 in reply to 5 [link] [source]

The Window Function is probably more efficient.

In both cases you will need the appropriate indexes. (book_no, chapter_no, verse_no, index_no, strongs_no) for the Window Function, and (strongs_no) if you want to add the null values union.

You will also need the (book_no ...) index for your query as well.

If you really want the maximum speed for the Window Function )with the union) you probably want indexes thus:

create unique index xx1 on x (book_no, chapter_no, verse_no, index_no) where strongs_no != 'punc2';
create index xx2 on x (strongs_no) where strongs_no == 'punc2';