The opposite of AUTOINCREMENT

(1) By anonymous on 2021-03-23 15:53:56 [link] [source]

Suppose you want to keep ids small and always (re-)use the first unused id for a new row. This is not complex:

    select 1 as newid
        where newid not in (select thingid from thing)
    union all select thingid+1 as newid from thing
        where newid not in (select thingid from thing)
    limit 1;

Now suppose that new things arrive in batches and you want to find the next N ids in one operation. Efficiency requirements:

Scan all existing ids at most once.
Use temporary space proportional to N rather than the number of existing ids or the number of unused ids.

This does get complex:

    with recursive
    result (newid) as
        (select candidate from
            (select 1 as candidate
                where candidate not in (select thingid from thing)
            union all select thingid+1 as candidate from thing
                where candidate not in (select thingid from thing)
            limit ?1)
        union all select newid+1 as candidate from result
            where candidate not in (select thingid from thing)
        order by candidate
        limit ?1)
    select newid from result;

Can anybody suggest something simpler?

(2) By Jim Morris (jimmorris33215) on 2021-03-23 17:09:47 in reply to 1 [link] [source]

An alternative idea is to create a table to hold freed ID values and use a delete trigger to add to that table. Then you can "pop" the values from there with little overhead.

(3) By Gerry Snyder (GSnyder) on 2021-03-23 17:17:28 in reply to 1 [link] [source]

If the real requirement is just to keep the id small, can't you just let SQLite do its thing?

Do you know that that is not good enough?

Gerry Snyder

(4) By Warren Young (wyoung) on 2021-03-23 17:33:49 in reply to 1 [link] [source]

Before you embark on such a project, you have to be absolutely sure your referential integrity is solid, else you can end up with old records referring to other old deleted records that actually get tied to new records with the old IDs.

In other words, when the referred-to records get deleted, everything referring to them must also be deleted.

(5) By Ryan Smith (cuz) on 2021-03-23 21:18:47 in reply to 1 [source]

Can anybody suggest something simpler?

Simpler? Not by much, but more useful - definitely. I have done this for some project, not really an SQLite project but I did use SQLite to keep track of the Index keys used.

Let me first state that I'm with Jim - best is a trigger and ID table.

The query I use is a little complicated, but very fast, and most importantly, 100% suggests the best index-gap to use for the range to be inserted (which is what I needed, but may or may not be useful to you).

I've dusted it off and made a small script to demonstrate:

  -- SQLite version 3.30.1  [ Release: 2019-10-10 ]  on  SQLitespeed version 2.1.3.11.
  -- ================================================================================================

CREATE TABLE t(ID INT PRIMARY KEY, Val TEXT);


-- Making a small table with gaps from index 4 to 9 (6 IDs) and 12 to 15 (4 IDs)

INSERT INTO t(ID,Val) VALUES
 ( 1, 'A1')
,( 2, 'A2')
,( 3, 'A3')
,(10, 'A10')
,(11, 'A11')
,(16, 'A16')
,(18, 'A18')
;

-- Query to find the best Starting INSERT ID listed in order of suitability, best at the top.
-- Here I manually added a "3" binding in the first CTE as the RequiredSize
-- You can use LIMIT 1 to only see the best option, but I all in to see the working:

WITH AllGaps(cID, GapSize, RequiredSize) AS (
    SELECT ID, (MAX(ID) OVER w1) - (MIN(ID) OVER w1) - 1,  3
      FROM t
     WHERE ID+1 NOT IN (SELECT ID FROM t) OR (ID>1 AND ID-1 NOT IN (SELECT ID FROM t))
    WINDOW w1 AS (ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING)
     ORDER BY ID
), BestGaps(cID, GapSize) AS (
    SELECT cID+1, GapSize FROM AllGaps
     WHERE GapSize >= RequiredSize AND cID+1 NOT IN (SELECT ID FROM t)
    UNION ALL
    SELECT IFNULL(MAX(ID),0)+1, 0xFFFFFFFF FROM t
    ORDER BY GapSize ASC, cID+1 ASC
)
SELECT cID FROM BestGaps -- LIMIT 1
;

  --  cID     
  -- ----
  --  12 
  --   4 
  --  19 

-- Indeed, the best gap for inserting only 3 items is the 4 available indexes starting at 12, then the gap of 6 IDs starting at 4.
-- Note that the next AutoInc ID is always the last item, in case no suitable gaps are found.


-- =========================================================
-- In this next run I changed that to Required Size to 5:

WITH AllGaps(cID, GapSize, RequiredSize) AS (
    SELECT ID, (MAX(ID) OVER w1) - (MIN(ID) OVER w1) - 1,  5
      FROM t
     WHERE ID+1 NOT IN (SELECT ID FROM t) OR (ID>1 AND ID-1 NOT IN (SELECT ID FROM t))
    WINDOW w1 AS (ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING)
     ORDER BY ID
), BestGaps(cID, GapSize) AS (
    SELECT cID+1, GapSize FROM AllGaps
     WHERE GapSize >= RequiredSize AND cID+1 NOT IN (SELECT ID FROM t)
    UNION ALL
    SELECT IFNULL(MAX(ID),0)+1, 0xFFFFFFFF FROM t
    ORDER BY GapSize ASC, cID+1 ASC
)
SELECT cID FROM BestGaps -- LIMIT 1
;

  --      cID    
  -- ------------
  --       4     
  --      19     

-- Indeed now the only viable insert positions are the 6-ID gap starting at 4 and the ID at next autoinc ID: 19.


-- =========================================================
-- Now I'm just making a table with a million entries to test speed

WITH AX(AID) AS (
    SELECT 25 UNION ALL SELECT AID+1 FROM AX WHERE AID < 1000000
)
INSERT INTO t(ID,Val)
SELECT AID, 'A'||AID FROM AX
;


-- And here making some more gaps in the very large table

DELETE FROM t
 WHERE ID BETWEEN 5000 AND 5020   -- Make gap of Size 21
    OR ID BETWEEN 8000 AND 8002   -- Make Gap of Size 3
    OR ID BETWEEN 9000 AND 9010   -- Make Gap of Size 11
;


-- =========================================================
-- Again finding the most suitable gap with Size 3:

WITH AllGaps(cID, GapSize, RequiredSize) AS (
    SELECT ID, (MAX(ID) OVER w1) - (MIN(ID) OVER w1) - 1,  3
      FROM t
     WHERE ID+1 NOT IN (SELECT ID FROM t) OR (ID>1 AND ID-1 NOT IN (SELECT ID FROM t))
    WINDOW w1 AS (ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING)
     ORDER BY ID
), BestGaps(cID, GapSize) AS (
    SELECT cID+1, GapSize FROM AllGaps
     WHERE GapSize >= RequiredSize AND cID+1 NOT IN (SELECT ID FROM t)
    UNION ALL
    SELECT IFNULL(MAX(ID),0)+1, 0xFFFFFFFF FROM t
    ORDER BY GapSize ASC, cID+1 ASC
)
SELECT cID FROM BestGaps -- LIMIT 1
;

  --    cID   
  -- ---------
  --    8000    -- This is the best gap, exactly 3 long, starting at ID 8000
  --     12     -- 2nd best Size, etc.
  --     4    
  --     19   
  --    9000  
  --    5000  
  --  1000001   -- Next Autoinc ID


-- =========================================================
-- Again finding the most suitable gap with Size 5:

WITH AllGaps(cID, GapSize, RequiredSize) AS (
    SELECT ID, (MAX(ID) OVER w1) - (MIN(ID) OVER w1) - 1,  5
      FROM t
     WHERE ID+1 NOT IN (SELECT ID FROM t) OR (ID>1 AND ID-1 NOT IN (SELECT ID FROM t))
    WINDOW w1 AS (ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING)
     ORDER BY ID
), BestGaps(cID, GapSize) AS (
    SELECT cID+1, GapSize FROM AllGaps
     WHERE GapSize >= RequiredSize AND cID+1 NOT IN (SELECT ID FROM t)
    UNION ALL
    SELECT IFNULL(MAX(ID),0)+1, 0xFFFFFFFF FROM t
    ORDER BY GapSize ASC, cID+1 ASC
)
SELECT cID FROM BestGaps -- LIMIT 1
;

  --    cID   
  -- ---------
  --     4        -- Indeed Best gap.
  --     19   
  --    9000  
  --    5000  
  --  1000001 

  --    Item Stats:  Item No:           8             Query Size (Chars):  569
  --                 Result Columns:    1             Result Rows:         5
  --                 VM Work Steps:     16000174      Rows Modified:       0
  --                 Sort Operations:   2             Table-Scan Steps:    999947
  --                 Prepare Time:      -- --- --- --- --.----
  --                 Query Run:         0d 00h 00m and 00.535s
  --                 Full Query Time:   0d 00h 00m and 00.535s
  --                 Script Time:       0d 00h 00m and 02.177s
  --                 Query Result:      Success.
  -- ------------------------------------------------------------------------------------------------


-- =========================================================
-- Here I added your original query, which also runs very fast, and produce
-- exact available ranges, but without the option to request the best gap of
-- an arbitrary size.

with recursive
    result (newid) as
        (select candidate from
            (select 1 as candidate
                where candidate not in (select ID from t)
            union all select ID+1 as candidate from t
                where candidate not in (select ID from t)
            limit 50)
        union all select newid+1 as candidate from result
            where candidate not in (select ID from t)
        order by candidate
        limit 50)
    select newid from result;


  --     newid   
  -- ------------
  --       4     
  --       5     
  --       6     
  --       7     
  --       8     
  --       9     
  --      12     
  --      13     
  --      14     
  --      15     
  --      17     
  --      19     
  --      20     
  --      21     
  --      22     
  --      23     
  --      24     
  --     5000    
  --     5001    
  --     5002    
  --     5003    
  --     5004    
  --     5005    
  --     5006    
  --     5007    
  --     5008    
  --     5009    
  --     5010    
  --     5011    
  --     5012    
  --     5013    
  --     5014    
  --     5015    
  --     5016    
  --     5017    
  --     5018    
  --     5019    
  --     5020    
  --     8000    
  --     8001    
  --     8002    
  --     9000    
  --     9001    
  --     9002    
  --     9003    
  --     9004    
  --     9005    
  --     9006    
  --     9007    
  --     9008    

  --    Item Stats:  Item No:           9             Query Size (Chars):  495
  --                 Result Columns:    1             Result Rows:         50
  --                 VM Work Steps:     8000959       Rows Modified:       0
  --                 Sort Operations:   0             Table-Scan Steps:    999947
  --                 Prepare Time:      -- --- --- --- --.----
  --                 Query Run:         0d 00h 00m and 00.475s
  --                 Full Query Time:   0d 00h 00m and 00.475s
  --                 Script Time:       0d 00h 00m and 02.653s
  --                 Query Result:      Success.
  -- ------------------------------------------------------------------------------------------------

DROP TABLE t;

  --   Script Stats: Total Script Execution Time:     0d 00h 00m and 02.696s
  --                 Total Script Query Time:         0d 00h 00m and 02.649s
  --                 Total Database Rows Changed:     1000018
  --                 Total Virtual-Machine Steps:     71002402
  --                 Last executed Item Index:        10
  --                 Last Script Error:               
  -- ------------------------------------------------------------------------------------------------

  -- 2021-03-23 22:40:31.218  |  [Success]    Script Success.
  -- 2021-03-23 22:40:31.218  |  [Success]    Transaction Rolled back.
  -- -------  DB-Engine Logs (Contains logged information from all DB connections during run)  ------
  -- [2021-03-23 22:40:28.703] APPLICATION : Script D:\Documents\FindIDGaps.sql started at 22:40:28.703 on 23 March.
  -- ================================================================================================

Hope something in there might be useful. Good luck!