Best way to handle unique column needs

(1) By Goravani (DasGoravani) on 2020-05-05 21:37:51 [link] [source]

I have an existing application that I have brought over into SQLite, or I'm at early stages of doing so.

Right now I'm dealing with the issue of making my old unique key fields be that again but in SQLite.

So I have 50 tables which have such a column.. used to be the unique column in old database, the column has values in the data from the old database.. now I need that field to be the same.. a unique column numeric integer 32 bit value, and thus I need it to auto fill itself with a value, either overwriting my already there values, or starting after the highest present value

Can you think of a way to do the above in SQLite?

I see that I can fetch the last rowid after an insert, so if I was starting from scratch I would fetch that and save it in a column to use as my unique identifier. I thought having the used to be our unique field set to primary key, that it would get the rowid value in it, but actually I'm getting the reverse, I'm getting the old field values in selects where I ask for the rowid

So I'm confused as to how to get my column of choice to be a unique numeric column since the rowid won't come into it at present.. I save a new record and my numeric primary key returns null or zero even after fetching back.. it doesn't have the rowid in it.. will an auto incrementing restraint help me ? I'd have to remake my tables to do that.. from what I understand

Any help on making my old column still be a unique numeric identifier column appreciated. Or other comments as I am new.

(2) By Larry Brasfield (LarryBrasfield) on 2020-05-06 00:09:12 in reply to 1 [link] [source]

Your problem seems to be easily solved.

For your new tables, be sure to declare the unique integer value column as: " mumbleId INTEGER PRIMARY KEY AUTOINCREMENT, " with mumbleId chosen more intelligently. This will be required to be unique as a consequence of being the primary key, and, per this Rowid Tables documentation, will be an alias for the underlying rowid. (You may not care about this, but it improves efficiency somewhat.) Then, simply insert your values, specifying the mumbleId column values as the same unique numeric integer values from your old DB. The autoincrement tracking will take care of tracking the highest value inserted so that, if ever an insert is done without specifying the mumbleId value, it can be generated. That would be the only circumstance where you would need to "fetch the last rowid after an insert."

If this does not work for some reason, or does not solve your issue, please show what goes wrong and be more specific about what shortcoming you perceive. (I had a difficult time understanding what problem you actually encountered.) It would help to show DDL and queries, together with results that seem awry.

One issue that was unclear in your post: We are discussing a primary key, right? And it is already referenced from other tables, I would expect. So you have an interest in preserving those linkages, right?

(3) By Gunter Hick (gunter_hick) on 2020-05-06 13:48:54 in reply to 1 [link] [source]

The sqlite3_last_insert_rowid() function is not useful for querying the highest rowid of a table. First, because it is an attribute of the Connection. If you have not inserted into a rowid table since opening the connection, it will be 0. Second, for the value to have any meaning, you must have exclusive use of this connection, so that no other thread may sneak in and insert into a different table, thus clobbering your value. Third, it only records one rowid; a statement that inserts more than one row will only return the last rowid.

(4) By Keith Medcalf (kmedcalf) on 2020-05-06 14:49:47 in reply to 2 [link] [source]

Skip the "autoincrement" unless you actually need it. You probably don't. 99.9999999999999% of the time autoincrement is used when it is not required because "it sounds nice" and not because it fulfills any useful purpose (other than consuming CPU and I/O for no useful purpose).

Avoid that trap. Do not use autoincrement unless you need it, and you probably don't.

(5) By Ryan Smith (cuz) on 2020-05-06 15:33:40 in reply to 1 [link] [source]

It's a bit hard to follow the exact requirement, but here are some ways of doing what I /think/ might be (at least partly) what you wanted to do.

It starts by creating an example old table that had a unique key and then fills it with a bit of data.

Then makes a new table with 2 Unique keys, one primary, one simply another standard unique key.

It then copies the old values to the new table using one direct copy and one Window-function example way of filling unique keys...

Then, imagining the new system is now running on its own, adds its own further rows with the PK automatically incrementing (Note: without "Autoincrement" being specified), and shows a way to fill out the second unique column with Unique keys after-the-fact. (Of course it should ideally be filled by the application directly).

  -- Processing SQL in: D:\Documents\SQLiteAutoScript.sql
  -- SQLite version 3.30.1  [ Release: 2019-10-10 ]  on  SQLitespeed version 2.1.3.11.
================================================================================================

CREATE TABLE Old_Table(OldUniqueID INTEGER PRIMARY KEY, OldField2, OldField3);

INSERT INTO Old_Table(OldField2, OldField3) VALUES
  ('John Smith','Tennis')
 ,('Joan Smith','Football')
 ,('James Smith','Snooker')
 ;

SELECT * FROM Old_Table;

  --  OldUniqueID|OldField2    |OldField3 
  -- ------------|-------------|----------
  --       1     |John Smith   |Tennis    
  --       2     |Joan Smith   |Football  
  --       3     |James Smith  |Snooker   


CREATE TABLE New_Better_Table(
  NewUniqueID INTEGER PRIMARY KEY,
  NewSecondaryUniqueKey INT UNIQUE,
  NewDefaultInt INT DEFAULT 5,
  NewField4,
  NewField5
);


INSERT INTO New_Better_Table(NewUniqueID, NewSecondaryUniqueKey, NewField4,NewField5)
SELECT OldUniqueID, row_number() OVER (PARTITION BY 1), OldField2, OldField3
  FROM Old_Table
;

SELECT * FROM New_Better_Table;

  --             |NewSecondar-|NewDefau-|             |          
  --  NewUniqueID| yUniqueKey |  ltInt  |NewField4    |NewField5 
  -- ------------|------------|---------|-------------|----------
  --       1     |      1     |    5    |John Smith   |Tennis    
  --       2     |      2     |    5    |Joan Smith   |Football  
  --       3     |      3     |    5    |James Smith  |Snooker   


-- Brand new rows adding (with PK incrementing automatically):

INSERT INTO New_Better_Table(NewField4,NewField5) VALUES
  ('John Jones','Badminton')
 ,('Joan Jones','Soccer')
 ,('James Jones','Pool')
 ;


SELECT * FROM New_Better_Table;

  --             |NewSecon-|         |             |           
  --             |daryUniq-|NewDefau-|             |           
  --  NewUniqueID|ueKey    |  ltInt  |NewField4    |NewField5  
  -- ------------|---------|---------|-------------|-----------
  --       1     |1        |    5    |John Smith   |Tennis     
  --       2     |2        |    5    |Joan Smith   |Football   
  --       3     |3        |    5    |James Smith  |Snooker    
  --       4     |NULL     |    5    |John Jones   |Badminton  
  --       5     |NULL     |    5    |Joan Jones   |Soccer     
  --       6     |NULL     |    5    |James Jones  |Pool       


-- After-the-fact key adding:

WITH NMAX(PK,NewKey) AS (
    SELECT MAX(NewUniqueID)+1, MAX(NewSecondaryUniqueKey)+1
      FROM New_Better_Table
     WHERE NewSecondaryUniqueKey IS NOT NULL
    UNION ALL
    SELECT PK+1, NewKey+1 FROM NMAX WHERE PK < 1000 -- however much needed
)
UPDATE New_Better_Table SET
  NewSecondaryUniqueKey = (SELECT NewKey FROM NMAX WHERE PK = NewUniqueID)
 WHERE NewSecondaryUniqueKey IS NULL
;


SELECT * FROM New_Better_Table;


  --             |NewSecondar-|NewDefau-|             |           
  --  NewUniqueID| yUniqueKey |  ltInt  |NewField4    |NewField5  
  -- ------------|------------|---------|-------------|-----------
  --       1     |      1     |    5    |John Smith   |Tennis     
  --       2     |      2     |    5    |Joan Smith   |Football   
  --       3     |      3     |    5    |James Smith  |Snooker    
  --       4     |      4     |    5    |John Jones   |Badminton  
  --       5     |      5     |    5    |Joan Jones   |Soccer     
  --       6     |      6     |    5    |James Jones  |Pool

HTH! - Cheers, Ryan

(6) By Larry Brasfield (LarryBrasfield) on 2020-05-06 15:42:20 in reply to 4 [link] [source]

I respond, not to argue the point [a], but to clarify for the OP why I recommended using autoincrement for what I perceived as his problem.

[a. I might quibble with "99.9999999999999% of the time ..." because I am sure that my own useful usages alone would invalidate that figure. ]

My assumptions regarding the OP's problem are: (1) That his "old unique key fields" were serving as the primary key for their respective tables; (2) That he does not much care what the key values are after the transition to SQLite; (3) That after the transition the DB will still be live, with new rows being added sometimes which will need new (primary) key values [b]; and (4) That there are link relations among the old primary key values and their use as foreign keys in the rest of the old data. [c]

[b. I inferred this assumption from the OP's statement, "I need it to auto fill itself with a value", which nearly encapsulates the most common use case for the autoincrement feature. ]

[c. That last PK/FK assumption was unclear, so I prompted for its clarification. ]

It was only assumption 3, founded on the OP's "need it to auto fill" claim, that led to my recommending use of autoincrement. If, in fact, there is another good source of unique primary key values in the incoming new data, (or there is to be no new data), where the primary key values will always be inserted, (or never be needed), then the "AUTOINCREMENT" is just excess baggage, a waste of typing, storage and CPU cycles. However, if the OP would have to do any work or have the DBMS do any work to derive new primary key values, then AUTOINCREMENT is an efficient way to get that done. (And, the OP's use case may fall among the 0.0000000000001% or maybe higher fraction of cases where the feature is well used.)

(7) By Keith Medcalf (kmedcalf) on 2020-05-06 17:07:16 in reply to 6 [link] [source]

There is insufficient information in the original request to make any sort of meaningful comment of any sort. No information as to the "content" of the so-called magical primary key is provided.

Secondly, the following declaration provides for automatically incrementing unique pseudokeys (rowid) per table row:

create table t
(
  id integer primary key,
  ... other columns ...
);

If a record is inserted and the id is NULL, then it will be assigned a value 1 greater than all the id's currently in the table, unless the table is empty, in which case the value 1 will be used, and unless it overflows a 64-bit signed integer in which case an unused value will be chosen at random, and it no value can be found then "out of space" will be returned.

Using this declaration instead:

create table t
(
  id integer primary key autoincrement,
  ... other columns ...
);

is exactly and precisely the same thing EXCEPT that extra CPU and I/O are expended to change the "currently" into "ever" and so that an 64-bit integer overflow results in "out of space" rather than re-using a free row number.

There is almost ZERO likelihood that using "autoincrement" is needed other than for its "feel good" effects. There are extremely rare cases where it is needed but that is the very rare exception rather than the rule.

Note that for a ROWID table the only actual primary key is the INTEGER PRIMARY KEY or the rowid, whether explicitly stated on not. A declaration of a PRIMARY KEY (that is not an INTEGER PRIMARY KEY) is merely an alternate way of spelling UNIQUE which is merely an alternate way of declaring a UNIQUE INDEX.

There is no way to declare an "arbitrary" column (that is not the INTEGER PRIMARY KEY) to contain incrementing values. Only the rowid (integer primary key) is assigned automatically. You could, however, create triggers to "emulate" incrementing values in any column.

(8) By Keith Medcalf (kmedcalf) on 2020-05-06 17:12:13 in reply to 7 [source]

Sorry, the original does specify that the columns are of type integer. Therefore declaring them as INTEGER PRIMARY KEY fulfills all the requirements.

(9) By Goravani (DasGoravani) on 2020-05-06 18:27:18 in reply to 1 [link] [source]

I’m more informed now.

Some answers refer me back to the trick of having your primary key be an integer, and then it will reflect the rowed. However in my attempts I have found that it doesn’t return any number. When I save a record then fetch it back that field is zero. That’s my first problem. The main trick used for this purpose doesn’t seem to work. I realize that it should, but I find I only get zeros. That’s problem one. Which leads to my main problem still needing resolution… that is I need for new records a number to be assigned in the old column which contains 4000 records that already have a number assigned, which may be used for stored connections with other tables. So I want to preserve the old, and have new records get a new number. Higher than the last number used.

To achieve this the one poster Larry Brasfield appears to be most on top of it. However, he says some things that confuse me. He says to recreate my tables and use MUMBLEID AND AUTOINCREMENT.

My first question is what does MUMBLEID do or mean?

Then I’m not clear.. on new records, do I have to pack the unique field myself, or will it get packed automatically? One line refers to me fetching the last rowid and makes it seem like I have to do that to pack the field with that data.. which I can do, but I thought auto increment would do it for me. Or do you mean I have to fetch it only because that is my unique number and it will be inserted but if I want it sooner than a new fetch I need to fetch it after an insert.

My users are only single user always.. so fetching the last rowid inserted does make sense in my case.

So I’m still not clear on if a number will be inserted for me, into that field, such that if I fetch the record back after an insert, that field will have a number in it. That is my aim. That when I save a NEW record a number will be put into that column such that when I fetch the record back it will have a number in that column that is unique.

Without full knowledge I got the FEELING that Larry;s answer was the closest… that mumbleID and auto increment are the answer.

If however just having an INTEGER PRIMARY KEY field, which I already have, would do the trick, yes that column I’m interested in is already the Integer Primary Key.. if it would pack itself with a unique value that respects the other values already in that column in that table, then I would be set, but I just get back zeros in that column when I fetch after an insert.

So there’s my need, unique on top of data already there, verse the problem of the primary key that is allready there not returning a value, but that system may not respect the data I already have in many records within the tables in question.

I hope that clarifies my needs and situation.

Thank you all for responding, I really appreciate it.

Das Goravani

On May 5, 2020, at 5:09 PM, Larry Brasfield <noreplyf16e6c3f5@sqlite.org> wrote:

Forum post by LarryBrasfield on 2020-05-06 00:09:12
https://sqlite.org/forum/forumpost/ccf3f59754

Your problem seems to be easily solved.

For your new tables, be sure to declare the unique integer value column as: " mumbleId INTEGER PRIMARY KEY AUTOINCREMENT, " with mumbleId chosen more intelligently. This will be required to be unique as a consequence of being the primary key, and, per [this Rowid Tables](https://sqlite.org/rowidtable.html) documentation, will be an alias for the underlying rowid. (You may not care about this, but it improves efficiency somewhat.) Then, simply insert your values, specifying the mumbleId column values as the same unique numeric integer values from your old DB. The autoincrement tracking will take care of tracking the highest value inserted so that, if ever an insert is done without specifying the mumbleId value, it can be generated. That would be the only circumstance where you would need to "fetch the last rowid after an insert."

(10) By Goravani (DasGoravani) on 2020-05-06 18:39:21 in reply to 1 [link] [source]

My app is Astrology, it is not a business app, it's always single user and desktop, so the CPU cycles needed for auto increment are probably not a problem.. this is not mission critical or multi user or LAN or WAN, it's non of that, it's local, on a desktop machine, Mac or Windows, single user.. at home in peace.. so to speak :-)

To be clear: I have about 50 tables that already have an Integer Primary Key column which is an old column that was used to contain my unique key values that connect records in other tables, if these would assign unique values higher than the preexisting values in that column, I would be fine already, but I'm getting zeros back from fetches after inserts.. it appears the field is not inserting any number, rowid et al.. I'm not getting it back.

So I remain with my problem, which my previous post makes clear. My data sizes in some tables reach a quarter million records, in most it's just a few thousand.

If I am to add any column constraints such as auto increment or mumbleID I realize I have to recreate my tables correct? I have to copy old aside, recreate, copy data back, delete unneeded copy, I have to do that to add these constraints correct? Wish that alter table could do it but whatever.

So before I recreate with mumbleId and auto increment just wanted to get one last reassurance that that is a good plan and find out what mumbleID means or does..

(11) By tom (younique) on 2020-05-06 20:59:06 in reply to 4 [link] [source]

I see it a bit differently. AUTOINCREMENT ensures that row IDs keep their chronological order. To me, this is a very important property in most cases.

(12) By Keith Medcalf (kmedcalf) on 2020-05-06 22:07:56 in reply to 11 [link] [source]

I do not see how you mean this. If the rowid of some record is greater than the rowid of some other record, then the rowid with the greater numerical rowid was added chronologically subsequent to the row with the lesser rowid unless you set the rowid's manually, in which case the "autoincrement" keyword makes no difference.

The only difference is the change that "one more than the current max rowid" beccomes "one more that the maximum rowid ever" and that if you run out of positive 64-bit signed integers you get an "out of space" error rather than trying to find an ununsed rowid before returning an error.

Save "user specified rowids" a greater rowid always means a later insert chronologically than all the other rowids that exist in the table.

(13) By Keith Medcalf (kmedcalf) on 2020-05-06 22:11:08 in reply to 10 [link] [source]

Your "integer primary key" is probably not spelled correctly. You must spell it exactly as "integer primary key".

The other possibility is that you are inserting rows and specifying the rowid. It is only automatically generated if it is NULL.

Please show the table definition of a table that is misbehaving.

(14.1) By Larry Brasfield (LarryBrasfield) on 2020-05-06 23:04:02 edited from 14.0 in reply to 9 [link] [source]

Responding to your issues and questions in order:

When I wrote 'mumbleId' in the position of a column name, and "with mumbleId chosen more intelligently", I could have elaborated that you might wish to use a column name for your primary key that is more meaningful. Some people just use 'id' or the like, but joins can be more conveniently written when foreign keys and primary keys have the same name. Unless you have a table named 'Mumble', 'mumbleId' is probably a poor choice. Regard it as a stand-in for better choices.

As Keith said, (and as I now agree, having relearned something about SQLite), there is likely no need to use AUTOINCREMENT, and certainly no such need evident in what you have posted so far. See ROWIDs and the INTEGER PRIMARY KEY and SQLite Autoincrement in particular. So just write something like (but differing from) "CREATE TABLE Something( somethingId INTEGER PRIMARY KEY, ...)". You can then insert your old data, taking care that the old unique integer key values are inserted as the (perhaps) newly named primary key column values, and SQLite will take care of generating more unique key values as you subsequently insert new data rows.

You keep writing things such as "but I just get back zeros in that column when I fetch after an insert" and "the primary key that is allready there not returning a value". Neither I nor anybody else who might help you with those problems has enough, from such summary or conclusory assertions, to divine what you are doing differently from what has a chance of working. So, I repeat my earlier advice: It would help to show DDL and queries, together with results that seem awry. This means showing your DDL and queries verbatim, rather than your opinion as to what effects they might or should have, and what actual results you observed rather than summaries that are likely to obscure what happened.

SQLite has a nice little shell with which you can demonstrate, with just a few lines of SQL, what you did and what happened. That, together with what you expected, will suffice to permit other forum participants to help you, by either adjusting your expectation(s) or informing you what to do differently.

(15) By Ryan Smith (cuz) on 2020-05-06 23:18:27 in reply to 9 [link] [source]

Your English is either a third language (like mine) or you are using a translator, but something is getting lost in translation.

The only real thing that would explain to us your problem is to post your database schema here.

Show your exact table definitions, and how you declared these unique keys that do not work. Maybe then we might understand and help you better.

To be clear: If you declare a column as "INTEGER PRIMARY KEY" in any table (except WITHOUT ROWID tables) in SQLite, it MUST return an incremental unique value for that column, else it is not SQLite or it is a spelling mistake as Keith guessed at.

So please, just post the schema (the SQL used to create the tables). We should be able to help easily then.