SQLite User Forum

Remove duplicate rows?
Login

Remove duplicate rows?

(1.1) By Gilles on 2022-08-20 14:48:59 edited from 1.0 [link] [source]

Hello,

Out of curiosity, is there a simpler way to remove duplicate rows?

CREATE TABLE tmp_books AS SELECT DISTINCT * FROM books;
DROP TABLE books;
ALTER TABLE tmp_books RENAME TO books;
CREATE INDEX title_idx ON books(title);

Thank you.

(2) By anonymous on 2022-08-20 13:19:31 in reply to 1.0 [source]

delete from books where rowid not in
    (select min(rowid) from books
        group by title,
            /* add the rest of the column names.  no, you can't use * here. */
    );

(Explanation of rowid.)

(3) By Gilles on 2022-08-20 14:49:21 in reply to 2 [link] [source]

Thank you.

(4) By anonymous on 2023-10-02 16:00:21 in reply to 2 [link] [source]

The same Code lefts one duplicate in my table. Any suggestions?

(5) By Spindrift (spindrift) on 2023-10-02 18:30:02 in reply to 4 [link] [source]

The only sensible suggestion is that it is not in fact a duplicate.

You may need to furnish us with more information.

Of course if it is really a single duplicated row, you could just remove one of them directly.

(6) By anonymous on 2023-10-02 22:57:14 in reply to 5 [link] [source]

The usual cause of confusion is that a value which looks like a number is actually a string. To unconfuse, select a quoting output mode.

(7) By Simon Slavin (slavin) on 2023-10-03 16:45:21 in reply to 4 [link] [source]

Try returning typeof(field). If they are both strings, make sure there are no spaces or returns at the beginning or end of the string.

(9.1) By Trudge on 2023-10-04 13:47:37 edited from 9.0 in reply to 4 [link] [source]

Whenever I'm dealing with output (string or numeric), I sometimes bookend the output with '-->' and '<--' so I can see if there are any leading or trailing spaces or other characters.

So in Perl I would say

print qq{-->$ThisString<--};

This shows me immediately any leading or trailing unwanted characters.

In the case of thousands of lines to verify / sanitize, I would output the whole result to a text file then use the appropriate tool to look for and make any changes (diff, sed, awk, perl, etc.).

(8.1) By punkish on 2023-10-03 19:11:56 edited from 8.0 in reply to 1.1 [link] [source]

while the other answers are great, I see nothing wrong with the above approach… it seems just as simple as any other way. Unless you have bazillions of rows and so, are resource-constrained, the above is as good a solution as any. And, you don't have to name all the columns individually.