SQLite User Forum

calculate the hamming distance between a table column signed integer and a provided signed integer
Login

calculate the hamming distance between a table column signed integer and a provided signed integer

(1) By anonymous on 2024-03-15 17:06:41 [link] [source]

i am trying to calculate the hamming distance between a column in my table and i value which i will provide, but i am finding it difficult to get this working. here are somethings i have tried. ##

SELECTR * from table order by (hash | -122323332756) - (hash & -122323332756) limit 5;

the results from here just repeats one particular row multiple times

(2) By Kees Nuyt (knu) on 2024-03-15 19:46:01 in reply to 1 [link] [source]

Cannot reproduce.

CREATE TABLE T1 (
        id   INTEGER PRIMARY KEY NOT NULL
,       hash INTEGER
);
INSERT INTO T1 (hash) VALUES
        (4562457367)
,       (7890677417)
,       (6723590123)
,       (1092843711)
,       (9085027854)
,       (7655468684);
SELECT * from T1 order by (hash | -122323332756) - (hash & -122323332756) limit 5;
id  hash
--  ----------
5   9085027854
3   6723590123
6   7655468684
2   7890677417
1   4562457367

Tested with sqlite3 CLI 3.46.0 2024-03-11 20:34:02 7805844bc4 (32-bit) trunk

(3) By jose isaias cabrera (jicman) on 2024-03-15 20:26:25 in reply to 2 [link] [source]

Cannot reproduce.

however, if you insert the same values thrice,

INSERT INTO T1 (hash) VALUES
        (4562457367)
,       (7890677417)
,       (6723590123)
,       (1092843711)
,       (9085027854)
,       (7655468684);
INSERT INTO T1 (hash) VALUES
        (4562457367)
,       (7890677417)
,       (6723590123)
,       (1092843711)
,       (9085027854)
,       (7655468684);
INSERT INTO T1 (hash) VALUES
        (4562457367)
,       (7890677417)
,       (6723590123)
,       (1092843711)
,       (9085027854)
,       (7655468684);

this happens,

sqlite> SELECT * from T1 order by (hash | -122323332756) - (hash & -122323332756) limit 5;
┌────┬────────────┐
│ id │    hash    │
├────┼────────────┤
│ 5  │ 9085027854 │
│ 11 │ 9085027854 │
│ 17 │ 9085027854 │
│ 3  │ 6723590123 │
│ 9  │ 6723590123 │
└────┴────────────┘
VM-steps: 250
Run Time: real 0.000 user 0.000000 sys 0.000000

which could be what he may be experiencing. So, he may have multiple values repeated. Dunno.

(4) By Adrian Ho (lexfiend) on 2024-03-16 13:01:21 in reply to 1 [source]

the results from here just repeats one particular row multiple times

That would be the expected result if there were multiple rows with the same hash, or for which (hash | -122323332756) - (hash & -122323332756) resolves to the same value. (Sidenote: My CS days are decades behind me, but I don't recall ever calculating Hamming distance with that odd formula.)

What's the output of:

-- identical hashes
SELECT hash, count(*) AS c FROM table GROUP BY hash ORDER BY c DESC LIMIT 5;

-- identical distances
SELECT (hash | -122323332756) - (hash & -122323332756) AS d, count(*) AS c FROM table GROUP BY d ORDER BY c DESC LIMIT 5;

?