SQLite search with text vs numeric field
(1) By SM (surabhi.nitsurat) on 2020-04-24 11:25:09 [link] [source]
Does sqlite search performance get impacted if I do search on a numeric bigint or a fixed length 128 char string text field, assuming the table size is constant in both the cases and the fields are not primary keys? I assume the answer would be yes, if so, is there any benchmarks on how much would be the performance degrade or any rough estimates? How will the performance differ if index is created in both the cases?
(2) By Keith Medcalf (kmedcalf) on 2020-04-24 21:54:36 in reply to 1 [source]
I have no idea what a "numeric bigint" is. However, the native integer type in SQLite is an 8-byte signed integer. A 128-byte string is 16 times longer than an 8 byte signed integer.
Therefore, one would assume that "searching" 128-byte "things" will take 16 times longer than searching 8 byte "things" simply because the 128-byte "thing" is 16 times bigger than the 8-byte "thing".
That one would even need to ask such a question is surprising.
As to the performance of an index that would depend on the depth and breadth of the index. However, given an equivalent page size, you can fit 16 times more 8-byte keys on a page than you can fit 128-byte keys -- so one would expect the search time taken with 128-byte keys to increase at a rate 16 times faster than it does with 8-byte keys simply because the index will have higher breadth and depth by a factor of 16, and will require 16 times the I/O.
I should think that this would also be obvious.
If you want to know the particulars for your specific data and implementation, you will have to actually give it a try.
(3) By Bill Wade (wwade) on 2020-04-27 12:24:56 in reply to 2 [link] [source]
Index depth probably won't increase by nearly that much.
Fan out depends mostly on the usable record size / (size of the key plus the size of a record pointer (4 bytes in SQLite, I believe)). That reduces the ratio to about 11.
Depth depends on the log of the fan-out, if the index record size is 1k or more, the depth difference probably won't exceed a factor of four.
Of course an index that is 11 times as large is going to have more cache misses, and predicting the effect of that is black magic, so as Keith wisely said, "give it a try."