SQLite User Forum

Text vs blob in terms of storage
Login

Text vs blob in terms of storage

(1) By FruityMike on 2022-10-09 00:01:20 [link] [source]

Lets say we have two similar tables in two different databases:

CREATE TABLE contract (
    address TEXT NOT NULL CHECK(length(address) = 42),
    PRIMARY KEY(address)
)

CREATE TABLE contract_test (
    address BLOB NOT NULL CHECK(length(address) = 20),
    PRIMARY KEY(address)
)

Basically the first table hold hex strings as text like 0x9a049f5d18C239EfAA258aF9f3E7002949a977a0 while the second table bytes as blob. Because the representation of these bytes take up more space then bytes themselves (42 > 20) I assumed that it would be advantageous to migrate data from the first to the second table. There are ~2.23 million of these entries per table. But for some reason I don't see a difference.

-rw-r--r--   1 baklazanas baklazanas 4.8G Oct  9 00:47 contract.db
-rw-r--r--   1 baklazanas baklazanas 4.8G Oct  9 02:04 contract_test.db

Should I be aware of some sort of a flag that would turn it advantageous, is there a optimization that I'm not aware of or is it an expected result?

(2) By SeverKetor on 2022-10-09 00:05:22 in reply to 1 [link] [source]

Have you run VACUUM on the second (and first, I suppose) DB?

(3) By FruityMike on 2022-10-09 00:08:02 in reply to 2 [link] [source]

no, whats that?

(4) By SeverKetor on 2022-10-09 00:11:40 in reply to 3 [link] [source]

https://sqlite.org/lang_vacuum.html

TL;DR: it rebuilds the DB, getting rid of free pages which shrinks the DB, among other things

(5) By FruityMike on 2022-10-09 00:44:49 in reply to 4 [link] [source]

wow, it seems that solves it. I assumed that things like that are taken care behind the scenes. Result after running vacuum:

-rw-r--r-- 1 baklazanas baklazanas 217M Oct 9 03:42 contract.db -rw-r--r-- 1 baklazanas baklazanas 122M Oct 9 03:42 contract_test.db

(7) By Adrian Ho (lexfiend) on 2022-10-09 01:46:34 in reply to 5 [link] [source]

I assumed that things like that are taken care behind the scenes

I'm sure some kinds of housekeeping can (and are) done by the SQLite library behind the scenes, but it's a library running in your application's context. Automatically triggering heavyweight housekeeping like a VACUUM would literally freeze your application in its tracks. That's also why "stop-the-world" garbage collection in memory management is avoided like the plague in realtime games and other performance-sensitive applications.

Even database servers like PostgreSQL, that hide the impact of housekeeping from clients just by running "somewhere else", only do a limited-scope VACUUM automatically, and only when certain DBA-defined thresholds are crossed. If a DBA triggers a VACUUM during peak periods, all users will likely notice.

(6) By FruityMike on 2022-10-09 00:45:21 in reply to 4 [link] [source]

Thanks for the tip :)

(8) By Chris Locke (chrisjlocke1) on 2022-10-09 06:44:17 in reply to 3 [source]

If you've a database and add 300 MB of data, your database would be approximately 300 MB size. If you delete 100 MB of data, the database size is not reduced by 100 MB, but SQLite will use that 100 MB of free space. So if you add another 300 MB of data, the database size will increase by 200 MB.

Sizes are approximate and 'real-world' figures aren't that accurate ;)