SQLite Forum

In perspective of disk usage, is the sizeof 'a' < INT ?

In perspective of disk usage, is the sizeof 'a' < INT ?

(1) By anonymous on 2021-12-02 14:26:11 [link] [source]

The 2 strings, "audio" and "video", assuming there's an '\0', makes 6 bytes. If INT is 4 bytes, then 50% difference. Is it worth using foreign keys for those strings with single column tables?

I'm wondering when, if ever, storing strings will use less space than storing an integer.

(2) By Ryan Smith (cuz) on 2021-12-02 16:03:12 in reply to 1 [link] [source]

It is always worth using foreign keys for columns where the values are not unique. It almost always saves space if using with an integer foreign key, but there is more to foreign keys.

  • A Foreign Key is a constraint, it has functionality and can facilitate auto updating, auto deletion and the like.
  • A Foreign Key provides a way to dereference a value, and if that value needs to change (such as you want to use 'Video' in stead of 'video'), then you need to change only 1 item in 1 table.
  • A Foreign Key provides data integrity, avoiding such things as misspelling some entries or finger errors (since only listed values are accepted, or if not constrained, you can still easily see and rectify the wrong ones).

Having a smaller space-saving footprint is just the cherry on top.

(3) By Gunter Hick (gunter_hick) on 2021-12-02 16:12:15 in reply to 1 [link] [source]

See https://sqlite.org/fileformat.html for a detailed description of row format.

A short text will have 1 byte overhead and does not store the terminator, so 'audio' takes up 6 bytes.

Integers need 1 to 9 bytes depending on the value, so yes, somtimes a large value integer takes up more storage than a short string.

If a column is designed to only take on a member of a set of distinct values, then it is usually a good idea to store them in a dictionary of sorts and refer to them by number.

(4.2) By Bill Wade (billwade) on 2021-12-03 15:39:00 edited from 4.1 in reply to 1 [source]

Moving the strings out of your primary table will almost certainly make that table smaller. Of course you probably want the strings to be somewhere, and you'll probably want indexes to make accessing those strings convenient and that will have some overhead.

The two main situations where keeping the string inline could save space are:

  1. The total size of your primary table is only a few pages even with inline strings. The cost of adding a new table or index is measured in pages, so for a small amount of data, a single table will win.

  2. Your strings don't need to be indexed, and duplicated strings are rare. In that case moving the strings out of your primary table shrinks it, but the savings are lost because the string is still somewhere in the database, and so is at least one integer referencing the string.