can I insert an 8GB file into sqlite?

(1) By Andrew (andykais) on 2021-05-22 16:46:12 [link] [source]

Hi, I am entirely new to virtual file systems. I want to build a local web app that uses sqlite as its file system. This is for a number of reasons:

portability
ease of migration
avoiding folder count limits
this article which I found very intriguing https://www.sqlite.org/fasterthanfs.html

The database will be used to store media files (images, videos and audio) and relevant metadata and tags. File sizes can vary from 100KB to 30GB. (I put 8GB in the title just for an easy to digest question)

Again, possibly a very naiive approach, but I plan to write this in typescript with the npm package better-sqlite3. Without using the blob_open, blob_read and blob_write functions, I can access large files stored in sqlite using substr on BLOBs piece-wise (for streaming to a web page). However the part I am unsure of is if I am even able to store large files in sqlite. Attempting to change the SQLITE_MAX_LENGTH, I saw this comment:

The hard limit is the ability of a 32-bit signed integer to count the size: 2^31-1 or 2147483647.

which pretty much sounds like I am limited to ~2GB max file storage. In addition, even if files were under 2GB, I dont see any way to concatenate BLOBs or TEXT into a row without reading the whole column into memory first. Even reading 2GB into memory is pretty resource intensive and seriously limits the amount of things my program can do concurrently. Can I ask for some guidance on this here? Thank you very much in advance!

(2.1) By Warren Young (wyoung) on 2021-05-22 17:28:44 edited from 2.0 in reply to 1 [link] [source]

portability

You think a single file with potentially terabytes of content is portable?

I mean, houses are portable by the same logic. All you have to do…all you have to do… is lift them up on massive hydraulic jacks and put them on a container ship to move them across continents. What's the problem?

I think you'll find that a pile of tens of thousands of gigabyte-scale files are more easily portable for the same reason that when moving, leaving the house behind and packing up on the house's contents makes more sense.

I'm not telling you that you shouldn't use SQLite for terabyte-scale databases. I'm just telling you that if you expect to need to move it around a lot, you'll find it easier to do on a pile-of-files so you can retry broken transfers, use rsync sanely, etc.

this article which I found very intriguing

Part of what's going on in that article is that the entire DB can get loaded into RAM, and the OS will treat it as a single file, which changes how caching works relative to thousands of separate files.

I doubt its results are very relevant to your particular application.

avoiding folder count limits

Those limits in filesystems aren't arbitrary: they exist to keep the filesystem fast. It takes too much time to index arbitrary millions of directory entries, so they set a finite limit to encourage you to make use of hierarchy to keep index sizes sane.

The very same forces exist inside a SQLite database. It's just C code, not magic. Millions of DB rows have an indexing overhead, too.

I plan to write this in typescript

…and you're concerned with efficiency?

Decide: do you want the program to be fast to execute, or do you want it to be fast for you to write? You can't get both.

I'm not saying it's wrong to write your service in Typescript. For network-facing applications, the latencies and bandwidth limitations of the network often make the inefficiency of the remote network service irrelevant, which is why we have so much scripting on the server side. However, you'd best be clear about this and stop trying to worry about server-side efficiency.

Having said that, realize that when you start talking about scaling the application, efficiency can start mattering again. The tech giants are rewriting Perl, Python, PHP, etc. services in C++, Go, Rust, etc. in order to save money on data centers. Even counting Silicon Valley engineering salaries, it can pay off.

Another common situation is writing efficient services so you can avoid paying a lot in cloud fees. Back when everything was on-premises, taking 10x the development time to get a service to run in 1% of the total system resources versus 10% made no sense, but now that we're loading many customer loads onto each machine in cloud architectures, that 10x inefficiency cost goes directly into hosting fees.

It's the same tradeoff as for the big guys: is it cheaper to pay for the engineering to get it to be 10x faster or is it cheaper to just pay 10x in hosting fees?

Both approaches have their points. You need to think clearly about which approach matters in your particular situation.

sounds like I am limited to ~2GB max file storage.

Yes, for a single chunk.

So, you could chunk the files, storing 2 GiB or less per chunk. I'd actually suggest something more on the order of 1 MiB. That's enough to saturate any network buffers, so that you're sure to have the second megabyte of data up in RAM before you finish sending the first. Once you reach that point, there's no point making the chunks bigger, if the point is to serve them over the network.

Or, you can do as iTunes, Plex, etc. do and store the actual files outside the DB, keeping only the metadata in the DB itself.

(3) By anonymous on 2021-05-22 20:49:40 in reply to 2.1 [link] [source]

Actually it’s 1GB not 2GB. And yes, chunks of 1MB worked well in my case (with 16KB pages).

(12) By Andrew (andykais) on 2021-05-23 02:53:11 in reply to 3 [link] [source]

(with 16KB pages)

can you elaborate on what you mean here? Is there a configuration option which specifies page sizes? Or are you referring to the number of rows your database had?

(13) By Larry Brasfield (larrybr) on 2021-05-23 03:28:44 in reply to 12 [link] [source]

See the page_size pragma.

(5) By Andrew (andykais) on 2021-05-22 22:23:25 in reply to 2.1 [link] [source]

You think a single file with potentially terabytes of content is portable?

I think I was more excited about how storing everything in the database means better coupling, and better guarantees. E.g. any code related to handling missing files attached to sqlite rows just goes away if I have a foreign key constraint in my database. Likewise copying the database just means copying a single file, and insuring that the checksums match. This is in contrast to copying the database, then walking tens of thousands of files and insuring that their checksums match with the originals.

Decide: do you want the program to be fast to execute, or do you want it to be fast for you to write? You can't get both.

Efficiency was never a large goal of this. Apologies for bringing up the "faster than fs" article. It was simply where I found inspiration for sticking a file system in sqlite. This will be a single-user local web application. That means low concurrent access to the database. What I need is fast IO and fast http transfers. Both of which are possible in javascript using libraries that leverage lower level C code. Javascript just happens to be a more agile and portable language (for me) for what I want to build.

So, you could chunk the files, storing 2 GiB or less per chunk. I'd actually suggest something more on the order of 1 MiB.

Ok cool, this is definitely something I know how to build, but I feel like if theres a concern around indexing millions of files in a database, then that problem is exacerbated by storing each file in 1MiB chunks in the database. In the case of an 8GB file, a single file would create 8000 rows. It would only take 125 files to reach 1 million rows. If sqlite row access speeds start to break down at around 1 million rows, does that mean this idea just isn't feasible at all? I do expect the number of files stored to be somewhere in the hundreds of thousands, and the total storage size to be hundreds of gigabytes.

(7) By Warren Young (wyoung) on 2021-05-22 22:47:42 in reply to 5 [link] [source]

copying the database just means copying a single file, and insuring that the checksums match.

This is fine in principle, but when the size of the DB gets so big that it takes days and days to transfer, what happens when there's an interruption?

Even something like zfs send doesn't copy the entire filesystem in a single copy: it collects the changed blocks and sends only those blocks. And it's now resumable. How do you get the same features with SQLite?

What I need is fast IO and fast http transfers

Write speed isn't going to differ greatly between the cases, and as for reading, I doubt you're going to find anything faster than sendfile(2), which doesn't work for files buried inside a DB.

You'll notice that the "faster than FS" article entirely ignores this aspect of things.

I feel like if theres a concern around indexing millions of files in a database, then that problem is exacerbated by storing each file in 1MiB chunks in the database.

Yes, TANSTAAFL.

the number of files stored to be somewhere in the hundreds of thousands, and the total storage size to be hundreds of gigabytes.

That gives an average file size of 1 MB, which feels like a bait-and-switch relative to the original thread title.

If all we're talking about are a few rare exceptions that break the 1 GiB barrier, then why worry about chunking the files at all? It's only when the vast majority of files are larger than that (e.g. a digital video storage archive) that you have to worry about the cost of chunking, the 1 GiB limit, etc.

(8) By mlin (dnamlin) on 2021-05-22 23:08:30 in reply to 7 [link] [source]

copying the database just means copying a single file, and insuring that the checksums match.

This is fine in principle, but when the size of the DB gets so big that it takes days and days to transfer, what happens when there's an interruption?

There are plenty of resumable, chunked transfer tools to deal with this; consider aria2 and BitTorrent for example. For better or for worse, it's done all the time in several scientific "big data" disciplines that have a taste for colossal individual files. See also HDF5, not to be confused with HDFS =)

What I need is fast IO and fast http transfers. Both of which are possible in javascript using libraries that leverage lower level C code.

We should note that Node.js JIT-compiles your script into native machine code. It opens a pretty nice speed/abstraction "sweet spot" between static compiled executables and interpreted scripts, which has been a big contributor to its success.

(9) By Warren Young (wyoung) on 2021-05-22 23:38:47 in reply to 8 [link] [source]

There are plenty of resumable, chunked transfer tools to deal with this

As long as you're willing to shut all conns to the DB down and leave them down while the transfer continues, sure.

Resumability doesn't let you start the app up, modify the DB some, shut it down, and resume the transfer. That'll almost guarantee a corrupt DB at the end of the transfer. It also doesn't let you copy the DB once, do some work on the DB, and then copy only the changes, as zfs send will, given suitable snapshots.

Indeed, snapshotting the DB under ZFS and then doing a zfs send is a much better plan than using userspace-level resumable transfer tools on a SQLite DB. You might lose the last few modifications in uncommitted transactions, but it's no worse than kill the app while it runs. The journal gets replayed, and the DB is back in action again.

it's done all the time in several scientific "big data" disciplines that have a taste for colossal individual files.

Are these individual files random-access, modified internally as SQLite DBs are, or are they written once, then possibly appended to only?

Node.js JIT-compiles your script into native machine code.

Ah, that explains why VSCode feels about as fast as 4.3 BSD vi over a 9600 bps VT220 connection, then.

JIT isn't magic. Dynamic languages have unavoidable costs, no matter how well optimized, which is why statically-compiled languages are still being created and used.

(10) By mlin (dnamlin) on 2021-05-23 01:28:14 in reply to 9 [link] [source]

First you asked,

when the size of the DB gets so big that it takes days and days to transfer, what happens when there's an interruption?

So I answered, then new requirements appeared about copying the database whilst write transactions are going against it. Not everybody needs that, but a filesystem-independent solution lies in driving SQLite's online backup API in a background thread, then transfer the consistent backup. No doubt one can come up with yet-unstated requirements that won't work for, but, we've covered a lot of ground already!

Node.js JIT-compiles your script into native machine code.

Ah, that explains why VSCode feels about as fast as 4.3 BSD vi over a 9600 bps VT220 connection, then.

9600bps is an order of magnitude faster than vi was designed for. =)

(11) By Andrew (andykais) on 2021-05-23 02:44:59 in reply to 7 [link] [source]

If all we're talking about are a few rare exceptions that break the 1 GiB barrier, then why worry about chunking the files at all? It's only when the vast majority of files are larger than that (e.g. a digital video storage archive) that you have to worry about the cost of chunking, the 1 GiB limit, etc.

This is true. Its likely that the majority of files will not exceed the 1GB limit. Though I can imagine ~100 files over a GB being pretty standard (e.g. storing a whole movie collection in the database).

That gives an average file size of 1 MB, which feels like a bait-and-switch relative to the original thread title.

Again, apologies. I really am not intending to deceive anyone here. I did not know which parts of this idea would be relevant and which would not. My concern really was with the large blob inserts (and now blob reads, since according to comments here, substr will read the whole blob into memory), which I did not know how to accomplish.

(14) By Kevin Youren (KevinYouren) on 2021-05-26 21:17:38 in reply to 11 [source]

Andrew,

I suggest you search and read the Sqlite website for SQLAR, for SQLite Archive Files.

It will very nicely store photos, etc.

The schema is:

sqlite> .schema
CREATE TABLE sqlar(
  name TEXT PRIMARY KEY,
  mode INT,
  mtime INT,
  sz INT,
  data BLOB
);

As it is an Sqlite database, you can quite happily add any extra tables you want. 

For example, a table containing the names your camera uses for stored photos together with a description you can relate to. Similar to the idea of writing on the back of paper/cardboard photos. "Grandma with grandkids", etc.

Storing stuff over 1Gb is possible, by splitting the file into smaller junks.

That is, a bit of programming required for splitting and recombining.

sqlite> select name, mode, mtime, sz from sqlar;
fred.part1|33204|1621820988|913975706
fred.part2|33204|1621822838|912242597
fred.part3|33204|1621827849|909762547
fred.part4|33204|1621832248|907433793
fred.part5|33204|1621835715|205364403


I used a similar technique in the 90s, transferring files between Melbourne (Australia) and San Ramon (California) nightly. Megabytes in those days.

(15) By Kevin Youren (KevinYouren) on 2021-10-19 23:24:01 in reply to 11 [link] [source]

Andrew,

is there any more progress with your requirement?

The largest user type file I have is panlex_lite.zip which is 2.7G,

and it contains a 7.2 GB sqlite database.

(4) By mlin (dnamlin) on 2021-05-22 21:17:36 in reply to 1 [link] [source]

Without using the blob_open, blob_read and blob_write functions, I can access large files stored in sqlite using substr on BLOBs piece-wise (for streaming to a web page).

Although a BLOB value can be quite large, I don't think SQLite is capable (I'd be happy to be corrected!) of super-efficiently seeking to a random offset within it. That is, substr() or sqlite3_blob_read() with a random offset k is liable to take O(k) time (although it may be faster than that in some cases). This is because of how SQLite internally splits the blob into a linked-list of fixed-size pages in the database file, which it has to traverse in order to find the desired offset.

That's another reason, in addition to those you go on to mention, why you probably want an application-level mechanism for splitting a huge file into BLOB chunks stored across multiple rows. I have an idea for a virtual table extension to handle this in a general way, but haven't gotten to it!

(6) By anonymous on 2021-05-22 22:32:07 in reply to 1 [link] [source]

You can do substr on blob columns, but SQLite will always read the whole blob into memory.