SQLite Archiver

Artifact e039153c413534cc80bee681fd02de082d66fad4:

SAR - SQLite Archiver

This repository contains sources for a proof-of-concept "SQLite Archiver" program. This program (named "sar") operates much like "zip", except that the compressed archive it builds is stored in an SQLite database instead of a ZIP archive.

The motivation for this is to see how much larger an SQLite database file is compared to a ZIP archive containing the same content. The answer depends on the filenames, but 2% seems to be a reasonable guess. In other words, storing files as compressed blobs in an SQLite database file results in a file that is only about 2% larger than storing those same files in a ZIP archive using the same compression.


To build on Unix, just type "make". The SQLite sources are included. The zlib compression library is also needed to build.


To create an archive:

    sar ARCHIVE FILES...

All files named in FILES... will be added to the archive. If another file with the same name already exists in the archive, it is replaced. If any of the named FILES is a directory, that directory is scanned recursively.
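The add-and-replace behaviour described above can be sketched in Python with the standard sqlite3 module. This is an illustration only, not the sar source: it assumes a table named sar with the columns listed in the schema section of this README, stores content uncompressed for brevity, and uses hypothetical helper names (add_path, create_archive). Storing the path exactly as given on the command line, and keeping only the permission bits in mode, are also assumptions.

```python
import os
import sqlite3

# Assumed table layout, based on the column list in the schema section.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sar(
  name TEXT PRIMARY KEY,
  mode INT,
  mtime INT,
  sz INT,
  data BLOB
);
"""

def add_path(db, path):
    """Add one file, or a directory and everything below it (hypothetical helper)."""
    st = os.stat(path)
    mode = st.st_mode & 0o7777  # assumption: keep only the permission bits
    if os.path.isdir(path):
        # Directory row: sz is 0 and data is NULL.
        db.execute(
            "REPLACE INTO sar(name, mode, mtime, sz, data) VALUES(?,?,?,0,NULL)",
            (path, mode, int(st.st_mtime)),
        )
        # Scan the directory recursively.
        for entry in sorted(os.listdir(path)):
            add_path(db, os.path.join(path, entry))
    else:
        with open(path, "rb") as f:
            content = f.read()
        # REPLACE overwrites any existing row with the same name,
        # because name is the primary key.
        db.execute(
            "REPLACE INTO sar(name, mode, mtime, sz, data) VALUES(?,?,?,?,?)",
            (path, mode, int(st.st_mtime), len(content), content),
        )

def create_archive(archive, files):
    """Open (or create) the archive database and add every named path."""
    db = sqlite3.connect(archive)
    db.executescript(SCHEMA)
    for path in files:
        add_path(db, path)
    db.commit()
    return db
```

Because name is the primary key, adding the same file twice leaves a single row holding the newer content.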

To see the contents of an archive:

    sar -l ARCHIVE

To extract the contents of an archive:

    sar -x ARCHIVE [FILES...]

If a FILES argument is provided, then only the named files are extracted. Without a FILES argument, all files are extracted.

All commands can be supplemented with -v for verbose output. For example:

    sar -v ARCHIVE FILES...
    sar -lv ARCHIVE
    sar -xv ARCHIVE

Files are normally compressed using zlib prior to being stored as BLOBs in the database. However, if the file is incompressible or if the -n option is used on the command line, then the file is stored in the database exactly as it appears on disk, without compression.
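The storage decision can be sketched as follows, using Python's zlib module. The function name encode_blob and the no_compress parameter (standing in for the -n option) are hypothetical. Treating "incompressible" as "the compressed form is not strictly smaller than the original" is an assumption chosen to agree with the size test described in the schema section.

```python
import zlib

def encode_blob(content: bytes, no_compress: bool = False) -> bytes:
    """Return the bytes to store in the BLOB column for one file (a sketch)."""
    if no_compress:
        # Equivalent of -n: store the file exactly as it appears on disk.
        return content
    packed = zlib.compress(content)
    # Keep the compressed form only when it is strictly smaller,
    # otherwise fall back to the original bytes.
    return packed if len(packed) < len(content) else content
```

With this rule, a reader can tell the two cases apart from the sizes alone: a stored blob shorter than the original size is compressed, and a blob of equal length is the raw content.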


The database schema looks like this:

    CREATE TABLE sar(
      name TEXT PRIMARY KEY,  -- name of the file
      mode INT,               -- access permissions
      mtime INT,              -- last modification time
      sz INT,                 -- original file size
      data BLOB               -- compressed content
    );

Both directories and empty files have sar.sz==0. Directories can be distinguished from empty files because directories have sar.data IS NULL. The file is compressed if length(sar.data)<sar.sz and is stored uncompressed if length(sar.data)==sar.sz.
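A reader of the archive could apply these rules as in the Python sketch below. It assumes the table is named sar and that the BLOB column is the one called data in the schema. The function names entry_kind and read_entry are hypothetical and are not part of the sar program.

```python
import sqlite3
import zlib

def entry_kind(sz, data):
    """Classify one row from its sz and data columns."""
    if data is None:
        return "directory"      # data IS NULL
    if len(data) < sz:
        return "compressed"     # stored blob is shorter than the original
    return "stored"             # same length, which includes empty files

def read_entry(db, name):
    """Return the original file content, or None for a directory."""
    row = db.execute(
        "SELECT sz, data FROM sar WHERE name=?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    sz, data = row
    kind = entry_kind(sz, data)
    if kind == "directory":
        return None
    if kind == "compressed":
        return zlib.decompress(data)
    return bytes(data)
```

An empty file is stored as a zero-length blob, so it falls into the "stored" case rather than being mistaken for a directory.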