Hot backup database in WAL mode by coping

(1) By anonymous on 2021-11-16 11:00:13 [link]

Hello,

Since our database can grow quite a bit, we'd like to avoid an extra copy of the main database file. Hence we like to avoid using the SQLite Online Backup API. The [Backup documentation page](https://www.sqlite.org/backup.html) mentions the old option of coping the SQLite database while open and holding a read lock. Is that method safe when using the WAL mode?

From what I understand, `BEGIN IMMEDIATE` should be sufficient to get the lock, then copying the database file as well as the `-wal` file should lead to a coherent backup, is this correct?

As an optimization, we were planning on using `PRAGMA wal_checkpoint(TRUNCATE);` before taking the lock. In this case we could even omit the `-wal` file. Afaict, on restore we'll have to make sure that a `-wal` file will be removed (if present) to avoid having a non-matching `-wal` file.

There is one small concern: From what I understand this does not guarantee that another transaction sneaks in between creating the checkpoint and taking the shared lock. Is there a way to make sure that the two operations are not interrupted by another command?

Best regards,
Stefan

(2) By Simon Slavin (slavin) on 2021-11-16 14:05:16 in reply to 1 [link]

Don't do that.  Opening SQLite files using anything but the SQLite API leads to unexpected locks and errors.  Plus you don't know how to ensure that a SQLite database file and a WAL file are 'compatible'.

If you're trying to capture the changes to a file, you might find this useful:
<https://www.sqlite.org/sessionintro.html>

Alternatively, call your own function instead of the SQLite one to execute changes.  Your function logs the SQL command, then calls the SQLite to execute it.

(3) By anonymous on 2021-11-16 18:56:37 in reply to 2 [link]

> Don't do that. Opening SQLite files using anything but the SQLite API leads to unexpected locks and errors. Plus you don't know how to ensure that a SQLite database file and a WAL file are 'compatible'.
>
> If you're trying to capture the changes to a file, you might find this useful: https://www.sqlite.org/sessionintro.html

No, it is not the intention to capture the changes only.

The goal is to make a complete database backup using tar & compression. We would like to avoid making a local copy first...

We of course would backup always both files, the main db file and `-wal` file. So we always would have a matching pair.

> Don't do that. Opening SQLite files using anything but the SQLite API leads to unexpected locks and errors. 

What kind of problematic locks is a simple copy or tar creating? Does that mean that the method described in https://www.sqlite.org/backup.html is no longer safe to use?

> Historically, backups (copies) of SQLite databases have been created using the following method:
> 
> 1. Establish a shared lock on the database file using the SQLite API (i.e. the shell tool).
> 2. Copy the database file using an external tool (for example the unix 'cp' utility or the DOS 'copy' command).
> 3. Relinquish the shared lock on the database file obtained in step 1.

(4) By Keith Medcalf (kmedcalf) on 2021-11-16 20:01:45 in reply to 3 [link]

Your "historical" method will work just fine, provided that you follow the following rules:

1.  Process 1 is used solely for the purpose stated and nothing else.  
2.  Process 2 is used solely for the purpose stated and nothing else.  
3.  Process 1 and Process 2 are separate processes (though they may be sibling children of the same parent process which **does not** access the database/files in any way).  
4.  Once process 2 has completed then (and only then) may process 1 relinquish its lock and terminate.  
5.  No **"frikking about"** is undertaken against the database files (which must consist of at least the main database file and any log files) such as deleting the log.

(5) By anonymous on 2021-11-16 20:36:45 in reply to 4 [link]

Thanks for your clarification, very much appriciated!

> Your "historical" method will work just fine

It is not my method, that is a quote from the [Documentation about the Backup API](https://www.sqlite.org/backup.html) :)

> 1. Process 1 is used solely for the purpose stated and nothing else.

So process 1 is a Python process and uses the database before and after. Before the Backup start it would execute `BEGIN IMMEDIATE` and hold that connection open while Process 2 is backing up the files. Process 2 continues to query the database during the backup.

> "5." No "frikking about" is undertaken against the database files (which must consist of at least the main database file and any log files) such as deleting the log.

Database file and `-wal` get stored straight into a tar file, and, if restored, both are placed back to the original place. Restoring is only done while Process 1 is shutdown.

My second concern: Does executing `PRAGMA wal_checkpoint(TRUNCATE); BEGIN IMMEDIATE;` in a single statement make sure that the `-wal` file is fully committed, restarted and truncated when taking the backup? My concern are threads: While we use the database in Serialized mode, I am not sure if the internals of SQLite guarantee that the above two commands are executed in sequence or if they potentially could be separated (e.g. by another statement from another thread).

(6) By anonymous on 2021-11-16 21:02:04 in reply to 4 [link]

Just realized that executing 2 statements at a time is only really a CLI feature. Is there a way using the API to guarantee the two statements are executed in sequence?

(7) By Keith Medcalf (kmedcalf) on 2021-11-16 23:12:28 in reply to 5 [link]

Which word of **`not used for any other purpose`** is causing confusion?

(8) By Keith Medcalf (kmedcalf) on 2021-11-16 23:21:26 in reply to 5 [link]

> My second concern: Does executing PRAGMA wal_checkpoint(TRUNCATE); BEGIN IMMEDIATE; in a single statement make sure that the -wal file is fully committed, restarted and truncated when taking the backup? My concern are threads: While we use the database in Serialized mode, I am not sure if the internals of SQLite guarantee that the above two commands are executed in sequence or if they potentially could be separated (e.g. by another statement from another thread).

Yes.  Each statement is independent and executed in the order received.

If you have 1 million connections to the database and on one of those connections you issue one command followed by another command, it is almost guaranteed that one of the other 999,999 connections will issue a command that, by haappenstance, occurs between the two commands issued on the other connection.

If you have 2 connections to the database and on one of those connections you issue one command followed by another command, it is guaranteed that the other connection may execute a command which is processed between the two commands given on the other connection unless you have taken steps to prevent that from happening.

If you need a guarantee that the two statements are executed side by each with no possibility of something happening in betwixt, then **you** need to ensure that you have guaranteed this.

(9) By Keith Medcalf (kmedcalf) on 2021-11-16 23:29:42 in reply to 6 [link]

No.

All "statements" are executed as presented, one after each, on the connection on which the statements were presented.  If you have another connection, it may submit a "statement" for execution at any time including between statements executed on another connection.  This is called **`concurrent access`**.

(10) By anonymous on 2021-11-17 00:27:00 in reply to 7 [link]

> > 1. Process 1 is used solely for the purpose stated and nothing else.
> 
> So process 1 is a Python process and uses the database before and after. Before the Backup start it would execute BEGIN IMMEDIATE and hold that connection open while Process 2 is backing up the files. Process 2 continues to query the database during the backup.

Sorry, I meant "Process 1 continues to query the database during the backup."

> Which word of not used for any other purpose [stated] is causing confusion?

I guess the confusion stems from what "any other purpose [stated]" exactly refers to.

Of course I want to be able to continue doing work in Process 1. From what I understand, the whole idea of an Online Backup is being able to at least read from the database while doing the backup... Otherwise, what is the purpose of that whole endeavor? I could just close the database connection.

The linked Documentation says:

>  This procedure works well in many scenarios and is usually very fast. However, this technique has the following shortcomings:
> 
> * Any database clients wishing to write to the database file while a backup is being created must wait until the shared lock is relinquished. 

That is a clear statement about writes. Reading the database works during holding that lock, and I don't see a reason why it should affect the Backup Process 2 is doing during that time.

(11) By Simon Slavin (slavin) on 2021-11-17 00:48:24 in reply to 6 [link]

How do you have the CLI execute 2 statements at a time ?

(12) By anonymous on 2021-11-17 00:52:12 in reply to 9 [link]

Ok, I was almost suspecting that this would be the case.

As far as I can tell its no big deal: Even if a write sneaks in between `PRAGMA wal_checkpoint(TRUNCATE)` and `BEGIN IMMEDIATE`, the database will be coherent as long as we save the database file and the `-wal`. The `-wal` will just not be 0 bytes/restarted really, but that is merely intended as an optimization step before the backup.

(13) By anonymous on 2021-11-17 00:56:58 in reply to 11 [link]

By concatenating and using semicolons like so:

```
sqlite> PRAGMA wal_checkpoint(TRUNCATE); BEGIN IMMEDIATE;
```

But as kmedcalf stated, a command from another connection could be executed in between. And I guess that connection could be in the same process from another thread (in serialized mode) or from another process.

(14) By Larry Brasfield (larrybr) on 2021-11-17 01:15:08 in reply to 13 [link]

Those 2 statements are executed one after the other by the CLI shell.

(15) By Keith Medcalf (kmedcalf) on 2021-11-17 02:07:13 in reply to 10 [link]

> That is a clear statement about writes. Reading the database works during holding that lock, and I don't see a reason why it should affect the Backup Process 2 is doing during that time.

Yes, and only the main database file needs to be backed up.  It will be consistent with the last checkpoint before the lock was taken.  The copy must be performed by a separate process (not thread).

(16) By anonymous on 2021-11-17 10:09:04 in reply to 15 [link]

> It will be consistent with the last checkpoint before the lock was taken.

We have established earlier, there is no built-in way to guarantee creating a last checkpoint *and* taking the lock without another command being executed in between.

That is why I consider just backing up `-wal` as well (and restoring). It should still be consistent, correct?

> The copy must be performed by a separate process (not thread).

Ok, yes coping is going to be performed by a separate processes.

Out of curiosity, what is the problem there? Can opening another file descriptor for the same file disturb SQLite in some ways?

(17) By Simon Slavin (slavin) on 2021-11-17 15:06:08 in reply to 13 [link]

That does not execute the two statements at the same time.  It executes them in the order shown.  One copy of the CLI, running in one process, can do only one thing at a time.

(18) By Keith Medcalf (kmedcalf) on 2021-11-17 18:12:14 in reply to 16 [link]

> That is why I consider just backing up -wal as well (and restoring). It should still be consistent, correct?

Yes.  You will have to take and hold an "intent to write" (BEGIN IMMEDIATE) to prohibit any other connection from writing to the wal file.

> Out of curiosity, what is the problem there? Can opening another file descriptor for the same file disturb SQLite in some ways?

This is an Operating System issue.  The Operating System tracks file locks by process (POSIX/Linux) rather than by handle (Windows).  This means that on an Operating System which claims it is POSIX compliant, opening and then closing a file using the platform calls will cause all locks for that file held by the current process to be released without notification of this state of affairs.

On Windows the "files" may remain "locked" even though the process is terminated or the file is closed, and the only way to fix that is to cold-start the Operating System (a reboot).  This state of affairs comes to pass because of the way that locks are tracked on Windows.

Some other Operating Systems do things differently by other methods and may have other varying behaviours and design features (and design flaws).

This is documented (at least for the case of POSIX locking) and the SQLite3 library contains code that attempts to work-around the design flaw.

It would be perscpicacious, knowing about the frailty and bad design decisions, to avoid the circumstance in which the issue might arise -- especially when it is so simple to do so.

To ensure that the backup method will work properly and without causing the database to become corrupted, a process that is using the SQLite3 database exclusively via the SQLite3 API must be used to lock out other sqlite3 instances from the database while *a separate process* must be used to copy the files so as to not render the database locking fubar.

(19) By Scott Robison (casaderobison) on 2021-11-17 18:33:00 in reply to 1 [link]

Have you considered `VACUUM INTO 'filename'`?

(20) By anonymous on 2021-11-19 16:50:18 in reply to 19 [link]

Yes I have. But it has the same problem as the online Backup API: I end up needing twice the disk space of the SQLite database.

(21) By anonymous on 2021-11-19 16:53:48 in reply to 18 [link]

> This means that on an Operating System which claims it is POSIX compliant, opening and then closing a file using the platform calls will cause all locks for that file held by the current process to be released without notification of this state of affairs.

Thanks for the clarification.

> To ensure that the backup method will work properly and without causing the database to become corrupted, a process that is using the SQLite3 database exclusively via the SQLite3 API must be used to lock out other sqlite3 instances from the database while a separate process must be used to copy the files so as to not render the database locking fubar.

Ok, all clear now.

Thanks for your time and detailed response, very much appreciated!

(22) By Larry Brasfield (larrybr) on 2021-11-19 16:57:55 in reply to 20 [link]

> ... needing twice the disk space of the SQLite database.

I do not see why anybody should believe that. Your "hot backup database" is going to consume at least the same disk space as a VACUUM INTO target database, and probably more. So where do you get 2 x?  I get "1 x or less".

(23) By Scott Robison (casaderobison) on 2021-11-19 17:58:54 in reply to 20 [link]

If you have a file X that is 1 MiB in size, and you make a copy of it, you now have 2 x 1 MiB files.

Why does the method of copy matter? If you use copy or cp or backup API or VACUUM, I am not sure I understand the difference. Is it just because of "live DB + journal" overhead you briefly have twice as much as you want?

(24) By Simon Slavin (slavin) on 2021-11-19 18:20:56 in reply to 1 [link]

In the light of later posts to this thread, can we start again ?  If I understand correctly, you have two requirements:

1) Make a backup copy of the database.
2) Do not use as much filespace for the backup as the database file takes.

How do you expect to do both of these things ?  Unless you have a lot of indexes, SQLite is a compact file format.  You can't do both of these things at once.  Or did I misunderstand one of your requirements ?

(25) By anonymous on 2021-11-19 18:33:48 in reply to 24

I guess their idea was to pipe the copy into a compression tool right away.

I am wondering though, if using a specific  filesystem is a possibility, could that be achieved with something like ZFS (a COW filesystem)? Where the copying process is fast and the resulting file is using minimal desk space in reality?

(26) By anonymous on 2021-11-19 21:26:27 in reply to 22 [link]

I have a database of say 3 GiB and plan to make a gziped backup of it. Using `VACUUM INTO` creates a copy of the original, so I need another 3 GiB of space. Plus when creating the backup additionally the gziped size for the backup itself... On an embedded device that can matter.

If I can gzip the active database directly, the main database can grow much more before I ran into space problems. Plus it saves some write cycles on the flash media.

(27) By anonymous on 2021-11-19 21:36:25 in reply to 25 [link]

> I guess their idea was to pipe the copy into a compression tool right away.

Yes that is exactly the use case. I mean the backup is nothing fancy really, just `tar czf mybackup.tar.gz path/to/sqlite/`

> I am wondering though, if using a specific filesystem is a possibility, could that be achieved with something like ZFS (a COW filesystem)? Where the copying process is fast and the resulting file is using minimal desk space in reality?

As for the active/main database, using a compressed file system might hurt performance too much. But it probably would be an option for the backup location, if that is a dedicated partition (it is not in my case, at least right now).

Another idea I had is to use a named pipe (`mkfifo`): The Backup API could then use the pipe as target (or the `VACUUM INTO path/to/pipe` command). On the other side of the pipe would be the backup tool which reads the stream, compresses it and writes it into the compressed backup file. Not sure if that really works, but it should in theory as far as I can tell.

In the end the first approach just seems very straight forward and was close to what has been in use so far.

(28) By Scott Robison (casaderobison) on 2021-11-19 22:38:09 in reply to 27 [link]

Based on what you're describing, a "better" solution (for some fuzzy value of better) sounds like running a .dump to a pipe that can feed into a compression program. Do you have SQLite shell access or does this have to be internal to an application? If it must be internal, you could replicate what .dump does.

(29) By Simon Slavin (slavin) on 2021-11-20 01:28:03 in reply to 28 [link]

That was my solution: pipe the output from the SQLite CLI's <code>.dump</code> command into a compactor.  Problem is, because of how efficient compactors work, even that will temporarily use about the same amount of space as the original file.

With regard to a file system with compression built-in, we should take the concept to its limit and imagine doing it on Apple's Filesystem APFS.  In that system two sectors with identical contents can be collapsed into the same sector, and two files with the same contents can be collapsed into the same file.  If you call the OS API to duplicate a file, all that happens is another directory entry is created, pointing to the original data.  (It's more complicated than this, but you get the idea.). If one of the two files is changed the change is stored as deltas until the OS decides that it's not saving any space by doing so.

Using that, you can create as many copies of the original file as you want, and use hardly any extra space.  But does it do what the OP wanted ?  Does this qualify as a backup in their case ?  If there's a disk failure, you lose both files.

(30) By anonymous on 2021-11-20 01:52:19 in reply to 29 [link]

This is not about the file system having compression or not, this is a feature or copy-on-write filesystems.

And for the OPs use case, the generated file, can be compressed and sent over the wire. And this can be done outside of the locking step.

So benefits are, faster backup, async compression relative to locking the SQLite file, no wasted space on disk, only very little extra storage required compared to copying the whole file verbatim in a none COW filesystem.

Not only that, if space is really that much of a concern, I would compress into the target location directly, after NFS mounting it over the network, the local desk will have very little data written in the process.