SQLite Forum

SQLite temp file is removed but still kept by java process

SQLite temp file still kept by java process after close a connection

(1.2) By Artur (aaukhatov) on 2020-10-09 06:53:32 edited from 1.1 [link]

SQLite's temp file is removed but still kept by java process, it uses disk space and can lead for excessive disk usage. But we close the connection of the db file.

Is it a bug of the sqlite-jdbc driver?

(2) By Keith Medcalf (kmedcalf) on 2020-10-08 19:02:42 in reply to 1.1 [link]

Define "temp file", "removed" and "still kept".

Or better yet show your work that leads you to ask the question.

(3) By anonymous on 2020-10-09 08:13:58 in reply to 1.2 [link]

What do you mean? Is the temp file removed or is it kept? Cannot be both.

(4) By Artur (aaukhatov) on 2020-10-09 11:19:40 in reply to 3 [link]

Unfortunately both.
The temp file is deleted, but it still takes a disk space.

lsof | grep deleted
33558748 /mnt/sqlite.tmp/xxxx.tmp (deleted)

(5) By Keith Medcalf (kmedcalf) on 2020-10-09 12:06:29 in reply to 4 [link]

LSOF lists open files so the file cannot be deleted because it is open.  

It has been marked for deletion and will be deleted by the Operating System when it is closed.

(6) By Artur (aaukhatov) on 2020-10-09 13:43:07 in reply to 5 [link]

The connection was closed

(7) By Keith Medcalf (kmedcalf) on 2020-10-09 19:33:52 in reply to 6 [link]

If the file was closed then it would not be listed by lsof which lists open files.

(8) By Scott Robison (casaderobison) on 2020-10-09 20:33:36 in reply to 1.2 [link]

I think what people are trying to say is this:

We do not believe this is a bug in SQLite. No one is reporting such a problem, and given that SQLite is used in billions and billions of installations, it probably would have come up.

As a consequence of that, you should probably direct your question to the sqlite-jdbc team to see if they can help.

That being said, it certainly is possible that SQLite has some issue that no one but you has discovered up until now. Unfortunately, without a program that uses only the C API to reproduce the issue (not jdbc), it will be difficult or impossible to fix.

Please contact the jdbc project. It is very likely that they open files in the background as part of their jdbc to SQLite translation layer, and do not expect anyone to try to delete a file once it has been opened.

Note: https://github.com/xerial/sqlite-jdbc/issues/80 talks about deleting open files and jdbc not cooperating in some way. It was a simple Google search for "sqlite jdbc keeps deleted file open" and it was a few items from the top of the page. That does not mean it is exactly the cause of your problem, but it illustrates that other people have seen similar issues when using jdbc.

It is also possible that the fault is elsewhere, maybe some logic problem in your own code that you can't see at the moment. But I would check with the jdbc people next while continuing to review my own code.

(9) By Rowan Worth (sqweek) on 2020-10-12 02:54:34 in reply to 5 [link]

> LSOF lists open files so the file cannot be deleted because it is open.
> It has been marked for deletion and will be deleted by the Operating System when it is closed.

On unix systems you can certainly unlink files while they're open - any processes which have an open file descriptor can still read/write the data and technically it's still on disk, but it's gone from the filesystem and cannot be opened any longer.

I'm almost certain you know this Keith, so I'm not sure why you're being so unnecessarily difficult here. Commonly we refer to such files as "deleted," which is coincidentally the same nomenclature lsof uses to flag file descriptors referring to inodes which are no longer reachable via the filesystem.

(10) By Keith Medcalf (kmedcalf) on 2020-10-12 03:08:59 in reply to 9 [link]

Sorry for the loose language.  Translate "deleted" as "freed":

 > LSOF lists open files so the file cannot be freed because it is open.

 > It has been marked for deletion/unlinked/uncataloged/released (terminology varies by Operating System) and will be freed by the Operating System when it is closed.

Note that not only Unix behaves in this fashion.  Many Operating Systems behave in this fashion including Linux, Windows, OS/2, MVS, OS/VS and DOS/VS to name but a few.

(11) By anonymous on 2020-10-12 09:03:09 in reply to 8 [link]

Our colleague has already created an issue in the SQLite JDBC tracker https://github.com/xerial/sqlite-jdbc/issues/547.

It's possible that the problem is with JDBC or with our own code. But the reason to write about it here is that the strange file is a native SQLite temporary file in the SQLite temp directory. As far as I know, neither the JDBC driver nor our code manages these files at all. All that we do is set the temp_store_directory pragma when the connection is established.

(12) By Artur (aaukhatov) on 2020-10-12 09:05:36 in reply to 8

We have created an issue [https://github.com/xerial/sqlite-jdbc/issues/547](https://github.com/xerial/sqlite-jdbc/issues/547)

(13) By Gunter Hick (gunter_hick) on 2020-10-12 14:03:10 in reply to 12 [link]

You have an awful lot of processes accessing the same temporary file.

Maybe you are opening an SQLite connection, creating a transaction that requires a temporary file (see https://sqlite.org/tempfiles.html) and never completing that transaction.

And then go on to fork off a bunch of subprocesses that inherit the open file descriptors (see https://sqlite.org/howtocorrupt.html)

(14) By Rowan Worth (sqweek) on 2020-10-12 23:27:25 in reply to 13 [link]

I believe the output of lsof in the github issue is showing a bunch of threads within the same java process, rather than separate processes. Each thread of course shares the same file descriptor table, and it appears that lsof makes no attempt to be smart here and simply repeats the details of each file descriptor for every thread.

The size column is actually "SIZE/OFF" - the fact that it's changing throughout the output might tell us that the tempfile is associated with a transaction that is still actively doing something?

Here's the lsof column header, for reference:

COMMAND     PID   TID           USER   FD      TYPE             DEVICE   SIZE/OFF                NODE NAME