SQLite Forum

PRAGMA temp_store_directory
Login

PRAGMA temp_store_directory

(1) By Slava G (slavago) on 2020-05-03 14:01:22 [link] [source]

I have java application with 10 threads, each thread works on different DB (each thread on one db). When I create connection to the DB, I pass to it PRAGMA temp_store_directory with folder where db is located , so temp file will be placed there. Recently reading on that PRAGMA I found :

When the temp_store_directory setting is changed, all existing temporary tables, indices, triggers, and viewers in the database connection that issued the pragma are immediately deleted. In practice, temp_store_directory should be set immediately after the first database connection for a process is opened. If the temp_store_directory is changed for one database connection while other database connections are open in the same process, then the behavior is undefined and probably undesirable.

But I don't really understand if this connection to the same DB or any connection within same process will be broken ? Also found that

Changing the temp_store_directory setting is not threadsafe. Never change the temp_store_directory setting if another thread within the application is running any SQLite interface at the same time. Doing so results in undefined behavior. Changing the temp_store_directory setting writes to the sqlite3_temp_directory global variable and that global variable is not protected by a mutex.

So, here it's confuses even more. And last is that deprecated in SQLite

This pragma is deprecated and exists for backwards compatibility only. New applications should avoid using this pragma. Older applications should discontinue use of this pragma at the earliest opportunity.

So, not sure how to work if that is deprecated, as I have 10 connections to 10 different DB and how I set temp folder for each connection separately ?

Thanks

(2) By Keith Medcalf (kmedcalf) on 2020-05-03 14:52:24 in reply to 1 [link] [source]

So, not sure how to work if that is deprecated, as I have 10 connections to 10 different DB and how I set temp folder for each connection separately?

Have ten processes with one connection per processes. The "temp directory" concept is a per-instance (of the computer) or per-process concept on every single Operating System ever devised so far.

Or you could write a new OS that maintains a temp folder per thread.

However, presently, no Operating System does this. Most of them have a single temp directory per process and its location is inherited from its parent process by default, although the parent process can "change" the diaper (temp store location) for its baby processes to pee in. SQLite3 allows you to override the current process diaper location, but only for the current process (and everything it contains) that is SQLite3 related.

But I don't really understand if this connection to the same DB or any connection within same process will be broken ? Also found that

Since there is one temp directory active PER PROCESS at one time, changing the TEMP STORE DIRECTORY affects all connections in that process from the time the change is made.

Changing the temp store directory setting is not threadsafe. Never change the temp store directory setting if another thread within the application is running any SQLite interface at the same time. Doing so results in undefined behavior. Changing the temp store directory setting writes to the sqlite3tempdirectory global variable and that global variable is not protected by a mutex.

So, here it's confuses even more.

Seems pretty straight forward. The temp store directory is a process wide setting. It is not threadsafe. If you change the temp store directory while at the same time also using it in another thread, you will release the hounds of hell who will promptly eat your program and drag you down into the hellfires for all eternity.

It is simply being somewhat more technical to avoid offending those with delicate sensibilities or weak constitutions.

Putting all that aside, why in the multiverse do you think you need a separate temp directory for each connection? Is it a PHB thing?

(3) By Slava G (slavago) on 2020-05-03 15:03:48 in reply to 2 [link] [source]

Thanks Keith, I have 1 process, with 10 threads, PRAGMA temp directory passed upon creating connection. Each thread has it's own folder where DB is located. Connection itself is not shared between threads. So in this case temp folder is on connection on specific DB or it's globally for all connections for ALL DB's in the process ?

I'm not sure I got your comment about temp folder for OS and each thread.

Using different temp folder for each connection I wanted to solve any possible interfere between DB's as they all have same name.

Thanks

(4.1) By Keith Medcalf (kmedcalf) on 2020-05-03 15:32:18 edited from 4.0 in reply to 3 [link] [source]

There is only ONE temp store directory per process. So whatever you set last in the process is what will be used. (Note this does not change the actual temp store directory which is under control of the Operating System, but rather changes the location where the SQLite3 library will write its temp files -- it will not change where the rest of your program writes its temp files).

If there were interference between temp files then it would not be much of a temp file system. What experimental OS are you using that you cannot trust its' handling of temp files?

Note that I use an Experimental OS from Microsoft which has never been able to manage its temp files since MSDOS 1.0 though to and including the current release of Windows. Its problem however is not a failure to manage the temp file naming, but rather a failure of ability to discard the excrement. Consequently ever since about 1979 a "diaper pump" has been needed to keep the excrement in check lest the baby drown in its own excrement. For each version since MSDOS 1.0 the number of temp file locations that must be regularly pumped out (and the frequency of the pumpings required) has grown exponentially.

I would consider an application that creates yet another "bilge" in need of pumping when it could be using the already existing sewer system to be a big vote against ever allowing the use of such as application.

(5) By Slava G (slavago) on 2020-05-03 15:25:36 in reply to 4.0 [link] [source]

Thanks !!

I'm not using any experimental OS :) CentOS it is :) But, from https://www.sqlite.org/tempfiles.html it's not clear all temp files and transient indices files names convention, it's clear for WAL, Journals and so. But for other it's not that clear, and possible can cause interference between temp files if they're having same names. So this I wanted to avoid. Nothing special or exeperimental.

Slava

(6) By Keith Medcalf (kmedcalf) on 2020-05-03 18:35:10 in reply to 5 [link] [source]

Ah. I see the problem.

You are confusing files which may be temporarily part of the database

  • Rollback journals
  • Master journals
  • Write-ahead Log (WAL) files
  • Shared-memory files
    and must live in very specific locations and must not be tampered with in any way if they exist (because they are an integral part of the database) with actual "temporary files" that are fleeting in nature and of no significance outside of their specific use at the moment, and which are automatically deleted (well, that is inaccurate, because they are temporary files so they may never actually be created):
  • Statement journals
  • TEMP databases
  • Materializations of views and subqueries
  • Transient indices
  • Transient databases used by VACUUM

The "temp store directory" only affects the location where these fleeting files are written. This is because, for example, the default temp directory (/tmp perhaps) may only allow a few megabytes of space. This might preclude being able to vacuum a database containing more that a few hundred kilobytes of data, for example. Or perhaps prevent a query that needs a statement journal larger than available /tmp space from running.

Thus you can move the "temp store directory" to a bigger filesystem if need be.

True "temporary files" are just that, they are temporary. The operating system will clean them up (in theory). In *nix, a random file is basically opened and then deleted. The file exists as an inode with no name until the file is closed, at which point the inode is freed. Similarly on other OSes whatever mechanism is used to denote that the file is "temporary" are used to designate the file as being temporary so that when it is closed it ceases to exist. The Operating System generates random "tempfile names" for temporary use and the Operating System guarantees that the namespace is collision free.

So the only reason you would want to change the temp store directory is because you want it to be bigger, otherwise it does nothing at all and there is only one "temp store directory" per process linkage of the SQLite3 library.

(7) By Slava G (slavago) on 2020-05-03 18:42:35 in reply to 6 [source]

Thanks Keith, Now it's clear. Indeed now the only reason to change temp dir is to larger partition. So, as that PRAGMA is deprecated, it's better use TMPDIR or so ? Or can I use PRAGMA in mycode, but only once before any connection is made ? I saw that JDBC drive can set temp folder, but it's uses PRAGMA as well, so what is better way programatically set that folder ?

(8) By Keith Medcalf (kmedcalf) on 2020-05-03 19:31:56 in reply to 7 [link] [source]

While the PRAGMA is deprecated, I doubt that DRH will be getting rid of it anytime soon. If you are going to use the PRAGMA, call it before you start any threads or open any connections.

However, I would prefer the usage of the TMPDIR or SQLITE_TMPDIR environment variables as they would seem to be the least likely to change in the future and this then becomes part of the setup of the application environment rather than a part of the application itself, and hence easier to document in case someone else needs to touch it in the future ... and I would use SQLITE_TMPDIR for the same reasons as it is more obvious that it has limited scope ...

(9) By Slava G (slavago) on 2020-05-03 19:34:33 in reply to 8 [link] [source]

Great, Thanks !!

(10) By Slava G (slavago) on 2020-05-04 06:18:41 in reply to 9 [link] [source]

Last question - is there any option to get value of sqlite3_temp_directory ? for the log purposes.

Thanks.

(11) By Keith Medcalf (kmedcalf) on 2020-05-04 15:45:46 in reply to 10 [link] [source]

I presume you mean the directory being used for temp files, of which the sqlite3_temp_directory global is merely the highest priority override ... looking at the code for interfaces it would appear not.

The actual "temp location" in use does not actually appear to be stored anywhere in either the win or unix VFS but determined and used dynamically when required.

Maybe you can create a temp object (such as a database with an empty string as the name) and use one of the APIs on the connection handle to find the underlying OS filename (which would divulge the temp store location at that time), but even this I am not sure of.

(12) By Slava G (slavago) on 2020-05-06 08:01:20 in reply to 11 [link] [source]

Thanks Keith.