Windows: opening an unaccessible database may take an excessive amount of time
(1) By anonymous on 2025-03-26 13:28:28 [source]
On Windows, sqlite retries some I/O operations that return particular error codes, assuming that these errors are transient. The retry count & delays should be limited to values specified by SQLITE_WIN32_IOERR_RETRY
and SQLITE_WIN32_IOERR_RETRY_DELAY
, which are 10 & 25 by default. The default values amount to about 1.3s of total delay if all attempts fail.
A particular loop in winOpen
looks like this:
do{
h = osCreateFileW((LPCWSTR)zConverted,
dwDesiredAccess,
dwShareMode, NULL,
dwCreationDisposition,
dwFlagsAndAttributes,
NULL);
if( h!=INVALID_HANDLE_VALUE ) break;
if( isReadWrite ){
int rc2, isRO = 0;
sqlite3BeginBenignMalloc();
rc2 = winAccess(pVfs, zUtf8Name, SQLITE_ACCESS_READ, &isRO);
sqlite3EndBenignMalloc();
if( rc2==SQLITE_OK && isRO ) break;
}
}while( winRetryIoerr(&cnt, &lastErrno) );
The winAccess
operation also performs this retry-loop logic, so each call to winAccess
can take up to 1.3s, the whole loop taking about 15 seconds before it accepts defeat.
This scenario can be very easily reproduced with latest sqlite on Windows. Simply create a directory, create a sqlite database inside, then remove all Windows permissions to the enclosing directory. Attempting to perform any operation on the database with the sqlite commandline tool will take about 15 seconds to return an error.
I'm not really sure what is the point of checking the readability of the file at this point. I guess that it was supposed to break the loop early in case the file tests as readable, but the effect is opposite.
(2) By Richard Hipp (drh) on 2025-03-26 13:36:57 in reply to 1 [link] [source]
Those retry loops are there to work around virus checkers, which can prevent calls to winAccess() and similar from succeeding until the virus checker has finished its scan of the file. So if we turn the retry loop off, or if we reduce the timeout, then then we get lots of failures caused by virus checking.
What is your suggested fix? How do we delay these operations when they are blocked by an active virus checker, but not delay them when the file simply does not exist? How do we tell the difference?
(3) By anonymous on 2025-03-26 13:42:47 in reply to 2 [link] [source]
I understand the reason for the retry loop around CreateFile. I'm still figuring out the point of testing read-access when trying to open the database readwrite. I've found a commit message that says:
In the Windows VFS, when trying to open a database file read/write, if it fails check to see if the file exists and is read-only and immediately fall back to a read-only open attempt, rather than running the AV retry loop.
I'm convinced that one retryloop should not be nested in another. Since the check looks like it was inteded to be an optimization, then maybe it would be better to call it only on the first iteration?
(4) By Richard Hipp (drh) on 2025-03-26 14:46:44 in reply to 3 [link] [source]
Can you please try checkin 2025-03-26t14:45Z and report back whether or not this clears you problem?
(5) By anonymous on 2025-03-26 15:06:37 in reply to 4 [link] [source]
Thanks for the patch. I've tried a similar approach in the meantime, but without the NORETRY option, I've simply added the cnt==0 condition.
Here are the results:
- original sqlite: 20 seconds
- with the patch: 6 seconds
- my simpler approach without NORETRY: 7.5 seconds
So this is a definite improvement.
The remaining delay is caused by the fact that when winOpen fails in read-write mode, it makes another attempt in read-only mode. If this could be avoided somehow, it would halve the total delay again, but I expect this would require more extensive changes. Like making both attempts (R/W & R/O) within a single retry-loop.
Anyway, thanks for the quick response.
(6) By Richard Hipp (drh) on 2025-03-26 15:52:57 in reply to 5 [link] [source]
Can you please try again with checkin 2025-03-26t15:51z and report back if there is any further improvement?
(7) By anonymous on 2025-03-26 16:50:09 in reply to 6 [link] [source]
The second patch works as expected - the runtime is down to around 3 seconds.
But I wonder this change it isn't too aggressive. The access check is now done only once (cnt==0 and NORETRY), so it is not resistant to AV interference. If the file is marked read-only, but the first GetFileAttributesExW call fails, it would never be repeated, and the fallback to read-only mode would not be triggered.
I think it would work better without the NORETRY flag, since the read-only test would be done robustly (but still only once). This would slow down the previous scenario (file not accessible at all), but you can't have the cake and eat it...
(8) By Richard Hipp (drh) on 2025-03-26 17:06:48 in reply to 7 [link] [source]
Yet another patch to try, please: checkin 2025-03-26T17:05Z.
(9) By Andrzej Szombierski (anszom) on 2025-03-26 21:21:13 in reply to 8 [link] [source]
Ok, this one seems to work just as well as the previous one (around 3 seconds), without any obvious drawbacks. Thanks!