[Windows] SQLite dll hooking is nearly impossible due to recursive calls

(1) By Dave (drabelink) on 2020-07-03 12:08:37

Hi,

I'm creating a Windows tool that hooks into sqlite3.dll to monitor what the main application does with it.

Using function hooking, the tool can detect when the application calls any of the functions exported by sqlite3.dll and display the data.

What I have discovered is that the SQLite code inside the DLL also calls its own exported functions. It seems very strange to me that the DLL uses an exported function itself instead of calling internal functionality.

An example:

When the main application calls sqlite3_prepare16_v2 in sqlite3.dll with this statement:

"CREATE TABLE Persons
(
id INTEGER PRIMARY KEY,
first_name TEXT NOT NULL,
last_name TEXT NOT NULL,
country_id INTEGER
);
COMMIT;"

The tool is able to detect this call and display the statement.

But then, while the initial sqlite3_prepare16_v2 is still executing, SQLite calls back into its own exported interface, namely sqlite3_prepare_v2 with this statement:

"SELECT*FROM"main".sqlite_master ORDER BY rowid"

This is followed by sqlite3_step and sqlite3_finalize.

After that, SQLite makes another sqlite3_prepare_v2 call with this statement:

"SELECT*FROM"main".sqlite_master WHERE tbl_name='Persons' AND type!='trigger' ORDER BY rowid"

followed by a couple of sqlite3_column_text calls.

All those internal calls are also detected by the tool, obviously because the functions are hooked.

So the tool gets a huge number of internal calls that should not be visible from the outside. The internal actions may be valid, but why are those calls made through the public interface rather than internally?

No trace tool will be able to get sensible information from SQLite when all this unrelated data is presented as one big pile of unknown calls.

So my question is why sqlite3.dll is doing this?

And how to solve this problem?
Is there another way to hook into sqlite3.dll to get the actual actions from the client of the dll?

(2) By Warren Young (wyoung) on 2020-07-03 12:24:56 in reply to 1

> And how to solve this problem?

Have you already rejected SQLite's existing tracing and profiling mechanisms? And if so, why?

DLL hooking is a technique that was created in a world where most DLLs didn't have source code available, so it was the next best option available. Unless your tracing facility must work on DLLs that others provide only in binary form, you'd probably be able to get around your current limitations by modifying SQLite itself, since that would give you access to the internal state variables that let you suppress output when SQLite is just calling back into itself.
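
The idea of suppressing output while SQLite is calling back into itself can also be approximated from the hook side with a per-thread depth counter. The following is only a sketch under stated assumptions: `hooked_prepare`, `lib_prepare`, `report`, and `reported_calls` are hypothetical names, and `lib_prepare` is a stand-in that merely imitates a library re-entering its own hooked entry point. It is not SQLite code.

```c
#include <stdio.h>
#include <string.h>

/* How many hooked calls are currently active on this thread.
   _Thread_local needs C11; a real tool would want one counter per thread. */
static _Thread_local int hook_depth = 0;

/* Count of calls that were reported; kept only so the sketch can be checked. */
int reported_calls = 0;

/* Hypothetical logging routine of the tracing tool. */
static void report(const char *api, const char *sql) {
    reported_calls++;
    printf("client call: %s(\"%s\")\n", api, sql);
}

/* Stand-in for the library's original, unhooked entry point. */
static int lib_prepare(const char *sql);

/* The hook: report only the outermost call, then forward to the original. */
int hooked_prepare(const char *sql) {
    if (hook_depth == 0) {
        report("prepare", sql);
    }
    hook_depth++;
    int rc = lib_prepare(sql);
    hook_depth--;
    return rc;
}

/* The stand-in re-enters the hooked API for CREATE statements, imitating
   the schema queries described in the first post. */
static int lib_prepare(const char *sql) {
    if (strncmp(sql, "CREATE", 6) == 0) {
        return hooked_prepare("SELECT * FROM sqlite_master");
    }
    return 0;
}
```

Calling `hooked_prepare("CREATE TABLE t(x)")` reports one call, even though the hook itself runs twice. One caveat of this design: a client that calls the API from inside a callback invoked by the library would also be at depth greater than zero, so those calls would be hidden as well.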

(3) By Larry Brasfield (LarryBrasfield) on 2020-07-03 13:55:46 in reply to 1

This is to answer questions not addressed by Warren's appropriate response.

> why sqlite3.dll is doing this?

Mmmm. Because the API is useful enough to be suitable for recursive use? I'm pretty sure that's it, combined with the fact that the 'lite' part of the project/product name is a seriously intended design objective. I urge you to accept that reality; it is not going to change to suit ideas of modularity that might pertain to other, larger projects.

> how to solve this problem?

If you have control of the client, it would be relatively easy to write preprocessor macros to redirect SQLite API calls into instrumentation functions of your choice and devising, which would call the "real" API to get the work done without such instrumentation.
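
A minimal sketch of the macro approach, assuming the client can be recompiled. `lib_exec` is a hypothetical stand-in for whichever API function is being instrumented, and `traced_lib_exec`, `traced_calls`, and `client_work` are likewise made up for illustration.

```c
#include <stdio.h>

/* Hypothetical stand-in for a library API function.
   Returns 0 on success and 1 for an empty statement. */
int lib_exec(const char *sql) {
    return sql[0] != '\0' ? 0 : 1;
}

/* Count of instrumented calls; kept only so the sketch can be checked. */
int traced_calls = 0;

/* Instrumentation function: logs the call site, then calls the real API.
   It is defined before the macro below, so lib_exec here still names
   the real function. */
int traced_lib_exec(const char *sql, const char *file, int line) {
    traced_calls++;
    fprintf(stderr, "%s:%d: lib_exec(\"%s\")\n", file, line, sql);
    return lib_exec(sql);
}

/* From this point on, client code that writes lib_exec(...) is redirected. */
#define lib_exec(sql) traced_lib_exec((sql), __FILE__, __LINE__)

/* Client code, unchanged except that it is compiled after the macro. */
int client_work(void) {
    return lib_exec("CREATE TABLE Persons(id INTEGER PRIMARY KEY)");
}
```

Because the redirection happens in the client's own source, calls made inside the library never pass through `traced_lib_exec`, so only the client's calls are logged.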

> Is there another way to hook into sqlite3.dll to get the actual actions from the client of the dll?

It is also relatively easy to create a "shim" DLL or linked code which presents the same interface to a client as another DLL would have, if directly loaded by the client. This can be done in an automated manner for simple instrumentation. (I have done this, so I know it is quite doable.) This can amount to creation of an alternative object (.obj), replacing the usual .lib produced during DLL creation which typically just contains a jump table that is rewritten during dynamic load.
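
The shim idea can be sketched portably as a forwarding layer, assuming that calls made inside the real DLL are ordinary direct calls that never travel through the shim. All names here (`real_prepare`, `shim_prepare`, `shim_bind`, `shim_seen`) are hypothetical. In an actual Windows shim the binding step would resolve the real export with `LoadLibrary` and `GetProcAddress`; the sketch binds to a stand-in function in the same file instead.

```c
#include <stdio.h>
#include <string.h>

/* ---- stand-in for the real DLL ---- */

/* Imitates an exported function that re-enters itself internally.
   The inner call is a direct call inside the module, not via the shim. */
int real_prepare(const char *sql) {
    if (strncmp(sql, "CREATE", 6) == 0) {
        return real_prepare("SELECT * FROM sqlite_master");
    }
    return 0;
}

/* ---- the shim, presenting the same signature to the client ---- */

typedef int (*prepare_fn)(const char *sql);

/* Pointer to the real export, resolved on first use. */
static prepare_fn forward_prepare = NULL;

/* Count of calls seen by the shim; kept only so the sketch can be checked. */
int shim_seen = 0;

/* A Windows shim would call LoadLibrary on the real DLL and GetProcAddress
   for each export here; this sketch binds to the stand-in directly. */
static void shim_bind(void) {
    if (forward_prepare == NULL) {
        forward_prepare = real_prepare;
    }
}

/* What the client links against: log the call, then forward it. */
int shim_prepare(const char *sql) {
    shim_bind();
    shim_seen++;
    printf("client call: prepare(\"%s\")\n", sql);
    return forward_prepare(sql);
}
```

A client call to `shim_prepare("CREATE TABLE t(x)")` is logged once. The follow-up query issued inside `real_prepare` goes straight to the real function and stays invisible to the shim.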