Question about Multiple database connections and loadable extensions.

(1) By Gary (1codedebugger) on 2020-12-23 18:05:03 [link] [source]

Hello. At first, this may not sound like a SQLite-specific question but, ultimately, it is about how to handle multiple database connections and loadable extensions; but I don't know how else to ask it without providing the boring background.

I'm using a C program compiled with the SQLite amalgamation as a component of a web extension in place of the indexedDB API. The extension provides something termed a native-messaging API that facilitates communication between the browser extension and a locally stored application. The extension is installed in the browser, and the only active component of it is a background script that is associated with an empty, hidden background web page. The script listens for URL loads and when one matches the match pattern provided in the extension manifest, the script acts upon the page. My set up is very simply. The background script listens only for a specifically named, locally stored HTML file, which is an empty HTML "container" having an empty body element with a data attribute that is a GUID. When the user loads the page, the background script injects a small content script into the page that grabs the GUID from the body element and passes it to background script in a request to establish a runtime communication port with it. If the GUID is valid, the background script establishes the port and then injects a CSS file and the main JS script, which builds the UI and appends it to the body; and POOF there it is. To a novice like me, it is kind of impressive that it takes place so quickly. Once the background script is informed that the main JS has successfully loaded, it requests a communication port with the C/SQLite program through the native-messaging API; and, if it is not open, it opens it.

Now we're getting to the question. From the extension UI, the user can open the local HTML file in as many tabs as they desire and, in each tab, open the same or different SQLite databases, where each database is a collection of study modules built by users. There are at least two possible ways to handle the native application. There could be one instance for all tabs, just as there is only one background script for all tabs; or each tab can a separate instance of the C/SQLite program.

Here is where I start to get confused. Each tab has a unique ID, which is used by the background script to know which tab it is communicating with. That same ID could be used in a single instance of the C program to track which database each tab has open and build the prepared statements that will be used frequently by the extension for each database when it is first opened. Is it okay/possible for one C program to maintain a set of SQLite database pointers and prepared statements in this manner? If loadable extensions that aren't part of the core SQLite are employed as statically linked, does having multiple databases cause an issue with the entry points in some manner? I don't really grasp how that works to know if sqlite3_auto_extension() can manage that in this scenario.

If, instead, each tab has its own instance of the C program, should the loadable extensions be dynamically linked such that each instance can use the same separate library file, or is statically linking better since all will still be in one executable file?

Regardless of the number of instances of the C/SQLite program employed, if a user opens the same database in more than one tab and, if it were possible that one query could run long enough for the user to switch tabs and attempt to run a second query against the same database, is it still simply a matter of handling SQLITE_BUSY?

Lastly, is it accurate that nothing in this single-user, multiple-tab scenario could ever be related to multiple threads and SQLite can be compiled without the thread-safe option?

Thank you.

(2) By Keith Medcalf (kmedcalf) on 2020-12-23 18:31:47 in reply to 1 [link] [source]

Firstly I think you need to differentiate between "database" (meaning a file stored on disk) and a "connection" to one or more "database files".

You can have multiple "connection"s to one "database", one "connection" that is attached to "multiple database files", and multiple "connection"s to the one "database", or any combination of "connections" and "database files" that you desire.

From your description it is unclear whether you mean "connections" or "database files" and where and how many you are using of each.

For example you say:

From the extension UI, the user can open the local HTML file in as many tabs as they desire and, in each tab, open the same or different SQLite databases, where each database is a collection of study modules built by users.

Do you mean that each tab may connect to the same of different "database files" using the same "connection" or that each tab uses "one connection" to access some "database file", which may by happenstance be the same "database file" as is being used by a different "connection"?

I would presume that you mean one "connection" per tab, and that each connection may be attached to one or more "database files". Whether a particular "database file" is being accessed by more than one database connection is irrelevant.

With that in mind, whether you have the a "per tab" multiple processes handling one connection (tab) each or one process handling all the connections (tabs) is quite irrelevant.

As for the last question, unless you have something that guarantees single-thread and single-thread access, then you should assume that multiple access on multiple threads will occur. Especially since typical browsers nowadays implement individual tabs as either separate threads or separate processes, you would have to take "special action" to ensure entry on a single thread. It also raises the question of why you would want to compile SQLite3 as single-threaded in the first place when that is not the default and will provide no significant benefit and create significant peril if you fail to implement your serialization properly?

(4.1) By Gary (1codedebugger) on 2020-12-23 22:34:07 edited from 4.0 in reply to 2 [link] [source]

Thank you for responding to my question.

By database, I mean the actual file on disk and I did mean one connection per tab but, after reading your response, perhaps that is not only not the best method but doesn't even make sense. A connection, as I used the term, really means nothings, does it? I was just confusing myself further. Would you please let me know if the summary below is consistent with your comments?

The tabs and the extension script have no such concept as a connection to a database. The C program has to open a database file through sqlite3_open_v2 to establish an instance of the sqlite3 structure as a database handle. All the tabs have is a semi-convoluted communication port with the C/SQLite program through the web extension's background script.

Even if a user wanted to work with ten different modules stored within the same database file, one module per tab, that still means nothing because the C program requires only one database pointer (instance of sqlite3 structure) to that particular file to process all incoming requests across all ten tabs.

That alone still doesn't fully define a connection because that same sqlite3 structure can attach up to seven databases through the single pointer or database handle.

I might want to have one database handle per database file, regardless of how many tabs may make requests through it. That way, I wouldn't have to be concerned with exceeding SQLITE_LIMIT_ATTACHED if a user tried to open everything at once. If a user wanted to copy a module from one database file to another database file, that would be a reason to temporarily attach another database to an existing handle.

If this is accurate, then only one instance of the C/SQLite program should ever be required to handle requests from as many tabs the user opens in the browser and for as many database files they wish to work with concurrently; and I was making it more complicated than it is.

Regarding the multiple threads, I must have misunderstood what I read. I thought the SQLite documentation reads that they should be avoided when possible. I don't understand how it all ties together in this scenario of a single PC and requests coming to a single instance of a C program. But, since I don't understand, it would be best to leave it alone then.

Thank you.

(3) By Keith Medcalf (kmedcalf) on 2020-12-23 18:42:02 in reply to 1 [link] [source]

As for loadable extensions, they are loaded (as in dynamic linking), per process but initialized per connection. The simplest method is to have your extension loaded by some method (even statically linked) and then put its initializer on the auto_extension list so that it is initialized for every connection.

You will note that there are defines in the SQLite3 code where you can link to additional initializer/shutdown routines where you can put the auto_extension links so that they are created when the library initializes (SQLITE_EXTRA_INIT and SQLITE_EXTRA_SHUTDOWN) in order to have the extension registered on every connection opened in that process instance.

And yes, you will have to deal with the possibility of getting an SQLITE_BUSY response.

(5) By Gary (1codedebugger) on 2020-12-24 06:34:52 in reply to 3 [link] [source]

I haven't been able to figure out how to statically link the carray extension following Item 6 of the loadable extension instructions. I've been trying for several hours. Maybe I'm just too stupid but for me it's been like looking up a word in the dictionary and not understanding a word in the definition of it, and then looking that word up, and having the same issue again and again until it brings you back to the first word you looked up.

I did get it to work following your instructions in this answer regarding appending the source code to the amalgamation file and then the core_init function example you provided after it. Now I can use carray in a parameterized query and bind a pointer to array as a single parameter.

Could you please help me to understand the method in the instructions, using carray as an example?

In item 6 of the instructions for statically linking an extension, do you still compile the carray.c file to a dll? If not, what is done with it?

Using which files exactly is the -DSQLITE_CORE compile-time option used? Just on sqlite3.c or carray.c or both?

In my C file, should carray.c be included as in this answer. I put the carray.h file in my code.

When sqlite3_auto_extension() is used near the beginning of the main function in my code, how is the association between sqlite3_carray_init established with the extension? Is it correct that sqlite3_auto_extension() takes the place of sqlite3_load_extension()?

I tried compiling sqlite3.c using DSQLITE_CORE to a .o and carray.c to a dll, both of which did not error, and then compiling those files with my C file and the error was that the sqlite3_carray_init did not exist, which I tried in sqlite3_auto_extension( (void*) sqlite3_carray_init ) in main().

The instructions mention using sqlite3_db_config(db,SQLITE_DBCONFIG_ENABLE_LOAD_EXTENSION,1,NULL) but I'm confused there, too, because I thought sqlite3_auto_extension() worked for all database handles, and sqlite3_db_config requires a handle.

I apologize for being a pain in the backside and stupid about this, but I'd appreciate some additional guidance. Perhaps appending the code to the amalgamation is a better option, but I'd like to understand how to use both methods if I can.

Thank you.

(6) By Keith Medcalf (kmedcalf) on 2020-12-24 08:04:01 in reply to 5 [source]

A basic overview of how the extensions in SQLite3 works is this. Basically the code for an extension has a bunch of functions that are called by sqlite3 at appropriate times and have to conform to certain interfaces. What you are interested in is how they are initialized. Here is how this works.

The extension can be compiled to reside anywhere. It can be "part of the sqlite3 amalgamation code" (by appending it there), it can be linked with the sqlite3 code, it can be linked as a stand-alone DLL, or it can be part of you application (which uses sqlite3).

However the extension code is made available it has to be initialized meaning "registered" on each connection (handle returned by sqlite3_open*). This is done by executing the registration function for the extension with the db handle as a parameter, for each connection handle that you want to be able to use that extension.

There are several ways you can accomplish this.

If the extension is compiled and linked as a separate DLL, then you can use the sqlite3_load_extension to load the DLL into a specific connection. sqlite3_load_extension will load the DLL and then call the initialization function with the connection handle as a parameter. The initialization function name must be a visible (dllexport on Windows) that consists of the bare DLL filename (without the path and extension) prepended with sqlite3_ and with _init appended. You do this for each connection on which you want to use the extension.

If the extension is linked into your application (or otherwise already in memory because it is statically linked somewhere) then you must call the initializer function, sqlite3_carray_init in this case, passing it the connection handle for each connection on which you want the extension registered, before the extension can be used on that connection.

If the extension is linked statically somehow you can also, after the sqlite3 library is initialized (sqlite3_initialize) use sqlite3_auto_extension API to append the registration function to the list of registration functions that are executed automatically when a new connection is created.

There is a hook in the standard sqlite3.c code which, at the end of the initialization will call a function of your choosing specified by the preprocessor symbol SQLITE_EXTRA_INIT. So, by appending the code for the extension to the amalgamation (sqlite3.c) file and then appending a core_init function, and compiling with SQLITE3_EXTRA_INIT=core_init you are making the carray.c code available and also providing a hook into the end of the initialization which will make the registration function be added to the auto_extension registration list for every connection.

The SQLITE_CORE symbol merely tells the extension headers that calls to sqlite APIs are to occur directly rather than indirectly because the extension has "direct" access to sqlite3 and not "indirect" access (ie, is linked statically rather than a DLL).

Does this help?

(7) By Gary (1codedebugger) on 2020-12-24 21:54:49 in reply to 6 [link] [source]

Thank you for taking the time to write the explanation; and, yes, it helps a lot.

But, apart from appending to the amalgamation file, I still can't get it to work.

I can't get my code to recognize sqlite3_carray_init when placing it in sqlite3_auto_extension(). The sqlite3_initialize() appears first in the code.

I tried switching to the sqlite3ext.h header file but, when I did, it errored:
sqlite3ext.h 569:40 sqlite3_api undeclared
#define sqlite3_auto_extension sqlite3_api->auto_extension. and for ->open_v2, prepare_v2, close.

I compiled as gcc -O2 -c sqlite3.c carray.c -DSQLITE_CORE -DSQLITE3_ENABLE_JSON=1 and then gcc -O2 sqlite3.o carray.o myfile.c -o myfile.exe.

In main(), I have: sqlite3_initialize(); sqlite3_auto_extension( (void*) sqlite3_carray_init );

If the sqlite3.h header is used, then only sqlite3_carray_init is not recognized.

I'm sure I'm doing something stupid. I prefer appending to the amalgamation file and just wanted to learn this method for future reference. Thanks.

(8) By Keith Medcalf (kmedcalf) on 2020-12-24 23:26:53 in reply to 7 [link] [source]

Ok. So the define SQLITE_CORE is needed to tell the extension that calls to sqlite3_* entry points are "direct" (ie, if you look in the sqlite3ext.h header, it makes the compiler use the sqlite3.h header instead of the indirection table in sqlite3ext.h). That is basically all it does.

Second, the symbols are not seen between files, so you need to declare your sqlite3_carray_init entrypoint in myfile.c otherwise the compiler will have no idea what this symbol is when compiling the myfile.c file.

myfile.c

#include <sqlite3.h>

extern void sqlite3_carray_init(void);

void main(void)
{
   sqlite3_initialize();
   sqlite3_auto_extension(sqlite3_carray_init);
   ... whatever else you want...
}

and compile with

gcc -DSQLITE_CORE myfile.c sqlite3.c carray.c -o myfile.exe

plus whatever other options you want.

This should generate an executable that has the extension statically linked and adds the auto_extension after the library is initialized, so that it is registered on every connection you open.

The extern declaration of the symbol sqlite3_carray_init should not matter as the C compiler and linker are simply doing fixups and (unlike C++ or other snotty languages) do not care what a pointer points at, merely that it points.

(9.1) By Gary (1codedebugger) on 2020-12-27 02:47:42 edited from 9.0 in reply to 8 [link] [source]

Thank you. I was missing the extern void sqlite3_carray_init(void);, and t all works now. Thanks for spelling it out for me.