SQLite Forum

Need help building a custom amalgamation
Login

Need help building a custom amalgamation

(1) By anonymous on 2020-07-31 12:42:30 [link] [source]

I'm very new to makefiles and compiling libraries but figured I could ask for some help here. I want a VERY minimum sized sqlite library for use in an embedded system.

The only commands I will be running are sqlite3_open() sqlite3_close and sqlite3_exec() which will be used to CREATE TRIGGER.

Would anyone be able to help me with steps/compile flags to make this library?

(2) By Warren Young (wyoung) on 2020-07-31 13:56:54 in reply to 1 [link] [source]

I'm very new to makefiles and compiling libraries

This is an absolutely foundational skill as a software developer, particularly to one doing anything "embedded." You cannot excuse the lack of this skill if you want to push forward. Get it done.

As well, your question does not show any evidence that you've even found the front page of the software project yet. If you had, you would already have an amalgamation and would now be asking how you can get it from X bytes to X÷M bytes, where M is your desired reduction factor > 1.

I want a VERY minimum sized sqlite library

Quantify that. At the very least, give me X and M above.

in an embedded system.

Your question is couched in terms of resource capacity problems, but "embedded" tells us nothing about what those resource requirements are. I can make a case for a 4-CPU 28-core-per 1 TB RAM monstrosity being used as an "embedded" system, since that word has more to do with deployment and management than it does with resource requirements.

If you think I'm picking ridiculous examples to make a point, you're right, but then answer me this: is a Raspberry Pi "embedded"? And if you can imagine cases where the answer is "yes," as I can, then how do you make any sense of your question given that SQLite rattles around like a pebble in a milk jug in relative terms to a Raspberry Pi's resource space?

The only commands I will be running are sqlite3_open() sqlite3_close and sqlite3_exec()

...which drags in the SQL parser, which drags in the VFS layer, which drags in the memory manager...

There are large chunks of SQLite you can leave out with OMIT options and such, but there is a definite lower limit to how far you can take this. But again, you have given us no limits, so we cannot evaluate whether any extra work is needed.

(3) By anonymous on 2020-07-31 14:46:34 in reply to 2 [link] [source]

I'm sorry if my question wasn't phrased well or didn't have enough information, I've only been coding for a few months and just want to do whatever I can to learn... I'll try my best to give more information. Currently I have downloaded the canonical source and copied the Makefile.linux-gcc into a parallel directory named build. I have added the following compile flags:

BCC = gcc -g -O0 \
-DSQLITE_OMIT_ANALYZE \
-DSQLITE_OMIT_ATTACH \
-DSQLITE_OMIT_AUTHORIZATION \
-DSQLITE_OMIT_AUTOINCREMENT \
-DSQLITE_OMIT_AUTOMATIC_INDEX \
-DSQLITE_OMIT_BLOB_LITERAL \
-DSQLITE_OMIT_BTREECOUNT \
-DSQLITE_OMIT_CASE_SENSITIVE_LIKE_PRAGMA \
-DSQLITE_OMIT_CAST \
-DSQLITE_OMIT_CHECK \
-DSQLITE_OMIT_COMPILEOPTION_DIAGS \
-DSQLITE_OMIT_COMPLETE \
-DSQLITE_OMIT_DECLTYPE \
-DSQLITE_OMIT_DEPRECATED \
-DSQLITE_OMIT_FLAG_PRAGMAS \
-DSQLITE_OMIT_INCRBLOB \
-DSQLITE_OMIT_INTROSPECTION_PRAGMAS \
-DSQLITE_OMIT_LOAD_EXTENSION \
-DSQLITE_OMIT_LOOKASIDE \
-DSQLITE_OMIT_PAGER_PRAGMAS \
-DSQLITE_OMIT_PRAGMA \
-DSQLITE_OMIT_PROGRESS_CALLBACK \
-DSQLITE_OMIT_QUICKBALANCE \
-DSQLITE_OMIT_REINDEX \
-DSQLITE_OMIT_SHARED_CACHE \
-DSQLITE_OMIT_TCL_VARIABLE \
-DSQLITE_OMIT_TMPDB \
-DSQLITE_OMIT_TRACE \
-DSQLITE_OMIT_UTF16 \
-DSQLITE_OMIT_VACUUM \
-DSQLITE_OMIT_VIEW \
-DSQLITE_OMIT_VIRTUAL_TABLE \
-DSQLITE_OMIT_WAL \
-DSQLITE_OMIT_WSD \
-DSQLITE_OMIT_UNTESTABLE
(And the same flags for TCC)

I'll be honest, all of these were chosen without really knowing what I was omitting, I just read through the brief description and if it didn't sound like I needed it, I added it.

I then run make sqlite3.c inside the build directory where the Makefile is and end up with a sqlite3.c file that is around 8.1 MB. The source code that I'm currently working on and will use this library in is also about 8 MB. I would really prefer not to double the size of this code just for a library that I use three functions from. I don't have an exact M value for you so I'm sorry if I can't be more specific there.

I just know there's probably a large part of this library I don't need and I'm not sure how to get rid of it. Using the above compile flags the .c file seems to be the same size as the standard amalgamation.

Sorry if this entire question is naive, I'm really just trying my best to learn and even after reading the documentation over and over again I'm still struggling. Thanks in advance for any help

(4) By Richard Hipp (drh) on 2020-07-31 14:58:29 in reply to 3 [link] [source]

I then run make sqlite3.c inside the build directory where the Makefile is and end up with a sqlite3.c file that is around 8.1 MB.

Are you concerned about source code size, or object code size? Those are independent variables.

I was under the impression that you were concerned mostly about object code size - the space consumed on the embedded device. Am I wrong? The source-code size does not really matter in that case. All of your OMIT options will cause much of that 8.1MB of sqlite3.c source to be commented out. The object code, which is what uses memory on your device, should be less than 500KB.

Note further that the amount of memory that a library uses is not the same as the size of the *.o file for the compiled library. The *.o file contains a lot of other stuff that may or may not get loaded into memory when the program runs. To find the amount of memory that the library will use, run the "size" command on the *.o file:

    size sqlite3.o

Probably it is the output of the "size" command that you are concerned about. The number of bytes of source code and the number of bytes in the *.o file are correlated to the amount of memory used, but only loosely so. So you cannot use the source code and *.o file sizes as definitive measurements of memory usage on the device.

(5) By anonymous on 2020-07-31 15:07:26 in reply to 4 [link] [source]

I guess I'm not quite sure which one I care about.

My workflow is:
1) Compile the binary using CMake
2) Upload the binary to a remote machine

I want the size of the binary to be small to ensure a fast upload, so I guess I'm mostly concerned about object size?

(6) By Stephan Beal (stephan) on 2020-07-31 15:21:29 in reply to 5 [link] [source]

I want the size of the binary to be small to ensure a fast upload, so I guess I'm mostly concerned about object size?

Don't worry so much about the size of sqlite3.c. As a point of comparison, the Fossil SCM project makes heavy use of sqlite and imports the latest sqlite3.c on a regular basis. The 64-bit binary for fossil is, as of this moment, right at 4.3MB, including all of fossil's (not inconsiderable) code and its own copy of sqlite.

[stephan@st3v3:~/fossil/fossil]$ size fossil
   text	   data	    bss	    dec	    hex	filename
4212051	  82704	  18136	4312891	 41cf3b	fossil
[stephan@st3v3:~/fossil/fossil]$ size bld/sqlite3.o 
   text	   data	    bss	    dec	    hex	filename
1035169	  18056	   1272	1054497	 101721	bld/sqlite3.o

i have another project which uses its own copy of sqlite and its "size" is only 1.3MB (built with many of sqlite3's extra features enabled), considerably smaller than the 8MB sqlite3.c.

If your embedded environment only has 512kb of memory you will run into problems, but if it's got a "reasonably modern" amount of memory (at least a few tens of MB of RAM and a few spare MB of long-term storage), you'll almost certainly have no resource-related problems which lead back to sqlite.

(7) By Richard Hipp (drh) on 2020-07-31 15:27:00 in reply to 5 [link] [source]

Are you compiling an instance of the sqlite3 command-line tool to be uploaded? In that case, be sure to compile with -Os and run "strip" on the binary, before you upload it.

So you are mostly concerned with upload bandwidth, or with the size of the file after it lands on the remote machine?

Note that the command-line tool contains lots and lots of extensions that I don't think you can reasonable omit. If the point of this exercise is to run a single "CREATE TRIGGER" statement on the remote machine, you would do well to build a custom C program that does just that one CREATE TRIGGER, link it against SQLite, and upload that.

You might already have an SQLite shared library on the remote machine. So you might not even need to link it against SQLite as it might be able to use the existing shared library. Your upload could be as small as 5K or 10K. But only you can determine this, because only you know your circumstances and the problem you are trying to solve.

(8) By anonymous on 2020-07-31 15:31:16 in reply to 7 [link] [source]

I think this will need to be my solution. The remote machine already runs sqlite, I just want my C program to be able to CREATE TRIGGER. So sounds like I should just create a program, possibly referencing the .c file for implementation, to create the trigger?

(9.1) By Larry Brasfield (LarryBrasfield) on 2020-07-31 16:27:25 edited from 9.0 in reply to 8 [link] [source]

Yes, that is the suggestion. From the SQLite C API you will use one of the sqlite3_open functions, the sqlite3_exec function, and the sqlite3_close function.

To use the shared library, you will want to link against the shared library rather than "referencing" the SQLite .c file. Compilation will reference sqlite3.h, and linking will include a platform-dependent file which is, or stands in for, the shared library in loadable/executable form.

(10) By Warren Young (wyoung) on 2020-07-31 16:37:23 in reply to 8 [source]

The remote machine already runs sqlite

In what form? There are at least three things that phrase could mean:

  1. It's embedded into some other application, which means your app must carry a second copy, presumably also embedded.

  2. Special case of #1, it's embedded into a copy of the SQLite shell, sqlite3, and the existing users are making calls through the shell to get anything done. Yet, if this is the case, then I don't see why you'd be talking about writing your own C program at all. You'd be using the same shell-based mechanisms.

  3. It's a shared library, so both your program and this other one can share it, which then calls into question why you're worried about space at all. It's paid for already.

If you have two C programs both needing SQLite and space is at a premium, you should be trying for #3.

I just want my C program to be able to CREATE TRIGGER.

Why, exactly?

A trigger without INSERT, DELETE or UPDATE statements is useless.

I assume this other program is modifying the DB, and you're trying to react to that, but that then calls into question what you're trying to do when the DB gets modified.

I ask because if we take your quoted post literally, then you don't need to write a custom C program at all. You could upload a copy of sqlite3 built for the remote target, create the trigger by typing it into the shell, and then you're done, right? I assume that isn't right; thus the probing questions.

(11) By anonymous on 2020-08-06 14:14:12 in reply to 7 [link] [source]

Hey I'm back, so I did some more research on shared libraries and it turns out the remote machine has a /usr/lib/libsqlite3.so file.

So in this case, is it possible to not have any sqlite source code in my repo/compiled binary and instead add linking flags to link against that library once it's run on the remote machine?

(12) By Warren Young (wyoung) on 2020-08-06 14:16:56 in reply to 11 [link] [source]

(13) By anonymous on 2020-08-06 14:39:45 in reply to 12 [link] [source]

Is there a 32 bit version of the .so for linux?

(15) By Adrian Ho (lexfiend) on 2020-08-07 07:10:46 in reply to 13 [link] [source]

Yes, but there's no guarantee that a random 32-bit libsqlite.so would actually work on your remote machine. For starters, Linux runs on more 32-bit architectures than just Intel x86, especially in the embedded arena.

Even if it's the same architecture, a library built on your typical Linux distro probably won't work on a musl-based remote OS.

So unless you're planning to drop Yet Another SQLite Library on your remote machine, you really should be using what's already available. In fact, if your remote machine is significantly different architecturally from your local one, you should be building your program on the remote box for maximum success.

(14) By anonymous on 2020-08-06 17:16:42 in reply to 1 [link] [source]

If the only thing you want to do is add a trigger to the database schema, then I suppose it might work to lock the database, create a record for the trigger, add it to the btree of the schema table, increment the schema cookie number and file change counter, and then unlock it. But another question is, do you need WAL mode?