SQLite

The Chronicles of SONAME

(1.2) By Stephan Beal (stephan) on 2024-11-22 16:28:35 edited from 1.1

This thread exists primarily to document the vagaries of "SONAME" in the libsqlite3.so deliverable, including:

  • Historical usage
  • Client-side calls for its inclusion, in response to (indirectly) removing it
  • Side effects of using it or not
  • Current (3.48) planned usage

First off: what is soname?

To the best of my fallible understanding:

  • It's a marker in a DLL which helps dynamic linkers to choose the right file when linking. The marker is conventionally named libfoo.so.#, where # is an ABI version number. It is intended to be independent from the project's version number.

  • When a binary links, at build-time, against a DLL, and the DLL has an soname, the binary records that soname so that when the lib is dynamically resolved later on, it can determine whether the DLL is "compatible" with the one it originally linked to. The dynamic linker may (and arguably should) refuse to later link against a DLL which has a different soname than the one recorded in the binary. On Linux, the soname conventionally maps directly to a file name without a path component - the path comes from the runtime linker, e.g. influenced by $LD_LIBRARY_PATH.

  • Ideally, the soname stays stable for as long as a library retains a compatible ABI. This project has never(?), within the v3 lifetime, introduced an incompatible ABI change, and so its historical soname of libsqlite3.so.0 is still legitimate. (That argues strongly for continuing to set it, but doing so has unfortunate side-effects now that we no longer use libtool - more on this below.)

  • Wikipedia covers it and provides links to related helpful articles: https://en.wikipedia.org/wiki/Soname

A particularly good summary of soname's purpose can be found in forum:046133a7da9d4732, and we'll refer back to that post later because it helps us solve(?) the conundrum of whether or not to continue to set the soname.

How it all began

The topic of "soname" was first brought to my attention via an off-list email about the new build system and, almost in parallel, forum:12e0037b88dd3. Both users opine that the libsqlite3.so deliverable requires an soname. That opinion led to the following reaction on my part:

  • In 25+ years of building and using my own DLLs (without libtool), i'd never heard of soname, and certainly never used it, so it's demonstrably not necessary.

  • Even so, the concern has been raised twice in parallel, so let's look into it.

To be clear: setting the soname is not difficult - it's just a linker flag set when creating the DLL. It is, however, a platform-specific detail and we want to limit platform/target-specific, out-of-source special handling to its bare minimum.
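
As a rough illustration of that linker flag, the following sketch builds a throwaway DLL with an soname and then reads the tag back out of both the DLL and an app linked against it. It assumes a Linux-like system where cc drives a GNU-compatible linker and readelf is installed. The names libdemo, demo.c, and app.c are hypothetical, invented for this example, and are not part of the SQLite tree.

```shell
#!/bin/sh
# Hypothetical demo: libdemo, demo.c and app.c are made-up names.
workdir=$(mktemp -d)
soname_seen=""
needed_seen=""

if command -v cc >/dev/null 2>&1 && command -v readelf >/dev/null 2>&1; then
  printf 'int demo(void){ return 42; }\n' > "$workdir/demo.c"
  printf 'int demo(void);\nint main(void){ return demo() == 42 ? 0 : 1; }\n' \
    > "$workdir/app.c"

  # -Wl,-soname,NAME passes -soname to the linker, which stores NAME
  # in the DLL's dynamic section.
  cc -shared -fPIC -Wl,-soname,libdemo.so.0 \
     -o "$workdir/libdemo.so" "$workdir/demo.c"

  # Link an app against the DLL by its plain name (-ldemo).
  cc -o "$workdir/app" "$workdir/app.c" -L"$workdir" -ldemo

  # The DLL carries a SONAME entry ...
  soname_seen=$(readelf -d "$workdir/libdemo.so" |
    sed -n 's/.*(SONAME).*\[\(.*\)\].*/\1/p')
  # ... and the app records that soname, not "libdemo.so", as NEEDED.
  needed_seen=$(readelf -d "$workdir/app" |
    sed -n 's/.*(NEEDED).*\[\(libdemo[^]]*\)\].*/\1/p')
fi

echo "SONAME in DLL: ${soname_seen:-<not determined>}"
echo "NEEDED in app: ${needed_seen:-<not determined>}"
```

If the toolchain assumptions hold, both lines should report libdemo.so.0, which is why the app will later look for a file of that name rather than for libdemo.so.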

In projects using the GNU Autotools, the soname is set automagically by libtool, whether one likes it or not.

What, precisely, is the problem?

In essence, client apps (on affected platforms) which link to a -lsqlite3 which was built by the canonical pre-3.48 build expect to be able to resolve -lsqlite3 to a file named libsqlite3.so.0 - the value stored in the soname tag in libsqlite3.so. This actually differs on OpenBSD, where the package maintainers apparently set the SONAME to libsqlite3.so.X.Y, where X.Y is some mysterious combination of values which has nothing to do with the library version number, e.g. X.Y on a handy OpenBSD installation is 37.26 for libsqlite3 3.44.2. The fact that this differs on that platform serves to emphasize how the soname is a platform-specific detail.

Those who raised this topic initially point out that not setting the soname will break countless clients which are linked against older libs which do set the soname. That argues strongly for setting the soname. Whether or not that concern is globally valid across all OSes is as yet unknown, but on Linux, at least, there's an option which eliminates that concern, and we'll get to that soon.

If we do set the soname then, because we no longer use libtool, the in-build-tree apps may link to DLLs outside the build tree, even if the build environment has $LD_LIBRARY_PATH set to include the build tree. That can be seen by doing:

$ ./configure --prefix=$HOME --all --dynlink-tools --soname=legacy # sets the soname to libsqlite3.so.0
...

$ make sqldiff
...
$ ldd sqldiff
	linux-vdso.so.1 (0x00007ffcb355f000)
	libsqlite3.so.0 => /home/stephan/lib/libsqlite3.so.0 (0x0000753ecf0b5000)
...

Note the second line of the ldd output. The dynamic linker is looking for libsqlite3.so.0 because that's the soname which is in the ./libsqlite3.so which sqldiff linked to at build-time. Then, at run-time (that is, at dynamic-linking time), sqldiff picks up a different libsqlite3: the one installed under $HOME/lib rather than the one in the build tree.

If we build without the soname then that result changes:

$ ./configure --prefix=$HOME --all --dynlink-tools
...
$ make sqldiff
...
$ ldd sqldiff
	linux-vdso.so.1 (0x00007ffe5dddc000)
	libsqlite3.so => ./libsqlite3.so (0x00007a96aa96e000)
...

N.B. that works because my $LD_LIBRARY_PATH has a prefix of ".", which i've always found to be a necessity when developing libraries. An alternative is to create a symlink to the build tree's libsqlite3.so somewhere in either the system's library path or in ~/lib, and add ~/lib to $LD_LIBRARY_PATH.

Yet another approach which lets us both set the soname and use the in-build-tree DLL is to add a symlink which matches the soname:

$ ./configure --prefix=$HOME --all --dynlink-tools --soname=legacy
$ ln -s libsqlite3.so libsqlite3.so.0
$ make sqldiff
$ ldd sqldiff
	linux-vdso.so.1 (0x00007ffeee3e0000)
	libsqlite3.so.0 => ./libsqlite3.so.0 (0x000076734c791000)

(Again, that works because my $LD_LIBRARY_PATH includes this dir.)

What, realistically, are our options?

To the best of my fallible knowledge, we have the following options:

  • Set the soname to libsqlite3.so.0. Side-effects:

    • Downstream clients linked to legacy builds continue to work as-is.
    • Linking to in-build-tree copies of libsqlite3.so breaks unless we use the symlink hack demonstrated above.

  • Set the soname to an updated value, namely libsqlite3.so.3. This, it turns out, has only down-sides. We break all existing links to the lib and we break in-build-tree binaries which link to ./libsqlite3.so. Since the ABI has not changed, libsqlite3.so.0 is a perfectly legitimate (if slightly unsightly) value for the soname.

  • We don't set the soname, and instead add a symlink at install-time named libsqlite3.so.0 which points to libsqlite3.so.3.48.0. This, experimentation suggests, gives us the best of both worlds:

    • Downstream clients linked to legacy builds can resolve the soname. Because the ABI hasn't changed, they still run.
    • We can link to, and run, in-build-tree libsqlite3.so without any hoop-jumping beyond adjusting (if needed) the building user's $LD_LIBRARY_PATH.

It was always my intent to install the DLL in what, to my eyes, is a conventional fashion:

  • libsqlite3.so.MAJOR.MINOR.PATCH is the real DLL
  • libsqlite3.so.3 is a symlink to libsqlite3.so.MAJOR.MINOR.PATCH
    (Edit: in hindsight, we don't need this unless we set an soname of libsqlite3.so.3, so this link can be removed.)
  • libsqlite3.so is a symlink to libsqlite3.so.MAJOR.MINOR.PATCH

Clients link against -lsqlite3, which picks up libsqlite3.so, which leads them to libsqlite3.so.#.#.#.

However:

  • We cannot set soname to libsqlite3.so.0 if we do that because the soname needs to match a file name.
  • It breaks clients which linked against pre-3.48 DLLs, as those are looking for a DLL which matches the soname of libsqlite3.so.0.

A snippet from forum:046133a7da9d4732 provides a non-invasive workaround:

"Adding libsqlite.so.0 as a symlink would probably be sufficient -- as long as the filename matches I don't think the loader verifies that the SONAME is the one that was requested."

By adding an additional symlink to the install process, we resolve the linking issues both for our in-build-tree purposes and for all legacy-linked clients tested so far:

  • libsqlite3.so.0 is a symlink to one of the above-listed symlinks or the DLL (it doesn't really matter which one, so long as it reaches libsqlite3.so.MAJOR.MINOR.PATCH).

With that symlink in place at install-time, we appease legacy-linked clients and no longer interfere with in-build-tree linking. (Sidebar: it is not known whether that also resolves the problem on OpenBSD, but their package maintainers apparently set an soname of their own devising, and they are free to continue to do so.)
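
The resulting install-time layout can be sketched in a scratch directory. This is only an illustration, not the project's actual install rule: the empty file stands in for the real DLL, 3.48.0 is used as an example version, and the prefix is a temporary directory rather than a real installation prefix.

```shell
#!/bin/sh
# Illustration only: a scratch directory stands in for $prefix/lib.
prefix=$(mktemp -d)
libdir="$prefix/lib"
mkdir -p "$libdir"

# Stand-in for the real DLL (an empty file, not a loadable library).
: > "$libdir/libsqlite3.so.3.48.0"

# What -lsqlite3 resolves to at build-time.
ln -s libsqlite3.so.3.48.0 "$libdir/libsqlite3.so"

# Placeholder for clients which recorded the historical soname.
ln -s libsqlite3.so.3.48.0 "$libdir/libsqlite3.so.0"

ls -l "$libdir"

# Both names lead to the same file.
legacy_target=$(readlink "$libdir/libsqlite3.so.0")
dev_target=$(readlink "$libdir/libsqlite3.so")
echo "libsqlite3.so.0 -> $legacy_target"
echo "libsqlite3.so   -> $dev_target"
```

Because the symlinks are relative, the directory can be relocated as a unit without breaking them.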

Current Resolution

  • The current (as of this writing) configure process does not set the soname and the installation installs a symlink named libsqlite3.so.0 as a placeholder for clients linked with that historical soname.

  • The --soname=legacy|none|* configure flag can, if it proves necessary for a particular environment, be used to set a specific soname. Its values are:

    • "none" or an empty string sets no soname.
    • "legacy" uses an soname of libsqlite3.so.0.
    • Any value which matches the glob libsqlite3.so.* is used as-is, primarily to support builds, like OpenBSD's, which set a custom soname.
    • Any other value is assumed to be a suffix, which gets applied as libsqlite3.so.THE_SUFFIX.
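
Those rules can be restated as a small shell function. This is a hedged sketch of the mapping described in the list above, not the configure script's actual code; resolve_soname is a hypothetical name chosen for this example.

```shell
#!/bin/sh
# resolve_soname is a hypothetical helper which mirrors the rules listed
# above. It prints the soname to use, or nothing if none is to be set.
resolve_soname() {
  case "$1" in
    ""|none)          ;;                                # no soname
    legacy)           printf '%s\n' 'libsqlite3.so.0' ;;
    libsqlite3.so.*)  printf '%s\n' "$1" ;;             # used as-is
    *)                printf '%s\n' "libsqlite3.so.$1" ;; # treated as a suffix
  esac
}

for value in none legacy libsqlite3.so.37.26 3; do
  printf '%s => [%s]\n' "--soname=$value" "$(resolve_soname "$value")"
done
```

The order of the case branches matters: the literal values and the libsqlite3.so.* glob are checked before the catch-all suffix rule.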