checking for OOM in sqlite3_column_blob

(1) By anonymous on 2025-04-05 21:54:27 [link] [source]

The documentation for sqlite3_column_blob currently states:

If an out-of-memory error occurs, then the return value from these routines is the same as if the column had contained an SQL NULL value. Valid SQL NULL returns can be distinguished from out-of-memory errors by invoking the sqlite3_errcode() immediately after the suspect return value is obtained and before any other SQLite interface is called on the same database connection.

However, this does not explain how to leverage sqlite3_errcode. In particular, it seems that if there was not an OOM error, then sqlite3_errcode will return the same value it did before the call to sqlite3_column_blob, and not something reliable like SQLITE_OK. In my case, that meant it returned SQLITE_ROW, but in theory it could be anything.

Consequently, I don't think it is possible to rely on the return value from sqlite3_errcode to have any particular meaning in this case under the current implementation.

As an example:

#include <stdlib.h>
#include <stdio.h>
#include <sqlite3.h>

int main(int argc, char** argv) {
    sqlite3* db;
    printf("sqlite3_open_v2: %d\n", sqlite3_open_v2(":memory:", &db, SQLITE_OPEN_READWRITE, NULL));

    sqlite3_stmt *stmt1;
    printf("sqlite3_prepare_v2: %d\n", sqlite3_prepare_v2(db, "SELECT zeroblob(0)", -1, &stmt1, NULL));

    printf("sqlite3_step: %d\n", sqlite3_step(stmt1));

    sqlite3_stmt *stmt2;
    printf("sqlite3_prepare_v2: %d\n", sqlite3_prepare_v2(db, "SELECT", -1, &stmt2, NULL));

    const void* b = sqlite3_column_blob(stmt1, 0);
    printf("sqlite3_column_blob() == NULL? %d\n", b == NULL);
    printf("sqlite3_errcode: %d\n", sqlite3_errcode(db));

    return EXIT_SUCCESS;
}

This sample program shows that sqlite3_errcode at the end returns SQLITE_ERROR (the return value from the second prepare), despite the intervening call to sqlite3_column_blob.

The same problem exists for sqlite3_column_text.

(2) By Richard Hipp (drh) on 2025-04-05 22:03:48 in reply to 1 [link] [source]

Here's an alternative algorithm: Call sqlite3_column_type() first. If the value returned is SQLITE_BLOB, and then if the subsequent sqlite3_column_blob() returns zero, you know you got an OOM.

(3) By anonymous on 2025-04-05 22:38:39 in reply to 2 [link] [source]

That doesn't work.

#include <stdlib.h>
#include <stdio.h>
#include <sqlite3.h>

int main(int argc, char** argv) {
    sqlite3* db;
    printf("sqlite3_open_v2: %d\n", sqlite3_open_v2(":memory:", &db, SQLITE_OPEN_READWRITE, NULL));

    sqlite3_stmt *stmt;
    printf("sqlite3_prepare_v2: %d\n", sqlite3_prepare_v2(db, "SELECT zeroblob(0)", -1, &stmt, NULL));

    printf("sqlite3_step: %d\n", sqlite3_step(stmt));

    printf("sqlite3_column_type() == SQLITE_BLOB? %d\n", sqlite3_column_type(stmt, 0) == SQLITE_BLOB);

    const void* b = sqlite3_column_blob(stmt, 0);
    printf("sqlite3_column_blob() == NULL? %d\n", b == NULL);

    return EXIT_SUCCESS;
}

output:

sqlite3_open_v2: 0
sqlite3_prepare_v2: 0
sqlite3_step: 100
sqlite3_column_type() == SQLITE_BLOB? 1
sqlite3_column_blob() == NULL? 1

(7.1) By DrkShadow (drkshadow) on 2025-04-06 06:58:09 edited from 7.0 in reply to 3 [link] [source]

Skipped a step..

    int retrieved_type = sqlite3_column_type(stmt, 0);
    printf("sqlite3_column_type() == SQLITE_BLOB? %d\n", retrieved_type == SQLITE_BLOB);

    if (retrieved_type == SQLITE_BLOB) {
        const void* b = sqlite3_column_blob(stmt, 0);
        printf("sqlite3_column_blob() == NULL? %d\n", b == NULL);
        if (b == NULL) {
            // Could be zero, could be error
            int res = sqlite3_errcode();
            if (res != SQLITE_OK) {
                fprintf(stderr, "Got error for sqlite3_blob: %i\n", res);
            }
        }
    }

Specifically skipped: "// Could be zero, could be error, for type BLOB"

Key confusing point, "Valid SQL NULL returns can be distinguished from out-of-memory errors by invoking the sqlite3_errcode() immediately after the suspect return value is obtained and before any other SQLite interface is called"

(8) By anonymous on 2025-04-06 17:14:32 in reply to 7.1 [link] [source]

As already mention, that doesn't work. When sqlite3_column_blob returns a legitimate NULL (i.e., it's not an out-of-memory condition), the return value of sqlite3_errcode will not be SQLITE_OK, but rather whatever value it had previously, which could be anything at all.

(9.1) By DrkShadow (drkshadow) on 2025-04-06 23:00:16 edited from 9.0 in reply to 8 [link] [source]

I see, sorry.

I would agree that it's not appropriate for the error code to remain - but for a work-around, you could reset it. After anything else and before sqlite_column_blob(), you could sqlite_exec(db, "SELECT NULL WHERE NULL"); and that will (certainly?) be a success. Then sqlite_column_blob() to get the value, and check error code.

Well.. maybe. maybe not. :-/

sucky.. I'd agree that there's a bug if sqlite_column_* are expected to give error codes on failure, but do not set error codes on success. (disclaimer: I am not an sqlite dev)

(4) By Bo Lindbergh (_blgl_) on 2025-04-06 00:47:02 in reply to 2 [link] [source]

Suggestion: add sqlite3_column_blob_v2 and sqlite3_value_blob_v2 functions that don't return NULL for zero-length blobs.

(5.1) By Stephan Beal (stephan) on 2025-04-06 00:58:36 edited from 5.0 in reply to 4 [source]

Suggestion: add sqlite3_column_blob_v2 and sqlite3_value_blob_v2 functions that don't return NULL for zero-length blobs.

Hypothetically... something like this?

const void * pBlob = 0;
int nBlob = 0;
int rc = sqlite3_column_blob_v2(stmt, col, &pBlob, &nBlob /*optional - may be NULL*/);
if( rc ) { ... error... }
else { ... use pBlob ... }

:-?

(6) By Bo Lindbergh (_blgl_) on 2025-04-06 01:45:05 in reply to 5.1 [link] [source]

That would do the job. But if you're really going to refresh things, please add a transparent counterpart of the opaque sqlite3_value:

typedef union sqlite3_any {
    unsigned char type;
    struct {
        unsigned char type;
        sqlite3_int64 val;
    } any_int;
    struct {
        unsigned char type;
        double val;
    } any_float;
    struct {
        unsigned char type;
        unsigned char encoding;
        void const *val;
        size_t size;
    } any_text;
    struct {
        unsigned char type;
        void const *val;
        size_t size;
    } any_blob;
} sqlite3_any;

int sqlite3_column_any(
    sqlite3_stmt *stmt,
    int col,
    sqlite3_any *any);

int sqlite3_bind_any(
    sqlite3_stmt *stmt,
    int param,
    sqlite3_any const *any,
    sqlite3_destructor_type destroy);

(10.2) By Mark Benningfield (mbenningfield1) on 2025-06-30 11:59:17 edited from 10.1 in reply to 5.1 [link] [source]

Deleted

(11) By Mark Benningfield (mbenningfield1) on 2025-06-30 12:04:35 in reply to 5.1 [link] [source]

That would be nice, since this is the best workaround I could come up with in my wrapper code:

if (sqlite3_column_type(pStmt, colIdx) == SQLITE_NULL) {
  return NULL;
}
// Make sure that we own the error code

sqlite3 *pDb = sqlite3_db_handle(pStmt);
sqlite3_mutex_enter(sqlite3_db_mutex(pDb));

int ec1 = sqlite3_errcode(pDb);

// call these in order per the SQLite docs
unsigned char *pBuff = sqlite3_column_blob(pStmt, colIdx);
int cb = sqlite3_column_bytes(pStmt, colIdx);

int ec2 = sqlite3_errcode(pDb);

sqlite3_mutex_leave(sqlite3_db_mutex(pDb));

// since we have determined that the column value is not null, a null
// pointer is either a zero-length blob, or an OOM error (not very likely,
// but technically possible).
if (ec1 == SQLITE_NOMEM) {
  // there is already an OOM error flagged, so just bail.
  return NULL;
}
if (ec2 == SQLITE_NOMEM) {
  // this is our OOM, so throw
  // ... error handling code
}

// ... copy over the bytes

The new functions would make this a LOT simpler.

(12) By Stephan Beal (stephan) on 2025-07-01 08:54:22 in reply to 11 [link] [source]

Hypothetically... something like this? ...

That would be nice...

We're currently experimenting with new APIs in that form. They're not yet complete, and it's not yet decided whether they will go into the trunk, but you are welcomed to try them out and help shape them:

src:/timeline?r=column-text-blob-v2

(14) By Bo Lindbergh (_blgl_) on 2025-07-01 15:38:29 in reply to 12 [link] [source]

The new functions still give you null pointers for zero-length blobs, so you still have to call sqlite3_value_type / sqlite3_column_type to distinguish that case from an SQL NULL. Inconvenient!

(15.1) By Stephan Beal (stephan) on 2025-07-01 16:40:27 edited from 15.0 in reply to 14 [link] [source]

The new functions still give you null pointers for zero-length blobs

The current approach is not to provide new semantics, but to simplify checking of that pesky OOM case for the text/blob getters.

so you still have to call sqlite3_value_type / sqlite3_column_type to distinguish that case from an SQL NULL. Inconvenient!

Can you propose an API shape which doesn't require that? Perhaps we should add an optional output pointer which holds the value's type?

Edit: that addition is now in the branch but ideas for alternative solutions are welcomed.

(16) By Bo Lindbergh (_blgl_) on 2025-07-01 16:31:59 in reply to 15.0 [link] [source]

These are new functions; there are no old semantics to preserve!

Making them more consistent with how things work when moving data in the other direction would be a good thing. You need a non-null pointer to pass a zero-length blob to sqlite3_bind_blob* / sqlite3_result_blob*, because they treat a null pointer as meaning an SQL NULL.

(17) By Stephan Beal (stephan) on 2025-07-01 16:54:17 in reply to 16 [link] [source]

These are new functions; there are no old semantics to preserve!

There are internal semantics which must not be changed. e.g. column type coercion and text re-encoding is handled by the routines which fetch the sqlite3_value memory. The new APIs build off of those, and exposing a non-NULL, length-0 blob would go against their long-standing behaviors and assumptions and give users yet one more way to corrupt the library's memory (by handing them a pointer they don't otherwise have). All that said...

You need a non-null pointer to pass a zero-length blob...

FWIW, i agree 100% that the current handling of empty blobs requires undue client-side hoop-jumping at times. However, the semantics of "do not expose the underlying pointer for a zero-length blob" are deep-seated and inventing new semantics on top of the current internals currently feels ill-advised to me. That feeling may change once those internals are more familiar to me (this past day has been my first time working on that corner of the code).

In case you missed the edit: see my prior response to a link to the value type output pointer addition.

PS: there's no current deadline on this experiment - we're free to tinker with it as much as we like until it's "right".

(21) By Bo Lindbergh (_blgl_) on 2025-07-02 09:36:30 in reply to 17 [link] [source]

So don't expose any internal pointers; return something unrelated in the zero-length case. Nowhere in the documentation does it say that these pointers have malloc-like semantics (either NULL or distinct from every other currently valid pointer).

static unsigned char const emptyBlob[1] = {0};

(23) By Stephan Beal (stephan) on 2025-07-02 10:21:39 in reply to 21 [link] [source]

So don't expose any internal pointers; return something unrelated in the zero-length case.

That's a fair approach, and i'll discuss it with Dan and Richard. Giving clients a pointer they can't do anything with except compare to NULL gives me a bad feeling in my stomach, but Richard and Dan are better informed on the matter, so i'll defer to their judgment on it.

FWIW, my counter-arguments against it include...

Clients have to check either the length or the pointer before using it. If we return non-NULL here they still have to check for a length of 0 before accessing the pointer. That's fundamentally no different than comparing for a NULL pointer when using the v1 API, so any argument that client code can avoid a NULL check is moot - it's transformed into a length check instead of a NULL check.

The intent of this change has been to simplify checking for out-of-memory, not to introduce new semantics. My expectation from a client-side perspective is that text/blob_v2() should be semantically compatible with their v1 counterparts, just like we can use prepare_v3() as a compatible replacement for prepare() and prepare_v2(), but we cannot do that if they have different zero-length-blob behavior.

Sidebar: during development of this API, while attempting to use it as a drop-in replacement in some test code, one of the existing tests broke because blob_v2 was inadvertently returning non-NULL for length-zero blobs (due to a bug on my part). The setup, grossly simplified, boiled down to:

const void * getBlob(sqlite3_stmt *q, int col){
#if 1
  return sqlite3_column_blob(q, col);
#else
  const void * p = 0;
  sqlite3_column_blob_v2(q, col, &p, NULL, NULL);
  return p;
#endif
}

Those should, IMO, behave the same. Technical considerations aside, not doing so would be in violation of the Principle of Least Surprise (assuming we've already gotten over the surprise of being passed NULL for a valid blob, that is ;).

(24) By Stephan Beal (stephan) on 2025-07-02 13:28:32 in reply to 23 [link] [source]

That's a fair approach, and i'll discuss it with Dan and Richard.

Cooler heads prevail again: based on this thread and an off-list discussion, the two blob_v2() variants now return an "opaque non-NULL pointer" for length-0 blobs. They will still return 0/NULL for an SQL NULL. We also removed the value-type output pointer argument, as it adds more noise than utility.

(25) By Bo Lindbergh (_blgl_) on 2025-07-03 00:16:02 in reply to 24 [link] [source]

Thank you!

(18) By Nuno Cruces (ncruces) on 2025-07-01 22:39:10 in reply to 16 [link] [source]

Note that you can use the zeroblob functions to avoid the need for a pointer.

(20.1) By Stephan Beal (stephan) on 2025-07-02 08:31:23 edited from 20.0 in reply to 15.1 [link] [source]

Edit: that addition [an output pointer for the column type] is now in the branch but ideas for alternative solutions are welcomed.

@Bo i'm not convinced that that additional argument has enough generic utility to warrant the "noise" it adds at every call point which does not care about the column's original type. If the caller cared about the type, they probably wouldn't be extracting it as text/blob or they'd dispatch based on that type before fetching its value.

My current inclination is to remove that argument.

In practice, client code tends to either (A) dispatch handling of a column by its type, handling the value on a per-type basis, or (B) fetch the value as type X and don't care what the underlying type is. Having this extra output pointer doesn't help with either of those, and the rare cases which need both can still use sqlite3_column_type() or sqlite3_value_type(), as appropriate.

Internally, fetching the column's value type from the sqlite3_value/column_text/blob_v2() interface is precisely as expensive as it is for a client to do so with sqlite3_value/column_type() - there's no shortcut for it. It's just as expensive for the library to do as it is for the client, but the internal handling of this column-type output pointer adds CPU cycles (via an if(theOutputPtr!=0) check) for every user, not just the ones want to know the column's type.

Unless a compelling argument can be made for keeping it, it's likely to be removed. Arguments for keeping it are welcomed.

(22) By Bo Lindbergh (_blgl_) on 2025-07-02 10:06:24 in reply to 20.1 [link] [source]

If you want the original type, you might want the original text encoding as well, and that brings us all the way to reply 6 in this thread. Don't stop halfway. :-)

More brainstorming: add a new result code that the *_v2 functions can return in case of SQL NULL:

#define SQLITE_OK_NULL                 (SQLITE_OK | (3<<8))

(19) By Mark Benningfield (mbenningfield1) on 2025-07-02 07:30:26 in reply to 12 [link] [source]

I'm not too particular about the exact form of the newer versions. The main sticking point is that presently, we have to call sqlite3_errcode and pray.

If the new versions provide enough information (in whatever form) to distinguish -- at the call site -- if we have an empty array, a null value, or an OOM, then I'm tickled.

I know it's tricky, because according to the standard, there's no such thing as a zero-length array, so you have to be able to maintain that invariant while still being able to say "This column has an empty array in it".

(13) By Roger Binns (rogerbinns) on 2025-07-01 14:17:41 in reply to 5.1 [link] [source]

sqlite3_column_blob_v2(stmt, col, &pBlob, &nBlob)

I appreciate the length being part of the API rather than having to separately find it out. Reduces my code size.