SQLite Forum

RFC: Built-in way to "decode" binary data
Login

RFC: Built-in way to "decode" binary data

(1) By ddevienne on 2023-07-05 13:49:42 [source]

Hi,

Below's a SQL snippet which I'm using as a generated column
accessing the "row" column ([thanks to that SO post][1]),
which contains encoded binary data.

I'm dynamically extracting a name at a known offset,
with its length in the previous 4-byte big-endian integer.

In my case, I know the names are unlikely to be large,
thus I get away with reading and converting just 1 byte of the 4.
Without that simplication, the SQL would be even more unwieldly,
and if it was 8-byte integer, even more so (16 lines of instr and co...)

The purpose of this post is to say there ought to be a better way to do this.

Yes I could easily write custom C functions for that purpose, but those DBs
are meant to be readable and useable from any SQLite tool (CLI or UI),
thus only built-in functionality is acceptable. And IMHO, loadable extensions
are not practical enough, in part because there's no way to declare in the DB
itself the one(s) the tool ought to load to make sense of the schema.

The functionality needed is similar to the od one from Linux I guess.

I hope this is not too niche to be considered for a future SQLite version.

Thanks, --DD

name as (
  cast(
    substr(
      "row", 15, -- 1-based offset of the start of the name
      ( instr("123456789ABCDEF", substr(hex(substr("row", 14, 1)), 1, 1)) << 4
      | instr("123456789ABCDEF", substr(hex(substr("row", 14, 1)), 2, 1))
      ) -- name length (assumed <= 255) from the least-sign byte of big-endian int32
    )
    as text
  )
)

[1]: https://stackoverflow.com/a/70296198/103724 for the instr+substr+hex

(2) By ddevienne on 2023-07-05 14:12:52 in reply to 1 [link] [source]

I guess another way to make this more pallatable, would be a way to define pure-SQL functions, defined in SQL and stored in sqlite_master as usual.

That way at least the long and ugly SQL expression can be factored once and for all in a reusable function.

As much as I'd like that, I'm fairly sure adding an od-like built-in function would be easier.