SQLite Forum

Comment about application file format
> Microsoft Office format does use a wrapped pile of files

Perhaps drh was thinking of the pre-[OOXML](https://en.wikipedia.org/wiki/Office_Open_XML) formats. It does say "DOC" rather than "DOCX" and so forth. That interpretation is somewhat untenable, though, given that the referenced document [only goes back to 2014](https://www.sqlite.org/docsrc/finfo?name=pages/appfileformat.in).

> it isn't ZIP

OOXML most definitely is ZIP-based.

I just created a PPTX file in PowerPoint 2019, and "`unzip -t`" reads the resulting document successfully. From my reading about this [CFBF] format, a file in that format shouldn't pass this test. CFBF is reportedly the basis of the MSI format, for example, and the unzip test doesn't pass on an MSI file I created recently.
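A rough way to tell the two containers apart without `unzip` is to look at the leading bytes: ZIP archives normally start with `PK\x03\x04`, while CFBF files start with the eight-byte signature `D0 CF 11 E0 A1 B1 1A E1`. The sketch below is only an illustration. The name `sniff_container` is made up for this post, and it checks signatures only, not whether the rest of the file is well-formed.

```python
import zipfile

# Leading signatures of the two container formats under discussion.
ZIP_MAGIC = b"PK\x03\x04"                          # local file header of a non-empty ZIP
CFBF_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"   # Compound File Binary Format header


def sniff_container(path):
    """Return 'zip', 'cfbf', or 'unknown' based on the file's signature.

    Hypothetical helper for illustration; it does not validate the file.
    """
    with open(path, "rb") as f:
        head = f.read(8)
    if head.startswith(CFBF_MAGIC):
        return "cfbf"
    # zipfile.is_zipfile() looks for the end-of-central-directory record,
    # so it also accepts ZIPs with data prepended (e.g. self-extractors).
    if head.startswith(ZIP_MAGIC) or zipfile.is_zipfile(path):
        return "zip"
    return "unknown"
```

I would expect this to report `zip` for the PPTX and `cfbf` for the MSI, matching what the `unzip -t` test showed.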

How do I get Office to produce a CFBF-based file, one that fails this unzip test?

The relevance, of course, is to the referenced SQLite doc: until we know what it takes to produce a CFBF-based doc, it doesn't seem useful to be talking about CFBF in that doc, except perhaps in reference to MSI.

> I am not sure how to propose adding new things to the list of application ID numbers into the magic.txt

It depends on which implementation of `file(1)` that `magic.txt` comes from, but statistically, it's likely to be [this one][file]. You may find the core of this project referenced as "`libmagic`", since it has wrappers other than `file(1)`.
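For context on what such a magic entry would key on: SQLite stores the application ID as a 4-byte big-endian integer at offset 68 of the database header, and the header itself begins with the 16-byte string `SQLite format 3\0`. Here is a minimal sketch, with hypothetical function names and an ID value you would pick yourself, that sets the ID through `PRAGMA application_id` and reads it back from the raw header the way a `file(1)`-style tool would:

```python
import sqlite3
import struct

SQLITE_MAGIC = b"SQLite format 3\x00"   # first 16 bytes of an SQLite 3 database
APP_ID_OFFSET = 68                      # header offset of the 4-byte application ID


def set_application_id(path, app_id):
    """Stamp a database with an application ID (hypothetical helper)."""
    conn = sqlite3.connect(path)
    try:
        # PRAGMA arguments cannot be bound as parameters, so format the
        # integer explicitly; int() keeps arbitrary text out of the SQL.
        conn.execute("PRAGMA application_id = %d" % int(app_id))
        # Creating a table guarantees the database file is non-empty.
        conn.execute("CREATE TABLE IF NOT EXISTS t(x)")
        conn.commit()
    finally:
        conn.close()


def read_application_id(path):
    """Read the application ID straight from the file header, without SQLite."""
    with open(path, "rb") as f:
        header = f.read(100)
    if len(header) < 100 or not header.startswith(SQLITE_MAGIC):
        raise ValueError("not an SQLite 3 database")
    # Big-endian signed 32-bit integer, matching what the pragma reports.
    (app_id,) = struct.unpack_from(">i", header, APP_ID_OFFSET)
    return app_id
```

A magic rule for a given application would then amount to "SQLite header string at offset 0, plus this particular 32-bit value at offset 68."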

> Using separate journal files, etc doesn't seem the best way to me

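The persistent `-wal` and `-shm` files are only created when the DB is in WAL mode. Solution: don't use WAL mode. In the default rollback-journal mode, the `-journal` file exists only while a write transaction is in progress.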
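A quick way to see which side files a given journal mode leaves next to the database is to list the directory while a connection is still open. This is a sketch using Python's built-in `sqlite3` module; `side_files` is a hypothetical helper, and what you see can vary with SQLite version and locking mode.

```python
import os
import sqlite3
import tempfile


def side_files(journal_mode):
    """Return the extra files found beside the DB after committed writes.

    Hypothetical helper: creates a throwaway database in a temp directory,
    switches it to the requested journal mode, writes a row, and lists what
    else is in the directory while the connection is still open.
    Pass only trusted mode names; the value is pasted into the PRAGMA.
    """
    with tempfile.TemporaryDirectory() as d:
        db = os.path.join(d, "app.db")
        # isolation_level=None puts the connection in autocommit mode, so
        # each statement below is its own committed transaction.
        conn = sqlite3.connect(db, isolation_level=None)
        try:
            conn.execute("PRAGMA journal_mode = " + journal_mode)
            conn.execute("CREATE TABLE t(x)")
            conn.execute("INSERT INTO t VALUES (1)")
            return sorted(f for f in os.listdir(d) if f != "app.db")
        finally:
            conn.close()
```

On a typical build, `side_files("WAL")` should list `app.db-shm` and `app.db-wal`, while `side_files("DELETE")` should list nothing, because the rollback journal is removed at commit.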

> a few people prefer TRON

I hadn't even *heard* of [TRON encoding][TRON] before today.

Who are these "few people" that they have any hope of countering the three decades of inertia behind Unicode?

At minimum, I think it'd take a supremely popular new OS that used TRON by default to wag this dog, this late in the game.

> PostScript…doesn't have good FFI…not really a problem with SQLite itself

Yes, so why is it any objection here?

Never mind the specifics: you seem to be using these extremely niche cases to argue against SQLite as an application format. So fine, it doesn't work in those niches. How does that argue the broader point?

I once participated in an Internet flame thread where someone tried to argue that the adage "there's always someone worse off than you" was wrong by pointing out that *you* could be the worst-off person in the world. And yes, for one in 7.5 billion people, that's true, but to take that position is to argue a case with 7.5-billion-to-one odds against it.

Is the point here to be "technically right", even if that means being wrong in almost every practical case?

[CFBF]: https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-cfb/
[file]: http://www.darwinsys.com/file/
[TRON]: https://en.wikipedia.org/wiki/TRON_(encoding)