Can parts of a database be protected for distribution?
(1) By Gary (1codedebugger) on 2021-02-04 20:12:44 [link] [source]
The context of my question is that I'm attempting to take a tool I was building in a different format and convert it all to a SQLite format. It ends up being a C program compiled with the SQLite amalgamation file and communicates with a UI.
The tool allows users to build and run study modules; and the UI provides a build side and a run side. Ultimately, though, all the user really produces at the end is a sqlite file, which can be shared with other users and opened in the tool, either to be edited further or simply run.
The run side also writes to the database, because the user can record notes within the study modules and there are assessments, the historical results of which can be stored to log the user's progress.
My question is can it be set up such that a shared database cannot be edited through the build side, if the builder chooses to protect it, and still allow the user to write to the database in the run side? I know that the build functionality in the UI can be made not available based on values within the database but anyone can alter those values at the command line if they know where to look.
My objective is to protect the builders' work from abuse. It can take many many hours of hard work to build a set of quality modules and it wouldn't be right for another to start from that work and edit it to take an opposite position unless the builder chooses to distribute it in an open format.
If it is possible, is it something that I need to build in to the tool or is it something that can be done after the database has been built and before distribution to others?
By the way, I'm not talking about selling or licensing access to the database modules, but just protecting them from being manipulated on the build side after distribution.
Physical access is total access in a system where you can read the data.
Tell me, what possible system could prevent VACUUM INTO from the protected DB to a plain SQLite DB?
You can play games with public-key cryptosystems to sign the blobs so that unhacked client programs can check for legitimate signatures before showing the result, but there again you've got to ask how likely it is that a bad actor wouldn't just modify the client to read anything, or intercept the signature check, or replace the keys, or whatever.
This sounds like a good use of web technologies, allowing you to put the data on the other side of an authorization layer, which only lets the owner of the data modify or replace it.
The "approved" way to do this (in the United States and other Communist/Fascist Regimes) is to set a flag somewhere that "authorized" applications obey.
You can then use the DMCA to sue those who would change or disregard the flag as being "unlawful circumvention" and those who publish details of how to change the flag setting as providing "illegal circumvention technologies".
You should then also embark on a massive advertising campaign against "illegal circumvention tools".
This is the Communist and Fascist Approved method of control.
In other words, once you have let something out of your physical control, you no longer have physical control. Sorry to tell you that this is how the universe works, but it has always been thus.
(4) By Gunter Hick (gunter_hick) on 2021-02-05 07:23:06 in reply to 1 [link] [source]
Maybe the sqlite3_set_authorizer() interface would help to implement "cooperative protection" as oposed to "protection against hacking". It would require that "stuff to be protected" and "unprotected stuff" reside in distinct tables and/or fields and a callback that allows or denies the execution of specific actions.
(5) By Gary (1codedebugger) on 2021-02-05 21:41:44 in reply to 4 [link] [source]
Thank you. I appreciate it. I'll look into that option. I'm not trying to secure the database itself but just make parts of it not editable through my little tool, when the builder chooses to distribute it as such. If someone wants to attempt to build something that will run in the tool by externally modifying a database, that would be quite a challenge and easier to start anew from within the tool. That type of disincentive is about all I was expecting to achieve.
If it's unlikely that someone would hack on the DB directly, then I don't see that you need an authorizer at all. Your application can simply mark the DB as "not editable" and then refuse to let someone edit it.
If you install an authorizer atop that, what stops the end user from patching the authorizer call out? Are you depending on the dubious protection of a compiled binary-only program, or what?
Please precisely define your threat model.
(7) By Gary (1codedebugger) on 2021-02-10 07:45:22 in reply to 4 [link] [source]
I've been thinking about this a bit more and from a different perspective.
First, there's no money to be made by messing with my little tool or the work others build in it; so, I'm not expecting to attract people who can reverse engineer complex programs. I'm just trying to prevent the clever jackass from messing with the work of others and discouraging people from sharing content.
The database itself is no more useful outside of the tool than an Excel file is outside of Excel. So, it doesn't matter if someone can read the data externally, but just that they can't edit certain areas from within the tool or open it at all. But, of course, there has to be a value or values within each database indicating which areas can be edited. The question seemed to be how can they be protected, obfuscated, hidden; but maybe there's an additional step.
Before considering SQLite, the data that would be shared was in a flat file that contained a set of maps required to read different areas of that data to be parsed, some of which also contained internal maps for further parsing. Values that were related to functionality weren't simply 0 for false and 1 for true; but were of different lengths, such that, if they were to be modified external to the tool, the data and the maps wouldn't be consistent and parseable. That wasn't for security but was just the way it was set up from the beginning. The time it would take to find and alter the maps, which would also alter the data layout further requiring other maps to be altered, just made it not worth the effort for some joker to figure out.
Is there an equivalent in relational databases? For example, could a set of triggers be used such that if one of those values that determines when the build side is displayed in the UI is altered, a trigger alters something else in the database which instructs the tool to no longer function on that database period and/or delete important sections of data?
Perhaps, it could be set up such that if one knew which triggers were significant, they still couldn't be deleted because they would also be required for normal valid operation in some manner.
That may be a ridiculous example but I'm just trying to give an example of what I'm asking. Is it possible to sort of wire a database to self destruct if off-limit areas are tampered with? At least to make it a ball of yarn tangled up enough to discourage the jerks from messing with it? Each mistake would "ruin" the database to the extent that it couldn't be opened in the tool again without rebuilding it, requiring the user to obtain another copy to try it again.
(8) By RandomCoder on 2021-02-10 17:06:35 in reply to 7 [link] [source]
If someone can edit the database, what stops them from turning off the triggers, should they exist, making the edits, then turning them back on?
It strikes me that you could compile your own version of the SQLite library with the SQLITE_FILE_HEADER value changed.
Then only your application can open or modify the database, so you could simply have flags in your database to protect content from being edited in your application. Well, of course, it's still trivial for someone in the know to modify the database binary using a hex editor, then change the file using the tool of their choice, and revert the header so your application will read it.
The truth is much more nuanced than this. "The clever jackass" is often perfectly capable of opening a file, and changing all occurrences of "should" to "won't " and leaving something that isn't changed in size, but conveys different meanings. Sometimes they'll also go to surprising lengths to just break the protection for no real gain. You're at the bottom rung of a game of cat and mouse that others have played for various reasons, on there are no simple solutions that will handle all attack vectors.
(11) By Gary (1codedebugger) on 2021-02-19 02:34:10 in reply to 8 [link] [source]
Thank you for this information. It's quite helpful, for I hadn't even approached the problem from the correct perspective.
(9) By Scott Robison (casaderobison) on 2021-02-10 17:29:38 in reply to 7 [link] [source]
Depending on just how you want to structure it this is possible.
The first thought that comes to my mind (which doesn't mean it is a great idea) is to use asymmetric encryption of the data. Your program which writes the sensitive portions of the database includes the private key and encrypts the fields that should not be editable. The program which reads the sensitive portions of the database decrypts with the public key. Any non-sensitive data need not use any encryption.
Of course, this makes the data opaque to SQL as well, so if you want to be able to make queries based on the contents of the encrypted data, you are out of luck. Unless you build the decryption function into SQL instead of just doing that in your program logic.
The next idea, which would be less complicated, would be to add a column with a hashed value of the data, then check to ensure the data matches the corresponding hash. This will be easier for the user to circumvent but adds a tiny level of intricacy above and beyond simple SQL queries.
My last idea, for now, would be to write a custom VFS for SQLite that stores data in a non-standard way that makes it unusable with off the shelf SQLite compatible tools.
Have the checksum field be an encrypted blob that consists of a random salt used to initialize the hash of all the other columns of the row. That does mean the queries have to return the entire row and have discipline to avoid unexpected type conversions.