Serious problems with STORED columns
(1) By anonymous on 2021-03-15 21:30:32 [source]
Create an empty database and run this script a few times a few seconds apart.
CREATE TABLE IF NOT EXISTS xxx.log(
dt text as (datetime()) stored, --datetime of event
msg text collate nocase not null, --message
unique (dt,msg)
);
insert into log(msg) values('a');
insert into log(msg) values('b');
insert into log(msg) values('c');
Now, .dump
the database (or just the one table log
) to a file:
sqlite3 sample.db ".d log" > sample.sql
Problems:
The
dt
column is not part of the dump. So, all that very important information is lost in the backup. You can no longer restore the database to its original content.Because of the use of the
datetime()
function in the table definition, attempting to create the database by feeding the dump, gives a sequence ofUNIQUE constraint failed: log.dt, log.msg
errors as there are multiple identical record insertions attempted.
My suggested solution is the introduction of some pragma
that will allow loading stored
columns directly as if they were regular columns. This can be part of the dump to allow correct restoration of the table. The dump should include stored columns too (except for non-stored generated ones).
(2) By Warren Young (wyoung) on 2021-03-15 21:35:36 in reply to 1 [link] [source]
I think you want a trigger here, not stored columns. On insert, set dt, etc.
(3.1) By Keith Medcalf (kmedcalf) on 2021-03-15 22:56:43 edited from 3.0 in reply to 1 [link] [source]
This is how computed columns work. You do not want a computed column, you want a default:
create table log
(
dt text not null on conflict replace default current_timestamp check (dt IS datetime(dt, '+0 days')),
msg text collate nocase not null,
unique (dt, msg)
);
create trigger log_noupdate before update on log
begin
select raise(ABORT, 'Cannot update log');
end;
insert into log(msg) values('a');
insert into log(msg) values('b');
insert into log(msg) values('c');
Edited to fix syntax error and trigger.
(4) By Keith Medcalf (kmedcalf) on 2021-03-15 22:50:07 in reply to 3.0 [link] [source]
Of course, if it really is a log and that is not just a euphemism, you probably want to prohibit all update operations (and mayhaps delete as well).
(5) By Keith Medcalf (kmedcalf) on 2021-03-15 22:59:45 in reply to 3.1 [link] [source]
Of course, this will only work to keep the log entries in order if you only have a maximum of one entry per second. You should change the dt field to real and the default to (julianday()) and get rid of the check constraint if you want to maintain order for up to one entry per millisecond (or one per tick).
(6) By anonymous on 2021-03-16 00:01:07 in reply to 3.1 [link] [source]
This is how computed columns work. You do not want a computed column, you want a default:
Regardless of my example, and how one may avoid this issue by not using a stored column, the problem is real when using one.
Unlike a VIRTUAL column that is computed on the fly, a generated STORED column is actual stored data and as such it must survive dumps, i.e., it must be possible to restore to the exact same content it had before a DUMP.
AFAIC, a solution is needed for that. The current behavior in my eyes is buggy!
(7) By Kees Nuyt (knu) on 2021-03-16 01:42:28 in reply to 6 [link] [source]
Unlike a VIRTUAL column that is computed on the fly, a generated STORED column is actual stored data and as such it must survive dumps, i.e., it must be possible to restore to the exact same content it had before a DUMP.
That would be hard, as a dump file just contains DDL and INSERT statements. And you can't INSERT :
insert into log(dt,msg) VALUES ('somestamp','c');
Error: near line ...: cannot INSERT into generated column "dt"
AFAIC, a solution is needed for that. The current behavior in my eyes is buggy!
A solution is available in the backup
interface and the VACUUM INTO
statement.
(8) By Keith Medcalf (kmedcalf) on 2021-03-16 02:47:55 in reply to 6 [link] [source]
Unlike a VIRTUAL column that is computed on the fly, a generated STORED column is actual stored data and as such it must survive dumps
Herein is the root of your bad expectation. The difference between a VIRTUAL computed column and a STORED computed column is when the value is computed. In the case of a VIRTUAL computed column, the computation is performed when the row is retrieved, each time the row is retrieved. For a STORED virtual column, the computation is made when the row is STORED or UPDATED, and merely retrieved when the row is retrieved.
A column defined as x as (current_timestamp) VIRTUAL
will be the timestamp at the time the row was retrieved (SELECT) whereas a column defined x as (current_timestamp) STORED
will be the timestamp at the time the row was stored (INSERT/UPDATE).
Since the row is first created when the INSERT statement is executed, the value cannot possibly be preserved via the "dump as sql"/"run the SQL" interface.
(10) By anonymous on 2021-03-16 09:01:46 in reply to 8 [link] [source]
I hear all your answers but I still don't have an answer on how to restore that table with the values it had so that it behaves as expected.
A VIRTUAL column is "restored" by having the calculations done again and getting the same result.
But a STORED column can never be restored!!!
It's illogical the before and after (dump reloaded) state of the table to not be the same.
(11) By Ryan Smith (cuz) on 2021-03-16 10:45:21 in reply to 10 [link] [source]
But a STORED column can never be restored!!!
Yes, and a letter knife can never fly a plane. So?
You need a screw-driver, but you insist on using a hammer and then complain of the fact that the hammer cannot undo a screw for you.
If you want persistent data, please use a persistent column. ANY form of Calculated column, with "STORED" privileges or otherwise, isn't persistent storage and never will be.
Why the insistence on using the wrong tool for the job? Change that column to a persistent storage column and all your troubles will go away.
We have a saying in my country: "If a monkey wore a golden ring, it still remains an ugly thing."
Just because a hammer can also make a screw go into a block of wood, doesn't make it a screwdriver - as will become painfully clear when you try to undo that screw.
Similarly, just because your calculated column also stores it's data (given some givens) like the big-boy real columns, doesn't make it a big-boy real column.
Your need is for a real column, not a calculated column. And if you think the "calculated" part of the column brought you some advantage that you do not have with a real column, please state that clearly, and someone here will show you how to make that happen with your real column.
(12) By Ryan Smith (cuz) on 2021-03-16 11:09:46 in reply to 10 [link] [source]
It's illogical the before and after (dump reloaded) state of the table to not be the same.
That's false. A dump isn't in any way under contract to preserve the "state" of the DB, only the Schema and persistent Data.
Put another way - if you had a view in your DB like this:
CREATE VIEW AS SELECT datetime('now') AS HereAndNow;
and you dump that DB and then restore it, would you expect the view to still show the old date and time from before you dumped it? Would you complain that the DB isn't exactly the same after restoring it?
No, of course not, it's silly. Well a calculated column is in principle a VIEW based on other columns/expressions. It will change whenever it is referenced, or in the case of it having the "STORED" property in SQLite specifically, whenever the row is written - which happens to happen at the moment you INSERT or UPDATE (or restore) that row.
You know what doesn't change? A normal column. Maybe try one of those? :)
(13) By anonymous on 2021-03-16 11:36:17 in reply to 12 [link] [source]
It's illogical the before and after (dump reloaded) state of the table to not be the same.
That's false. A dump isn't in any way under contract to preserve the "state" of the DB, only the Schema and persistent Data.
OK, let's play your way. Should persistent data be restored? According to your quoted sentence above it should.
In the my table I have one STORED column along with normal columns. Due to this problem, a table that had hundreds of rows now only has two. The rows with the persistent data were not loaded due to the collision of keys which didn't happen in the original but now all times are the same and rejected as duplicates.
You have to be able to at least load the rest of the row columns even if the STORED (which proves to be a misnomer) column contains garbage.
(15) By Ryan Smith (cuz) on 2021-03-16 14:58:45 in reply to 13 [link] [source]
OK, let's play your way.
Thanks!
You have to be able to at least load the rest of the row columns even if the STORED (which proves to be a misnomer) column contains garbage.
Absolutely, unless said garbage (read: ephemeral) value containing rows are used stupidly in a Primary or Unique Key.
Look, what you describe would be clear to anyone who knows how it works to be an enormously idiotically stupid thing. The grace here, and why it is not so stupid in your case, is that up to now you have genuinely believed that a STORED column is equivalent to a persisted column - a perfectly understandable expectation which would make your assumptions perfectly valid, if it were the case.
But now you have been pointed towards the light of truth, the correct response is "Thank you all, I have learned and now will use a real column."
Any more insistence to manufacture the most silly compilations of schemata as arguments to hang on to the idea that you had some glimmer of a point to begin with, will achieve the opposite.
I've just had to change my statement in another post where i had a wrong assumption this very day. Further to that, I've had my thinking revised on this very forum on both SQL and SQLite many times - it's called learning, that's why we like it here. You've learned - enjoy it. :)
(14) By Keith Medcalf (kmedcalf) on 2021-03-16 14:40:32 in reply to 10 [link] [source]
That is easy.
create table log
(
jdt real not null on conflict replace default (julianday()),
dt text as (datetime(jdt)) STORED,
msg text not null collate nocase,
unique (dt, msg)
);
Now when you insert your msg and use dump and restore the value of the dt column is preserved because its value is computed based on a persistent value inherent in the record and not an ephemeral value.
Why you would do this and not just make the dt column a default is beyond me, but you are the one with the problem that needs solving.
(9.1) By Ryan Smith (cuz) on 2021-03-16 03:07:09 edited from 9.0 in reply to 6 [link] [source]
Regardless of my example, and how one may avoid this issue by not using a stored column, the problem is real when using one.
STORED doesn't mean what you think.
You are confusing a calculated column with "stored" privileges with a physical data column.
(EDIT: Before I receive a slew of corrections - These remarks were aimed at SQL calculated column theory in general, not specifically SQLite's mechanism for handling it - which works slightly different than stated here, though the error of using it for persistent storage remains.)
A "calculated" column is, and forever will be, only existing in calculation space. It isn't real, and at its core is no reliable/firm/tangible data - Only a calculation. Moreover, it must at all times be reconstructable - and if one of the values it is reconstructed from should change, it's value (stored or otherwise) will be directly dismissed like a dirty cache and reconstructed, as if it never existed.
Now in order to save the processor from having to redo the math all the time, we sometimes allow the calculated value to be cached by giving it "stored" privileges - but think of this as a cache or a temp store, it isn't real, it isn't persisted (not officially anyway), and it most certainly isn't part of the official DB data. It has no use being output to a data dump, because it will be reconstructed upon insert anyway.
If you want a column for which its value is "calculated-then-stored-persistently" it had better be a persistent (i.e: "normal") column, and then have a calculated value upon insert, or calculated and updated by a trigger, or perhaps a default value calculation. What it can't ever be, is a "calculated" column (of any kind).
A calculated-stored column is a bit like your best friend's wife - you know her well, you can flirt with her, and if your friend is out of town he may even ask you to take her out on a date to avoid boredom. Now even though she watches the movie with you and you pay for her coffee, She exists as your date only in the moment. Your "date" privileges end at the doorstep, and in your house remains only your own wife as your only tangible companion.
(Unless you are looking to lose both your friend and your wife - but that's a different DB mechanism) :)
(16) By David Raymond (dvdraymond) on 2021-03-16 15:40:21 in reply to 1 [link] [source]
Quick question.
According to the Generated Columns Page section 2.3.3
The expression of a generated column may only reference constant literals and columns within the same row, and may only use scalar deterministic functions. The expression may not use subqueries, aggregate functions, window functions, or table-valued functions.
So my question is: Is datetime() considered deterministic? Or should there have been an error when creating that table which said "deterministic functions only please"?
(17) By anonymous on 2021-03-16 16:07:04 in reply to 16 [link] [source]
Hi David. I guess that's the special-case for date-time functions (Paragraph#3)
(18) By Richard Hipp (drh) on 2021-03-16 16:13:09 in reply to 17 [link] [source]
Right. And I did not remember that "datetime()
" with no arguments is the
same as "datetime('now')
" and so I failed to raise the exception when it
is used that way. This is bug. Unfortunately, fixing it might break some
preexisting schemas. Dunno exactly what to do about that....
(19) By Keith Medcalf (kmedcalf) on 2021-03-16 16:36:20 in reply to 18 [link] [source]
Which of course would also mean that current_date
, current_time
, current_timestamp
and julianday()
(with no arguments) would become impure as well.
(20) By Richard Hipp (drh) on 2021-03-16 16:38:10 in reply to 19 [link] [source]
Yeah. It gets complicated, doesn't it.
(21) By anonymous on 2021-03-16 17:11:45 in reply to 18 [link] [source]
If it's considered a bug, and given that using it in a stored column will certainly cause 'chaos' down the road on some unsuspecting 'idiot', like myself, who years later may try to recover a database from an old dump only to find out AT THAT TIME the table definition should have been different.
On the other hand, because this stored column is physically stored in the database (regardless of the low level details, i.e., how it may differ from a normal column), IMO a safer and usable approach would be to use some new pragma
to make stored columns temporarily appear as normal columns so they can be loaded as-is without computation.
And if you're quick to dismiss this idea because you think datetime() is the only possible 'failure' scenario, consider time-consuming calculations.
Isn't the whole point of having stored (instead of virtual) columns to save repeated (and possibly very expensive) calculations?
Consider a table for each row of which you have to spend N seconds to calculate the stored value. Now, after years of using the database, and thousands of accumulated rows, you need to reload the dump. All those calculations will happen at the same time, making the restoration a horror task.
Now, if you tell me you shouldn't have used a stored column for that case but you should have used a normal ('real') column instead, I will tell you: YES but why on Earth do we then ever need stored columns at all?
Everything that is not significantly time consuming to recalculate can be done with virtual columns that don't need to store anything.
(22) By Ryan Smith (cuz) on 2021-03-16 17:41:58 in reply to 21 [link] [source]
The "Bug" is not that it didn't work as you hoped, that is still the case an always will be.
The "bug" is simply that it didn't error out when you made the nonsense schema. An error would certainly have helped put you on the right road long ago and saved you from the heartache.
Isn't the whole point of having stored (instead of virtual) columns to save repeated (and possibly very expensive) calculations?
Yes, though not the only point, but that case aims to solve querying loops and repetitive reads. Restoring the DB is done maybe once in a blue moon. It doesn't count as such a case. And I for one want my DB engine's Calculated columns to do what they promise to do: "Calculate values", not "Remember values", else I would use a "Remembery" column.
(24) By doug (doug9forester) on 2021-03-17 17:34:23 in reply to 22 [link] [source]
I am now seriously confused. I assumed that "stored" meant persistent, but you've destroyed that assumption. What is the meaning of "stored" and when (and when not) should I use it?
(25) By Kees Nuyt (knu) on 2021-03-17 17:52:37 in reply to 24 [link] [source]
stored
is persistent, the values just don't survive
sqlite3 olddb .dump | sqlite3 newdb
But they will survive .backup
/ .restore
and VACUUM INTO
.
--
Regards,
Kees
(26) By David Raymond (dvdraymond) on 2021-03-17 18:25:45 in reply to 24 [link] [source]
The idea of a generated column is that you have something which can be determined solely from the other columns in that record. So the value doesn't really count as useful data, because it's completely dependent on other fields.
Example: Say for example you have a country table, and one of the fields, "geometry" is the shape of the country. So you're storing the entire outline of the country for display.
Now let's say in the same table you have a "surface_area" field and a "border_length" field. Well, each of those fields can be completely determined by the geometry field, and a bunch of calculations. So you can define them as calculated fields. If geometry gets changed, the calculated fields will change with it, and you don't need separate update statements, and can't accidentally update one and not the other.
You might used a stored field for something like the surface area. You don't want to go through the entire geometry of Russia and calculate the surface area every time you select that record, because it's an expensive operation. So marking it as stored means that the calculated value will get stored in the table/file. So if the geometry field isn't changing then you don't have to do all those expensive calculations again because it was stored for you to look at. But if the geometry field does get changed, then SQLite will again automatically do the calculations and store the new result for quick reference until geometry gets updated again.
The dump command in the CLI doesn't output those stored calculated fields because they're supposed to be able to be calculated from the other fields in the record. So it's "avoiding wasted storage space" if you will. It means it might be slower on a restore, because all those calculations have to be done again, but it's not "lost data" because it's supposed to be completely deterministic from the other fields.
(27) By doug (doug9forester) on 2021-03-17 22:16:51 in reply to 26 [link] [source]
Is it logical to assume that any STORED value will not change if NO OTHER FIELDS it was calculated from changes? For example, can I change the algorithm for calculating a STORED field after inserting records, perhaps, by using an ALTER COLUMN command? I'm thinking it better be impossible to do that or the database can become inconsistent.
Also, the STORED value is written to the database, right? After all, the point of STORED is to not have to calculate it again, so it must be written out. And if it's written out, then it - by definition - is persistent - from the database's point of view. You can mess with Sqlite's definition of persistent all you like, but the data persists and shows up when fetched as written to the database. That is, when I insert a record with a STORED field, and read it back a year later, it's unchanged and not recalculated, right?
So why would I ever use a STORED field instead of a real (persistent) field if the physicals are identical?
(29) By Ryan Smith (cuz) on 2021-03-17 22:58:00 in reply to 27 [link] [source]
So why would I ever use a STORED field instead of a real (persistent) field if the physicals are identical?
To save yourself the trouble of having to manually recalculate that column and update it back into the database every time one of its constituents change.
To be clear: That's your ONLY advantage with a STORED column. It's still a pretty good advantage though, right?
With a non-stored column, you get the added benefit of saving the data space, but you pay for that with increased CPU cycles when querying.
There is no magic here. No red pill. It's all cost/value benefit ratios and as an engineer/programmer you need to pick the ratio that makes most sense for your application.
(30) By Larry Brasfield (larrybr) on 2021-03-17 23:02:49 in reply to 27 [link] [source]
Is it logical to assume that any STORED value will not change if NO OTHER FIELDS it was calculated from changes?
Yes. Not just logical to assume but assured by the documented behavior.
For example, can I change the algorithm for calculating a STORED field after inserting records, perhaps, by using an ALTER COLUMN command?
That is not presently supported. The only column alteration supported is renaming, not adding/dropping/changing column constraints, triggers, or computation.
I'm thinking it better be impossible to do that or the database can become inconsistent.
I'm sure you are not alone in such thinking, although I would put it differently: Doing that would be an implementation complication because in no case will database inconsistency be a tolerated side-effect.
Also, the STORED value is written to the database, right?
Yes, of course. (There is no better place for it to be STORED.)
So why would I ever use a STORED field instead of a real (persistent) field if the physicals are identical?
There would be several good reasons. Perhaps the most important is the clear expression of intent, right in the SQL DDL. The other has to do with enforcing the intended consistency. The DBMS will take care of getting that right across inserts and updates. There is more to the feature's value than just what must be stored. For example, you might change between STORED and VIRTUAL generated columns based on performance measurement, with no other changes on your part (but big changes in what the DBMS must do.)
(31) By Ryan Smith (cuz) on 2021-03-17 23:37:45 in reply to 27 [link] [source]
Also, the STORED value is written to the database, right? After all, the point of STORED is to not have to calculate it again, so it must be written out. And if it's written out, then it - by definition - is persistent - from the database's point of view.
I felt uneasy about that statement, so thought I would make a separate post tackling it.
This is well known logic fallacy, or more commonly known as a semantic argument or semantic error. A common problem of language not being mathematics.
Saying that if it is stored, it is "by definition" persistent, may be correct in some sense, but not others.
It's like saying that someone who is driving a car is "by definition" a driver. What about if he takes a Taxi tomorrow? May we no longer call him a driver? What about if he is 12 years old and doesn't possess a driving permit, is he still by definition "a driver"? Maybe he is in the technical sense, but the court judge certainly won't hear of it, in the legal sense he is NOT a driver.
So if you are talking "does the bytes go into the persistent storage?" then YES a STORED column is persistent in that sense, but in the sense that it is NOT the data you put there, and never will be, and the data it has is fleeting and will be updated on a whim by some other column that happens to be in its calculation expression, then NO, it is most certainly NOT persistent in that sense, it is derived.
I hope that helps to clear up some of the other texts on the matter.
(32) By Keith Medcalf (kmedcalf) on 2021-03-18 00:46:33 in reply to 31 [link] [source]
The x as (expression) stored
is syntactic sugar. One can replace the following:
create table y
(
x real not null,
y as (BloodyLongCalcualtion(x)) STORED
);
with
create table y
(
x real not null,
y
);
create trigger y_y_disallow_insert before insert on y
begin
select raise(ABORT, 'Cannot insert into computed column y')
where y.y is not null;
end;
create trigger y_y_recompute_insert after insert on y
begin
update y
set y = BloodyLongCalculation(new.x)
where rowid == new.rowid;
end;
create trigger y_y_disallow_update before update of y on y
begin
select raise(ABORT, 'Cannot update computed column y')
when old.x IS new.x
and old.y IS NOT new.y;
end;
create trigger y_y_update after update of x on y
begin
update y
set y = BloodyLongCalculation(new.x)
where old.x IS NOT new.x
and rowid == new.rowid;
end;
As you can see, using a computed column is a crapload less typing to achieve the same ends. About the only difference is that the computed value must be deterministic whereas the trigger does not have to be.
(33) By doug (doug9forester) on 2021-03-18 07:13:08 in reply to 31 [link] [source]
I understand everything about STORED fields now - how to define them, how they are stored and retrieved to and from the database, how they are updated, and when to use them. Thanks everyone for that.
But I take issue with the words you use to explain them, specifically
So if you are talking "does the bytes go into the persistent storage?" then YES a STORED column is persistent in that sense, but in the sense that it is NOT the data you put there, and never will be, and the data it has is fleeting and will be updated on a whim by some other column that happens to be in its calculation expression, then NO, it is most certainly NOT persistent in that sense, it is derived.
The data saved in all database fields is most certainly the data intended to be there, even if you didn't give specific values. For example, when you specify time field with a value of "now()", you don't know what it is but your intention is clear. For STORED field, the intention is specified in the DDL. Even though you don't specify the value, the intention is clear. And it most certainly will not be "fleeting" nor updated on a "whim". It will be maintained and updated by the rules specified in the DDL.
And you might want to think about "derived" differently, as well. Every field in the database (except perhaps referential id's) is "derived" from something. That derivation is done inside the application, and result is saved in database fields. The value for a STORED field, on the other hand, is calculated and maintained rigorously by Sqlite using the DDL rules that the application specifies. You can label that "derived" if you want (or calculated or computed or ...), but the value stored in that field is exactly and precisely what the application intended.
(34) By Richard Damon (RichardDamon) on 2021-03-18 11:06:04 in reply to 33 [link] [source]
You specify what value they are to have by the expression that generates them. The intent is that it will ALWAYS have that value. STORED means err on the side of saving the computed value, and use that saved value if you can. Omitting the STORED says always recompute.
The error was that you used a function for the value that keeps changing, but SQLite didn't realize that. Thus it would give you back wrong values because it didn't realize it needed to recompute the value.
Storing a cached value so that you don't need to recompute every time is different than a 'normal' persistent value which absolutely won't change until you explicitly write a new value to it.
(35) By Ryan Smith (cuz) on 2021-03-18 11:48:08 in reply to 33 [link] [source]
Again, that entire explanation is not wrong or disagreeable, but it doesn't speak to the definition of persistent that is important here.
A column can be either "Calculated" or "Persistent" - that's the intended use, but since this causes headaches, I've devised a new plan.
What I'll do here to remove the ambiguity from the argument, is retract the use of the words "persistent" and "fleeting" in all the above posts and replacing it with the word "Scoriginy" and "schponkish". These words do not exist yet in any language, I checked, so nobody can argue with me about their meaning.
Here is the excerpts form the great Dictionary of Ryan - aka The "Ryanomicon": (I say "the" excerpts rather than "some" excerpts because so far they are the only two)
Definition:
Schponkish - Adj.
def 1. - To exist at the mercy of its peers. To their whims, even
though said existence may seem very real and persistent,
it can be altered in an instance by peers, through no fault
and/or doing of the object itself or its creator or other
authority - though those may change it too.
Scoriginous - Adj.
def 1. - To exist firmly and by design, how it was made to be. Can
only be altered or removed by its creator/originator or
proper authority. Its peers holds no sway over its
existence.
So now I can say with confidence: A column can either be
- a normal Scoriginous data column, or
- a calculated Schponkish column.
While the normal scoriginous columns are ALWAYS persistent, the Calculated columns may be persisted through special addition of the "STORED" directive, but they will still always be very schponkish and change/update the moment any peer within their expressions change, regardless of their "STORED" status.
I'm hoping you will find this a more agreeable statement than the previous. :)
(28) By Ryan Smith (cuz) on 2021-03-17 22:45:33 in reply to 24 [link] [source]
I think David and Kees probably solved the conundrum for you, perhaps if I can offer some added notes in case any more readers wonder the same:
A normal field/column is "persistent" storage because you will forever be able to retrieve whatever you yourself put in there. It persists.
A calculated field is never "persistent" because its value is dependent on the expression/calculation/formula, mostly based on "other" columns and will change any time any of those change, thus you can never get what you put in there, only what it calculated from whatever else. Its value doesn't persist, it is derived.
Now usually this calculation happens every single time the column in that row forms part of a query. So to save the incessant recalculation, there is a property/ability/permission/hint we can add in SQLite specifically that we can tell the column to, in stead of calculating EVERY time, rather calculate as soon as any data in the row is written/updated/inserted, and then remember that answer in your own cache until the next data change, so that when queried, you just give the cached/remembered answer (BUT, you certainly CAN recalculate at any point/time if you feel like it and the cached answer MUST be the same as what any such recalculation at any random time would produce, i.e: be deterministic - this is the only premise upon which such a cached system would be, and could be, allowed[1]).
The name for this caching mechanism in an SQLite calculated column schema is, what seemed like a perfectly good name: "STORED".
So you see the calculation never becomes "persistent" storage, just cached, and like any cache, it is fleeting and unimportant - the very moment the database engine's nose gets any whiff of a scent of change/update/rewrite to its row, the calculated value will be recalculated. This is why it doesn't get exported in a DUMP, because there is no point, upon re-import it MUST by definition be recalculated, yet it does get included in a backup because for the backup the data is not technically manipulated (no insert/update), it's just copied.
I wish the SQL word used was "CACHED" in stead of "STORED", though it would probably be equally susceptible to confusion. That's why reading the documentation is such an important part of engineering.
[1]: The bug discovered in SQLite during this thread is precisely because the query engine neglected to enforce this rule - something the OP transgressed (since his calculation was not deterministic and so did not yield the same result as the cached value every time) and in stead of getting an error, was allowed to do it, which ended in heartache when he dumped data and tried to re-import it, the calculated column now no longer evaluating to the same cached value as before.
(The OP also made the mistake of using a dump as a backup, if he had used the backup mechanism, he would have been fine - but nowhere is this practice discouraged, it's perfectly acceptable and it would have worked just fine if the column was set up deterministically, so nobody can hold that against him.)
(23) By Richard Damon (RichardDamon) on 2021-03-17 00:53:21 in reply to 18 [link] [source]
Is there a requirement that the default be a deterministic value?
Yes, where it was used in a computed column, it seems that it would be wrong by the rules, (and this case sort of shows why), but it should be allowed in the default clause.
Backward compatibility is always an issue when thinking of fixing this sort of bug.
(36) By anonymous on 2021-03-18 13:30:06 in reply to 23 [link] [source]
Backward compatibility does not have to be an issue. Think about foreign keys: potential problems are avoided by controlling enforcement with "pragma foreign_keys". In this case something like "pragma deterministic_store" could avoid breaking existing code when fixing this.