CHECK violation during table definition ignored, normal or bug?

(1) By anonymous on 2020-08-25 18:45:20 [link] [source]

create table t(
  id integer primary key,
  s text not null default 'abc' check(length(s) < 3) --no error (bug?)
  );

insert into t(id) values(1); --expected error as default value violates CHECK

(2) By Larry Brasfield (LarryBrasfield) on 2020-08-25 18:58:49 in reply to 1 [link] [source]

The current release version (3.33.0) produces this error with your SQL:Error: CHECK constraint failed: t. What version are you using?

(3) By Keith Medcalf (kmedcalf) on 2020-08-25 19:00:12 in reply to 1 [link] [source]

This was fixed sometime between whatever version of SQLite3 you are using and the current tip of trunk.

SQLite version 3.34.0 2020-08-24 16:05:49
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite>
sqlite>
sqlite> create table t(
   ...>   id integer primary key,
   ...>   s text not null default 'abc' check(length(s) < 3) --no error (bug?)
   ...>   );
sqlite> insert into t(id) values(1); --expected error as default value violates CHECK
Error: CHECK constraint failed: t
sqlite>

So what old version of SQLite3 are you using?

Note that if what you claim were true, then data passing an INSERT would violate a subsequent integrity_check. CHECK constraints are invarient so I suspect this was fixed at the same time as the "application of affinity before evaluating check constraints" was fixed several releases ago a this caused the same problem -- that the CHECK constraints were not properly invariant.

(4) By anonymous on 2020-08-25 19:02:56 in reply to 2 [link] [source]

I'm using v3.33.0

To make myself clearer:

It's expected for the error to appear on the INSERT line.

What I'm asking is if it's normal to NOT get an error during the table definition when the default value 'abc' already violates the CHECK.

(5.1) By Richard Hipp (drh) on 2020-08-26 11:01:28 edited from 5.0 in reply to 4 [link] [source]

SQLite does not evaluate check constraints until you actually do an INSERT or UPDATE.

Follow-up: I tried this same thing on MySQL 5.6, PostgreSQL 9.3, Oracle 11g, and SQL Server 2017. They all work the same as SQLite in this regard - they all defer the error detection until an INSERT or UPDATE occurs.

(6) By Larry Brasfield (LarryBrasfield) on 2020-08-25 19:23:49 in reply to 4 [link] [source]

I suppose that, in some sort of ideal, imaginary world, a tool such as SQLite would be light-weight AND contain enough logic to recognize that a given default value will never satisfy the given check constraint. Ours is not such a world. For all the omnipotence behind such a feature could know in advance, somebody might wish to use an unacceptable default value to detect when an insert statement did not provide anything for that column.

I cannot speak to normality in general, but I can say that attempting to preclude creation of ostensibly silly but syntactically correct table definitions is inconsistent with the spirit of the SQLite design. The general assumption is that the user's intent is to be honored if its expression is a legal instance of the language and can be mapped to executable code.

(7) By David Raymond (dvdraymond) on 2020-08-25 19:46:02 in reply to 4 [link] [source]

To a certain extent this isn't much different from just having a column defined as "s text not null" right? The default value for that column is NULL, but that violates the constraint.

How about a case where you want NULL to be a valid value, but you don't want there to be a default. That is, you want someone to be required to explicitly enter NULL rather than having it be NULL by accident. You can't just say "not null" or something like that. But setting the default to something that violates a constraint (as in your example) would do that.

(I don't have a real world use case in mind right at the moment, I'm just "thinking out loud" as it were)

(8) By anonymous on 2020-08-25 20:46:20 in reply to 5.0 [link] [source]

That's all I needed to know. Thanks.

To answer some of the other comments:

This was a simplified example to show the issue.

In the actual case the table was much more complex with a lot of different CHECK constraints, and a default value for one them had a typo making it invalid.

It took a while to figure out which CHECK was giving the error when you don't expect a presumably accepted default value during table creation to be the culprit.

Perhaps, if the error message was a bit more informative by giving you which column is in CHECK violation, things would be easier.

Thank you all.

(9.1) By Keith Medcalf (kmedcalf) on 2020-08-25 21:13:52 edited from 9.0 in reply to 8 [link] [source]

It does, if and only if you ask it too.

create table t
(
   a integer, 
   b text default 'abc' constraint b_too_long check (length(b)<3)
);
insert into t (a) values (1);

yields:

sqlite> create table t
   ...> (
   ...>    a integer,
   ...>    b text default 'abc' constraint b_too_long check (length(b)<3)
   ...> );
sqlite> insert into t (a) values (1);
Error: CHECK constraint failed: b_too_long
sqlite>

Note that it is a bad tradesman who blames their tools when they use those tools improperly.

(10) By anonymous on 2020-08-26 16:35:11 in reply to 6 [link] [source]

I suppose that, in some sort of ideal, imaginary world, a tool such as SQLite would be light-weight AND contain enough logic to recognize that a given default value will never satisfy the given check constraint.

It could, but it is not reasonable to do so for CREATE TABLE (since you might want to require the data to be specified explicitly when adding it); it would, however, make sense to do this check for ALTER TABLE ADD COLUMN (if the table is not empty); the documentation says "Future versions of SQLite might change to validate CHECK constraints as they are added." (Implementing this seem like it would not be too complicated, as long as the CHECK constraint refers only to the column being added and not other columns; it would just evaluate the given expression, and see if the result is true.)

(11) By Larry Brasfield (LarryBrasfield) on 2020-08-26 18:43:02 in reply to 10 [link] [source]

(Implementing this seem like it would not be too complicated, as long as the CHECK constraint refers only to the column being added and not other columns; ...

A CHECK constraint acts just like a table constraint, and can refer to other columns in the same row. [a] Hence, it can be arbitrarily complex, refer to any column in the same table, and require function calls [b], making that limitation on what could be checked problematic, and making a sanity check upon the DDL useless or counter-productive.

[a. This "same row" limitation follows from the exclusion of subqueries in the check expression. ]

[b. These functions would not be limited to pure functions; their results could depend on context or history, either of which could differ at INSERT or UPDATE time from ALTER or CREATE TABLE time. ]

I should have written, "In a fantastical, unrealizable world, ..."

(12) By anonymous on 2020-08-27 11:51:38 in reply to 9.1 [link] [source]

It does not give the name of the column, but the name you assign to the constraint. So, it's not quite the same.

Giving the name to similar constraints is useful. You can (and in many cases you'd want to) give the exact same name, such as 'not_yyyy_mm_dd' so that you know the type of violation. Doing the same and pre/ap-pending the name of column just to know which column gave the error is rather inadequate, and prone to misinformation, if you happen to rename the column but forget to also update the constraint name.

I still think the error would be much more useful if it returned the column name affected. For me, returning just the table name is rather redundant as an INSERT or UPDATE can only work on a single table, so there is no ambiguity. Maybe the error could be like: t(col) to show both affected table and column.

Note that it is a bad tradesman who blames their tools when they use those tools improperly.

As for this comment, it's fortunate there are enough 'bad tradesmen' around or your presence here would be a bit redundant with no-one to help!

(13) By TripeHound on 2020-08-27 14:19:02 in reply to 12 [link] [source]

returning just the table name is rather redundant as an INSERT or UPDATE can only work on a single table, so there is no ambiguity

Unless triggers are involved, in which case the table with a failing CHECK clause may not be the one mentioned in the INSERT or UPDATE command:

sqlite> create table x (a check ( a < 10 ) ) ;
sqlite> create table y (b check ( b < 20 ) ) ;
sqlite> create trigger yx after insert on y begin insert into x values ( new.b ) ; end ;
sqlite> insert into y values ( 24 ) ;
Error: CHECK constraint failed: y
sqlite> insert into y values ( 14 ) ;
Error: CHECK constraint failed: x

(14) By Larry Brasfield (LarryBrasfield) on 2020-08-27 14:39:09 in reply to 12 [link] [source]

The check expression is not limited to having only one column as its term(s). What you are suggesting is that a check constraint violation should additionally produce and emit data from which "what was wrong" can be inferred. Yes, that would be doable (letting "Lite" considerations slide) for simple expressions involving just one unknown or variable term. But it would cover only a subset of the cases and the runtime cost would be out of balance with the benefits.

Keith's answer has, beyond the virtue of being right for SQLite as it is, merit as arguably the best way to permit (but not require [a]) enough of a clue to be emitted that debugging can be begin with a very limited scope, without imposing a inescapable runtime cost on everybody who uses table/column constraints.

[a. The "Lite" considerations work against forcing the additional byte code to be generated that are necessary for the clue to be emitted when the violation occurs. ]

Your comment regarding how a constraint name might fail to track changes to its logic is true in a limited sense. I would say that the name may indeed be inadequate (or entirely missing), at inception or later. So might the constraint logic or any query or DDL/DML that humans have devised. Whose problem is that? [b] The answer leads to the party in the best position to solve it. That would be the person(s) responsible for the code. The tool (SQLite) is never going to be able to eliminate the need for diligence and skill in writing the SQL it sees.

[b. As elaborated in a great book for those who face, frame and often solve problems, "Are Your Lights On?", this question is crucial. ]

(15) By anonymous on 2020-08-27 14:57:30 in reply to 14 [link] [source]

The check expression is not limited to having only one column as its term(s).

Either we're on different pages, or I misunderstand the syntax diagram.

From the syntax diagram the constraint no matter how it is 'fired' (triggers or not), and what other columns it may include in its expression, it is associated with a single column during table definition.

That column name I'm talking about, it's always the same for a given constraint.

At any rate, if that's something that can be done without unreasonably affecting performance, it'd be good to have. If not, life goes on!

(16) By anonymous on 2020-08-27 15:36:40 in reply to 13 [link] [source]

Fine, although you're cheating a bit :)

The second error comes from insert into x values ( new.b ) inside the trigger, and not from insert into y values ( 14 ) per se.

But your example proves that it certainly helps to have the table name show as well; it will at least alert you to look at triggers.

(17) By Larry Brasfield (LarryBrasfield) on 2020-08-27 15:39:18 in reply to 15 [link] [source]

As explained here for check constraints, the expression is less limited than you have led yourself to believe. If you follow the non-terminal, "expr", in that railroad chart, you will see that an arbitrarily complex expression can be written in its place among the more confined syntax elements. At that link on check constraints, you can learn that a "CHECK constraint may be attached to a column definition or specified as a table constraint. In practice it makes no difference." This further supports what the fully expanded syntax diagram says.

My contention is that helping with the diagnosis of those arbitrarily complex expressions, just when they fail, would be much more costly than it's worth. And, as I explained in this post, such a check performed when a table is created or altered could even fail when such a result would work against the SQL writer's legitimate intention. (Consider, for example, using the current datetime as a default value where the check insists that it fall within business hours but schema changes are done safely outside of business hours.)

(18) By anonymous on 2020-08-27 16:18:40 in reply to 17 [link] [source]

OK, I'm fully with you (for now).

I'm talking about the case the constraint is specified as part of a column definition. And yes, I understand that execution-wise it makes no difference, as it clearly says here:

A CHECK constraint may be attached to a column definition or specified as a table constraint. In practice it makes no difference.

Following the syntax diagram the compiler has the (now missed) opportunity to know if the constraint was part of a column definition or table definition.

It can, therefore, tag this constraint with a name, either the column's, or no name (blank) if it is at the table level. Then, it can use this name to produce more informative errors when that particular constraint fails.

Of course, if and only if the compiler is smart enough to optimize to a single execution path separately defined but identical operation constraints -- same exact instructions -- (which would be odd to happen in practice as you normally CHECK the column you're on when doing it at the column definition, so at least the operand(s) could change), then what I propose would be impossible, obviously. But a workaround would be for the tag to include all names like: col_a, col_b. Still, better that no help at all.

Otherwise, it doesn't seem that much of a chore.

(19) By Keith Medcalf (kmedcalf) on 2020-08-27 17:02:26 in reply to 18 [link] [source]

All CHECK constraints are table constrains. It is just happenstance that their is syntactic sugar allowing you to pretend they are part of the column definition when they are not:

sqlite> create table t
   ...> (
   ...> a integer,
   ...> b integer check (c > 4),
   ...> c integer check (b < 2)
   ...> );

This is the same as the "ON expr" clause being mere syntactic sugar for the clause WHERE/AND (expr) except in the specific case of an outer join.

(20) By anonymous on 2020-08-27 17:18:20 in reply to 19 [link] [source]

And this syntactic sugar is what can help assign a name to that constraint, not the column name(s) involved in whatever expression follows but that of the column where the CHECK appears.

So, in your specific example, I would expect a failure on b integer check (c > 4) to show Error: CHECK constraint failed: t(b)

even if it was the value in c that caused it.

Because now I can go to my table definition and look at column b (the one shown in the error) and see there is a check there which refers to column c (and possibly others). But, at least now I have a place to start my search rather than look all over the table to see which of a myriad of checks may have failed.

(21) By anonymous on 2020-08-27 18:49:38 in reply to 5.1 [link] [source]

Your checkin is a much better solution than I had envisioned, and I believe takes care of practically all concerns in this discussion.

MANY MANY THANKS!!!

(22) By Keith Medcalf (kmedcalf) on 2020-08-27 18:49:55 in reply to 20 [link] [source]

You can achieve this by doing the following, for example:

sqlite> create table t
   ...> (
   ...> a integer,
   ...> b integer constraint "t(b)" check (c > 4),
   ...> c integer constraint "t(c)" check (b < 2)
   ...> );

though I would note that this is rather confusing, but if that is how you want to do things, then you are of course free to do so.

Personally I should think that having the constraint named in such fashion as to indicate the constraint itself would be more rational and it should be noted that the exact same person as creates the check constraint also has control over what the constraint is named and that if this person deliberately chooses not to name the constraint then that is their decision.

I would note that your entire thesis is ill-conceived given that ALL check constraints are TABLE CONSTRAINTS even though they may be syntactically (though not semantically) spoken as part of the column definition.

It would perhaps be rational to request that if no constraint name was specified that the default name should be the SPAN of the expression composing the constraint. I have no idea if the SPAN data is maintained (as it is for example for the column default) but this would only be a rather small parser change I should think.

(23) By Keith Medcalf (kmedcalf) on 2020-08-27 19:12:09 in reply to 22 [link] [source]

Richard is brilliant!

I note that he just added a checkin which makes the default name of a check constraint be the SPAN of the check expression.

https://www.sqlite.org/src/info/5ce34a955bb36d77

(24.1) By ddevienne on 2020-08-28 14:10:01 edited from 24.0 in reply to 21 [source]

Did we lose the table name though? From the tests, it kinda looks that way.
I very much welcome this change too, although I'd hate to lose the table name.

It's fairly common to have tables with similar structures, in whole or part,
and thus they could share the same CHECK expressions. The message would then
be ambiguous, w/o the table name prefixed. table-name: check-expr seems like
a good format, FWIW.