SQLDIFF not showing true differences

(1) By Tony Papadimitriou (tonyp) on 2020-05-20 16:18:17 [source]

If a column has collate nocase, SQLDIFF utility fails to report any differences.

Database a.db:

CREATE TABLE t(s collate nocase);
INSERT INTO t VALUES('sample');

and database b.db:

CREATE TABLE t(s collate nocase);
INSERT INTO t VALUES('SAMPLE');

Now, use SQLDIFF a.db b.db and you won't see any differences. This is because of the collate nocase as when you remove it, the difference is shown.

I don't know if this is intentional behavior but to me it appears wrong. I expect a difference tool to show true differences between two databases.

The use of collate nocase may be used for case-insensitive retrieval but if the actual contents are different, a difference tool should report that.

(2) By Richard Hipp (drh) on 2020-05-20 16:27:44 in reply to 1 [link] [source]

The intent of the sqldiff utility is to show the logical differences between two database files. But there is not logical difference between 'sample' and 'SAMPLE' if they are in a "COLLATE nocase" column. SQLite considers them to be the same value. So sqldiff is doing the right thing by reporting them as the same. We had to add extra logic to sqldiff to make it work that way. It is intentional.

(3) By Tony Papadimitriou (tonyp) on 2020-05-20 16:29:14 in reply to 2 [link] [source]

In that case could there at least be some option to ignore collate nocase?

Or else, how can one create 'patch' files to update one database from another?

(4.1) By luuk on 2020-05-20 18:46:53 edited from 4.0 in reply to 3 [link] [source]

sqlite3 a.db

sqlite>attach 'b.db' as b;

sqlite>select t.s t1, (x.s collate binary) t2 from t inner join b.t x on t.s=x.s where t1<>t2;

t1 t2

sample SAMPLE sqlite>

(5) By anonymous on 2020-05-21 12:10:22 in reply to 2 [link] [source]

I can the logic. I was still surprised. =O

I guess this reflects the slight logical incongruity of a "case insensitve, case preserving" collation.

Could an UPDATE affecting only case on a COLLATE nocase column be validly ignored as an optimisation?

(6) By Tony Papadimitriou (tonyp) on 2020-05-25 13:20:56 in reply to 5 [link] [source]

Could an UPDATE affecting only case on a COLLATE nocase column be validly ignored as an optimization?

Good point, and ... obviously not!

Database content has two primary uses, retrieval and display. The collate nocase should only affect retrieval. Otherwise, the database engine would be free to convert all text of a collate nocase column to either all-lower or all-upper case for storage. (This would actually provide micro-optimizations in search speed [case-sensitive only comparisons] and required storage [better compression of a smaller char set].)

But, you need to preserve the actual case at least for display purposes. And although I can understand the 'logical' perspective argument, I consider SQLDIFF's failure to update case-only changes of collate nocase columns, incorrect behavior from the 'practical' perspective.

And if an option is added to allow both scenarios, the default (IMO) should be to ignore collate nocase and give a 'practical' DB difference which is what I believe most users would expect to see.

BTW, from SQLDIFF's help screen:

Output SQL text that would transform DB1 into DB2.

Well, not really, as I won't get identical data displays from either db.