SQLite

View Ticket
Login
Ticket Hash: f617ea3125e9ca959b8f980043a22670ede0ae97
Title: Incorrect ORDER BY with colliding input and output column names
Status: Fixed Type: Code_Defect
Severity: Severe Priority: Immediate
Subsystem: Unknown Resolution: Fixed
Last Modified: 2013-08-15 22:42:19
Version Found In: 3.7.17
User Comments:
drh added on 2013-08-14 20:24:32: (text/x-fossil-wiki)
Consider the following SQL:

<blockquote><verbatim>
CREATE TABLE t1(m CHAR(2));
INSERT INTO t1 VALUES('az');
INSERT INTO t1 VALUES('by');
INSERT INTO t1 VALUES('cx');
SELECT '1', substr(m,2) AS m FROM t1 ORDER BY m;
SELECT '2', substr(m,2) AS m FROM t1 ORDER BY m COLLATE binary;
</verbatim></blockquote>

The identifier "m" used in the ORDER BY clauses might refer to either the
input column "t1.m" or to the second column of output.  The SQL standards
are ambiguous on which should be used.  PostgreSQL uses the output column
for unadorned identifiers, such as found in query (1) and it uses
the input column when m is part of an expression, such as query (2).
Hence, PostgreSQL answers queries (1) and (2) with rows that in opposite
orders from one another.  MS-SQL 2008 works like PosgreSQL.  Oracle appears
to use the output column name in both cases.

SQLite should be changed to agree with PostgreSQL.

drh added on 2013-08-15 20:11:43: (text/x-fossil-wiki)
Add to the test case above:

<blockquote><pre>
SELECT '3', substr(m,2) AS m FROM t1 ORDER BY lower(m);
</pre></blockquote>

PostgreSQL binds ORDER BY terms to output columns if the term is an ordinal
or a simple identifier that matches the name of an output column.  If if the
ORDER BY term is an expression, it matches against input columns instead.
PostgreSQL considers "m COLLATE binary" to be an expression.  But for
compatibility with SQL92 and historical compatibility with older versions
of SQLite, we will break with PostgreSQL and bind ORDER BY terms of the
form "m COLLATE binary" to output columns.  However, "lower(m)" is clearly an
expression, so both PostgreSQL and SQLite will resolve the "m" variable there
to match the input column, t1.m.