Is the order conserved in a table from a VALUES clause?
(1.1) By Check_values on 2021-11-02 10:51:21 edited from 1.0 [link] [source]
I have the following scenario (see fiddle here):
CREATE TABLE x ( val INT PRIMARY KEY, txt TEXT NOT NULL );
INSERT INTO x VALUES (1, 'val_1'), (2, 'val_2'), (3, 'val_3'), (4, 'val_4'), (5, 'val_5'), (6, 'val_6'), (7, 'val_7');
SELECT x.val, x.txt FROM x JOIN ( SELECT ROW_NUMBER() OVER () AS rn, column1 FROM (VALUES (5), (6), (1)) ) AS tab ON x.val = tab.column1 ORDER BY tab.rn;
val txt 5 val_5 6 val_6 1 val_1
Now, I want to know am I guaranteed to have the
ORDER of my
VALUES clause respected - no matter how big it gets? Will
ROW_NUMBER() match the VALUES in tab in the order in which they are declared?
With PostgreSQL and
WITH ORDINALITY, one can guarantee the order of the result.
So, for SQLite, is this the case now and will it remain the case into the future?
If the behaviour is undefined, may I suggest that it be guaranteed - makes life a lot easier - and if one has entered the values in a particular order, it seems reasonable to respect that down the line - at least to me! Or else they would have been entered in a different order!
Just a (respectful) thought! Rgs.
(2) By ddevienne on 2021-11-02 12:14:05 in reply to 1.1 [link] [source]
First, it's just wrong to assume rows come in any particular order,
unless you explicitly ask for a given order, via an
ORDER BY clause.
The order of rows depends on the query plan, whether it uses an index or not,
and even whether some pragmas are used or not.
Second, if you have your Primary Key match the internal SQLite rowid, as you do,
and explicitly provide those PKs are you do in your example, then simply use
select val, txt from x ORDER BY val to have a deterministic order
for free, since the FULL SCAN will be in PK-order, no need for Window functions.
SQLite will never provide any guarantees for what order rows are stored into,
nor how they are selected into (w/o an
order by clause), whether you insert them
values() or not. How it works now is not relevant either, as that could change too.
(3) By Check_values on 2021-11-02 12:54:30 in reply to 2 [link] [source]
It's not the ORDER BY of the x table that I'm worried about.
The ORDER BY that I am concerned with is the one in the table produced by the VALUES clause - i.e. (5), (7), (1) - and then I want the values selected from the x table to be in the order of 5,7,1 - i.e. as defined by the VALUES clause.
So, you are saying, that given the current implementation of SQLite, the only way I can guarantee the order by in the VALUES clause is to do something like (fiddle):
SELECT st.id, st.filler, tab.rn, tab.column1, tab.column2 FROM some_table st JOIN ( SELECT ROW_NUMBER() OVER () AS rn, x.column1, x.column2 FROM (VALUES (1, 5), (2, 7), (3, 1)) AS x ) AS tab ON st.id = tab.column2 ORDER BY tab.column1;
Though this always seems to work (same fiddle):
SELECT ROW_NUMBER() OVER () AS rn, * FROM (VALUES (5), (7), (1));
gives (1, 5), (2, 7)... &c. - i.e. order is respected.
I don't see why respecting the ordering of the input sequence in the VALUES clause couldn't be made the default - to me it makes perfect sense. Any implementations that explicitly use the ORDER BY will continue to function... any that don't - behaviour was undefined anyway!
Anway, it would be nice to have a WITH ORDINALITY clause to imitate PostgreSQL's functionality - as I said, why put the VALUEs clause in a certain order if you didn't want that order in the first place?
Thanks for your input on this! Rgs,
(5) By ddevienne on 2021-11-02 13:15:17 in reply to 3 [link] [source]
Well, in the current impl, the
values clause is special it seems,
PRAGMA reverse_unordered_selects does not affect it (see below).
But the fact remains that
values is table-like, and represents a set of rows,
not a list. So it's entirely conceivable for SQLite to internally sort them,
or heap-sort them, or leave them as-is, or randomize them.
Just don't really on any implicit or lexical order. Always use an explicit order.
sqlite> with t(ord, val) as (values (1, 'two'), (2, 'one'), (3, 'five')) select val from t order by ord;
C:\Users\ddevienne>sqlite3 SQLite version 3.28.0 2019-04-16 19:49:53 Enter ".help" for usage hints. Connected to a transient in-memory database. Use ".open FILENAME" to reopen on a persistent database. sqlite> create table t (v int); sqlite> insert into t(v) values (1), (2), (3); sqlite> select * from t; 1 2 3 sqlite> select * from (values (1), (2), (3)); 1 2 3 sqlite> PRAGMA reverse_unordered_selects; 0 sqlite> PRAGMA reverse_unordered_selects = 1; sqlite> PRAGMA reverse_unordered_selects; 1 sqlite> select * from t; 3 2 1 sqlite> select * from (values (1), (2), (3)); 1 2 3 sqlite> select * from (values (1), (2), (3)) order by 1; 1 2 3 sqlite>
(8) By Ryan Smith (cuz) on 2021-11-02 13:38:49 in reply to 3 [link] [source]
The order of the values are most assuredly not guaranteed in SQL in general, and I doubt it is anywhere guaranteed by SQLite (unless I've missed such documentation).
To ensure a row order, you have to be sure the set contains values which possess intrinsic order and by which you specifically ask the results to be ordered. That is the only way. Any apparent order you are currently getting from any query that isn't ordering by a specific value, is purely happenstance and coincidental.
Luckily SQLite provides a fool-proof way by which you can see what is "guaranteed": There is a pragma called "reverse_unordered_selects" (you can read about it here) which, when activated, will purposefully give your query results back in a bad order UNLESS you asked correctly for it to be ordered, in which case it returns it correctly. So if your query returns a good order while that pragma is in effect, you can be guaranteed that the query will work in future for the order you've intended.
(9) By ddevienne on 2021-11-02 13:48:46 in reply to 8 [link] [source]
Well Ryan, as I showed above,
values is currently immune to that
(11) By Ryan Smith (cuz) on 2021-11-02 13:54:14 in reply to 9 [link] [source]
Right you are, sorry was already replying when that was sent.
I have to assume that this means that either:
- the values clause is definitely guaranteed to be ordered in SQLite, or
- that there is a bug and reverse_unordered_selects is not doing its job.
(12) By Gunter Hick (gunter_hick) on 2021-11-02 14:31:47 in reply to 11 [link] [source]
Neither. See also my comment below. This is the bytecode generated by SQLite 3.24.0: asql> explain select * from (values (1),(2),(3) ); addr opcode p1 p2 p3 p4 p5 comment ---- ------------- ---- ---- ---- ------------- -- ------------- 0 Init 0 1 0 00 NULL 1 InitCoroutine 1 9 2 00 NULL 2 Integer 1 2 0 00 NULL 3 Yield 1 0 0 00 NULL 4 Integer 2 2 0 00 NULL 5 Yield 1 0 0 00 NULL 6 Integer 3 2 0 00 NULL 7 Yield 1 0 0 00 NULL 8 EndCoroutine 1 0 0 00 NULL 9 InitCoroutine 1 0 2 00 NULL 10 Yield 1 14 0 00 NULL 11 Copy 2 4 0 00 NULL 12 ResultRow 4 1 0 00 NULL 13 Goto 0 10 0 00 NULL 14 Halt 0 0 0 00 NULL The guarantee holds only as long as the code generator implements VALUES as a coroutine, and this prevents the pragma from affecting the order.
(13) By Ryan Smith (cuz) on 2021-11-02 14:52:18 in reply to 12 [link] [source]
I did see your other comment and I do see what you are saying of "why" it happens to work that way, no argument there, but I am not accepting that that is what "should" happen.
My contention is simply that either it must be so guaranteed in the documentation to work that way, now and forever, OR, pragma_reverse_unordered_selects must catch this case and reverse it also.
We are very much using the reverse_unordered_selects as part of pre-production testing because I thought that if it works with that pragma enabled, then it works correctly, no exceptions (and I do not think my belief is misplaced). Now we are seeing a case where that simply may not be, a query order may change in future for a query that is going into production and we have no way of knowing now if all our SQL is good and ordered and will remain so for days to come.
I mean, if it only works "sometimes", what is the point of having that pragma?
(14) By Gunter Hick (gunter_hick) on 2021-11-02 15:24:53 in reply to 13 [link] [source]
The pragma will reverse the order (my guess here) if and when SQLite decides to implement VALUES as an internal table. I'm also guessing that the pragma affects the execution of the SQL program and not the generated bytecode, so concievably you could prepare a statement, run it, change the pragma setting, and rerun it, yielding different ordering of unordered selects. There is also this disclaimer, which IMHO covers the subject exception on VALUES: "The reverse_unordered_selects pragma works for most SELECT statements, however the query planner may sometimes choose an algorithm that is not easily reversed, in which case the output will appear in the same order regardless of the reverse_unordered_selects setting."
(4) By Simon Slavin (slavin) on 2021-11-02 12:58:13 in reply to 1.1 [link] [source]
A SQL (not just SQLite) database is a set of rows. Not an ordered set. The rows of a table are in no particular order. Theoretically you can execute
SELECT * FROM MyTable
twice and get the same answers in a different order each time.
Row numbers normally don't matter: you don't bother assigning them and you would never care enough to retrieve them. If, on the other hand, you want specific row numbers for each of the values you insert, you can assign your row numbers to a column.
(6) By Check_values on 2021-11-02 13:16:42 in reply to 4 [link] [source]
If, on the other hand, you want specific row numbers for each of the values you insert, you can assign your row numbers to a column.
Indeed, and I've done this - see the fiddle in my previous response.
OTOH, it strikes me as logical to keep the order of the data in the VALUES clause (as the default) - otherwise, why not simply specify it in an incrementing fashion in the first place?
Or, maybe, at least keep it in the part where the table defined by the VALUES clause is created - i.e. guarantee the ROW_NUMBER() functionality that I want/require?
At the very least, some sort of implementation of PostgreSQL's WITH ORDINALITY would be desirable!
Just for the record, I have nothing but admiration and respect for the SQLite project generally and D. Richard Hipp in particular!
(7) By Gunter Hick (gunter_hick) on 2021-11-02 13:36:40 in reply to 1.1 [link] [source]
In SQLite 3.24.0 the VALUES clause is implemented as a coroutine, so the implicit order of the rows returned is that from the code generator, which currently happens to be the lexical order in the statement. None of this is subject to any guarantees. If in the future SQLite decides to make an internal table out of the VALUES clause and the query planner thinks it expedient to access this table on the RHS side of a join and maybe with a temporary index, anything could happen with the visitation order.
(10) By ddevienne on 2021-11-02 13:49:47 in reply to 7 [link] [source]
Thanks Gunter. That's insightful.
(15) By Check_values on 2021-11-02 16:26:30 in reply to 7 [link] [source]
Hi Gunter - and thanks for your reply!
So, can I take it that current versions of SQLite do this?
You are clear that this may change in the future? However, I can rely on this functionality for versions up till now?
You mention 3.24 - the latest is 3.35.5 - am I sure that this will continue to function correctly (as I see it - i.e. order is preserved) up to this version?
I don't see why not - it's a good default - follows the principle of least surprise - it's logical as far as I'm concerned!
(16) By Larry Brasfield (larrybr) on 2021-11-02 17:54:37 in reply to 15 [link] [source]
You can rely on currently released versions to continue their present behavior. (There are no manipulations of history in the source archives.)
Anybody is welcome to rely on undocumented, implementation-defined behavior. They do so at their peril with respect to continuation of such behavior in future or other as-yet-untested releases.
(18.1) By Check_values on 2021-11-03 07:47:14 edited from 18.0 in reply to 16 [link] [source]Deleted
(19.1) By Check_values on 2021-11-03 07:47:38 edited from 19.0 in reply to 16 [source]
They do so at their peril with respect to continuation of such behavior in future or other as-yet-untested releases.
Thanks for your input Larry - I totally get that relying on undocumented behaviour is not a good strategy!
The thrust of my argument now is that the current behaviour should be fixed and documented.
If a programmer wants the values in the VALUES clause to have a different order, all they have to do is use a different order - so that, for example, ROW_NUMBER() will work as expected (or at least as I expect! :-)
(21) By Larry Brasfield (larrybr) on 2021-11-03 15:18:04 in reply to 19.1 [link] [source]
I almost hate to spoil the fun here, but ...
The thrust of my argument now is that the current behaviour should be fixed and documented.
Yes, I understood that.
What is documented is that an ORDER clause enforces some kind of ordering. If feasible, it will be optimized away where the implementation naturally (in its current evolution) achieves the specified ordering. You can rely on that.
The extra guarantee you seek/suggest is highly unlikely to be made by the present SQLite project. In part this is because VDBE code generation is complex enough without imposing requirements which are contrary to long-standing SQL convention. And, I venture, further cause to decline your invitation is that it would weaken the message: If you want ordering, use ORDER BY. The docs have been quite clear that, absent an ORDER BY clause, no ordering is guaranteed. I do not foresee an ever-growing list of exceptions appearing with those warnings.
This stance on ordering is especially important for input to the windowing functions, where ordering (or misordering) is baked into the result rather than merely affecting result ordering.
(17) By Gunter Hick (gunter_hick) on 2021-11-03 06:47:42 in reply to 15 [link] [source]
I cannot make assertions on behalf of the SQLite Dev team. I am just reporting the status for a specific version I am working with. We have SQLite integrated into our proprietary build environment with some customizations that require careful merging and testing when switching versions. We also use our own virtual table implementations extensively. So there is considerable work involved in switching SQLite releases, which is why we are currently using 3.24.0 Please do not construe my report as an endorsement to rely on current undocumented behaviour.
(20) By Check_values on 2021-11-03 07:50:10 in reply to 17 [link] [source]
Please do not construe my report as an endorsement to rely on current undocumented behaviour.
Hi again Gunter, and thanks so much for your input into this thread!
See my reply to Larry about this issue - I'll leave it there now - making the current behaviour the documented behaviour would be a Good Thing (TM) in my not so humble opinion...
(22) By anonymous on 2021-11-03 19:27:25 in reply to 1.1 [link] [source]
You can't use
ORDER BY with
VALUES, although I think that if the outer statement is purely
VALUES (and not a compound statement, etc) then the order ought to be guaranteed in that case (if it isn't, this is a feature suggestion to change this).
But in the case you have, that is not needed since you can instead write:
VALUES (1,5), (2,6), (3,1)
But as far as I know, column names are not guaranteed for
VALUES (and you cannot specify column names with
VALUES nor can they be specified in a table alias), so would instead need a more complicated code.
I do think that
WITH ORDINALITY would be useful though, mainly for use with recursive CTEs.
(23) By Gunter Hick (gunter_hick) on 2021-11-04 07:18:54 in reply to 22 [link] [source]
This: WITH ov(idx,val) AS (SELECT * FROM (VALUES (1,'Z'),(3,'M'),(2,'A')) ORDER BY 1) SELECT * FROM ov; idx val --- --- 1 Z 2 A 3 M still uses a coroutine, but returns the entries in the requested order. And is future proof, because it explicitly specifies an order. explain with ov(idx,val) as (select * from (values (1,'Z'),(3,'M'),(2,'A')) order by 1) select * from ov; addr opcode p1 p2 p3 p4 p5 comment ---- ------------- ---- ---- ---- ------------- -- ------------- 0 Init 0 1 0 00 NULL 1 InitCoroutine 1 12 2 00 NULL 2 Integer 1 2 0 00 NULL 3 String8 0 3 0 Z 00 NULL 4 Yield 1 0 0 00 NULL 5 Integer 3 2 0 00 NULL 6 String8 0 3 0 M 00 NULL 7 Yield 1 0 0 00 NULL 8 Integer 2 2 0 00 NULL 9 String8 0 3 0 A 00 NULL 10 Yield 1 0 0 00 NULL 11 EndCoroutine 1 0 0 00 NULL 12 SorterOpen 2 4 0 k(1,B) 00 NULL 13 InitCoroutine 1 0 2 00 NULL 14 Yield 1 21 0 00 NULL 15 Copy 2 7 0 00 NULL 16 Copy 3 8 0 00 NULL 17 Copy 7 6 0 00 NULL 18 MakeRecord 6 3 9 00 NULL 19 SorterInsert 2 9 6 3 00 NULL 20 Goto 0 14 0 00 NULL 21 OpenPseudo 3 10 4 00 NULL 22 SorterSort 2 28 0 00 NULL 23 SorterData 2 10 3 00 NULL 24 Column 3 2 8 00 NULL 25 Column 3 1 7 00 NULL 26 ResultRow 7 2 0 00 NULL 27 SorterNext 2 23 0 00 NULL 28 Halt 0 0 0 00 NULL
(24) By ddevienne on 2021-11-04 07:52:12 in reply to 23 [link] [source]
Really? Given that the outer
SELECT has no
ORDER BY, I don't see why.
That's why I had the
ORDER BY outside of the CTE, and not inside like you do, in my earlier post
(25) By Gunter Hick (gunter_hick) on 2021-11-04 08:17:27 in reply to 24 [link] [source]
In SQLite 3.24.0, both versions (inside/outside CTE) generate identical code. Maybe there is a difference if the CTE is used in a JOIN instead of directly, so the difference between "natural order" (inside cte) and "explicit order" (outside cte) becomes visible. explain with t(ord, val) as (values (1, 'two'), (2, 'one'), (3, 'five')) select val from t order by ord; addr opcode p1 p2 p3 p4 p5 comment ---- ------------- ---- ---- ---- ------------- -- ------------- 0 Init 0 1 0 00 NULL 1 InitCoroutine 1 12 2 00 NULL 2 Integer 1 2 0 00 NULL 3 String8 0 3 0 two 00 NULL 4 Yield 1 0 0 00 NULL 5 Integer 2 2 0 00 NULL 6 String8 0 3 0 one 00 NULL 7 Yield 1 0 0 00 NULL 8 Integer 3 2 0 00 NULL 9 String8 0 3 0 five 00 NULL 10 Yield 1 0 0 00 NULL 11 EndCoroutine 1 0 0 00 NULL 12 SorterOpen 1 3 0 k(1,B) 00 NULL 13 InitCoroutine 1 0 2 00 NULL 14 Yield 1 20 0 00 NULL 15 Copy 3 7 0 00 NULL 16 Copy 2 6 0 00 NULL 17 MakeRecord 6 2 8 00 NULL 18 SorterInsert 1 8 6 2 00 NULL 19 Goto 0 14 0 00 NULL 20 OpenPseudo 2 9 3 00 NULL 21 SorterSort 1 26 0 00 NULL 22 SorterData 1 9 2 00 NULL 23 Column 2 1 7 00 NULL 24 ResultRow 7 1 0 00 NULL 25 SorterNext 1 22 0 00 NULL 26 Halt 0 0 0 00 NULL
(26) By Keith Medcalf (kmedcalf) on 2021-11-04 11:33:09 in reply to 1.1 [link] [source]
Yes. Always. It must be. There can be no change nor discussion on this point.
The parent of your so-called VALUES clause originates from the usage in an INSERT statement. It has just been generalized to not require an
INSERT INTO prefix.
INSERT INTO x VALUES (1), (2), (3) ...;
is absolutely nothing more than semantic sugar for:
INSERT INTO x VALUES (1); INSERT INTO x VALUES (2); INSERT INTO x VALUES (3); ...
Therefore the processing of the list of values sets must be left-to-right (in order specified).
(27) By anonymous on 2021-11-04 19:53:04 in reply to 26 [link] [source]
That's not what the documentation says multi-row