Sorting by date into groups
(1) By Calab (Cal_ab) on 2023-02-07 03:39:13 [link] [source]
I want to order my data by the date of the first occurrence of ID, then by the Started column. How can I do this?
Sample data:
ID Started Result
BcdMhdQf5L|2023-02-06 14:13:45.697587|ONLINE
BcdMhdQf5L|2023-02-06 14:35:56.836783|NETWORK
VbR7aK4ENK|2023-02-03 13:08:10.601581|ONLINE
VbR7aK4ENK|2023-02-03 13:08:10.619467|ONLINE
VbR7aK4ENK|2023-02-03 13:08:25.252219|NETWORK
VbR7aK4ENK|2023-02-03 13:08:38.480098|ONLINE
VbR7aK4ENK|2023-02-06 10:21:39.077173|ONLINE
fpnct7eR5o|2023-02-03 13:09:03.907468|NETWORK
fpnct7eR5o|2023-02-03 13:09:03.932700|ONLINE
fpnct7eR5o|2023-02-03 13:09:18.252934|ONLINE
fpnct7eR5o|2023-02-03 13:13:31.717305|ONLINE
fpnct7eR5o|2023-02-06 10:20:52.256900|ONLINE
Order that I would like the results:
VbR7aK4ENK|2023-02-03 13:08:10.601581|ONLINE <- First because 2023-02-03 13:08:10.601581 is the earliest entry
VbR7aK4ENK|2023-02-03 13:08:10.619467|ONLINE <- The rest are all from ID VbR7aK4ENK, sorted by the date
VbR7aK4ENK|2023-02-03 13:08:25.252219|NETWORK
VbR7aK4ENK|2023-02-03 13:08:38.480098|ONLINE
VbR7aK4ENK|2023-02-06 10:21:39.077173|ONLINE
fpnct7eR5o|2023-02-03 13:09:03.907468|NETWORK <- Second because the date of this ID is next in the sequence
fpnct7eR5o|2023-02-03 13:09:03.932700|ONLINE
fpnct7eR5o|2023-02-03 13:09:18.252934|ONLINE
fpnct7eR5o|2023-02-03 13:13:31.717305|ONLINE
fpnct7eR5o|2023-02-06 10:20:52.256900|ONLINE
BcdMhdQf5L|2023-02-06 14:13:45.697587|ONLINE <- Finally, this ID because it's earliest entry is later than any entry in the other IDs
BcdMhdQf5L|2023-02-06 14:35:56.836783|NETWORK
I've tried several different ways to generate this data, but none of my SQL statements produce what I want. I end up with each ID showing up only once, or the ID's are split up by date, etc.
How can I produce the result that I want when using SQLite3 v 3.7.17?
(2) By Richard Damon (RichardDamon) on 2023-02-07 04:59:37 in reply to 1 [link] [source]
Don't have time to do the full work, but my thought is you need to do a join (based on ID) between the table you have to a query collapsing the data by grouping on ID and getting the minimum started for that ID, and then order the results of the joinr by that minimum started date, then the current records started date.
(3) By anonymous on 2023-02-07 05:30:50 in reply to 2 [link] [source]
That is the correct approach. Something like this:
with FirstStart (ID, First) as
(select ID, min(Started) from Sample
group by ID)
select Sample.ID, Started, Result
from Sample,
FirstStart on FirstStart.ID=Sample.ID
order by First, Sample.ID, Started;
(You need to include the ID in the sort order in case multiple IDs happen to have the exact same earliest timestamp.)
(5) By Calab (Cal_ab) on 2023-02-07 13:56:52 in reply to 3 [link] [source]
I'm using SQLite3 v 3.7.17, which came with Python3. It doesn't seem to like the WITH command as the above sample does not work:
Error: near "with": syntax error
(6) By Stephan Beal (stephan) on 2023-02-07 14:00:47 in reply to 5 [link] [source]
I'm using SQLite3 v 3.7.17, which came with Python3.
That version is now almost 10 years old. Maybe it came with an early release of python3 but is certainly not still distributed with any recent release.
(7) By Gunter Hick (gunter_hick) on 2023-02-07 14:13:36 in reply to 5 [link] [source]
Either update to a more recent SQLite version or reformulate the query without the use of CTEs
(8.1) By Ryan Smith (cuz) on 2023-02-07 14:41:10 edited from 8.0 in reply to 5 [source]
-- SQLite version 3.37.2 [ Release: 2022-01-06 ] on SQLitespeed version 2.1.3.11.
-- ================================================================================================
CREATE TABLE netlog(ID TEXT, Started NUMERIC, Result TEXT);
CREATE INDEX idx_netlog_start ON netlog(Started ASC);
INSERT INTO netlog(ID, Started, Result) VALUES
('BcdMhdQf5L', '2023-02-06 14:13:45.697587', 'ONLINE' )
,('BcdMhdQf5L', '2023-02-06 14:35:56.836783', 'NETWORK')
,('VbR7aK4ENK', '2023-02-03 13:08:10.601581', 'ONLINE' )
,('VbR7aK4ENK', '2023-02-03 13:08:10.619467', 'ONLINE' )
,('VbR7aK4ENK', '2023-02-03 13:08:25.252219', 'NETWORK')
,('VbR7aK4ENK', '2023-02-03 13:08:38.480098', 'ONLINE' )
,('VbR7aK4ENK', '2023-02-06 10:21:39.077173', 'ONLINE' )
,('fpnct7eR5o', '2023-02-03 13:09:03.907468', 'NETWORK')
,('fpnct7eR5o', '2023-02-03 13:09:03.932700', 'ONLINE' )
,('fpnct7eR5o', '2023-02-03 13:09:18.252934', 'ONLINE' )
,('fpnct7eR5o', '2023-02-03 13:13:31.717305', 'ONLINE' )
,('fpnct7eR5o', '2023-02-06 10:20:52.256900', 'ONLINE' )
;
SELECT *
FROM (SELECT ID, MIN(Started) AS minStart FROM netlog GROUP BY ID ORDER BY MIN(Started)) AS LS
JOIN netlog AS NL ON NL.ID = LS.ID
ORDER BY LS.minStart, NL.Started
;
-- ID |minStart |ID |Started |Result
-- ------------|----------------------------|------------|----------------------------|---------
-- VbR7aK4ENK |2023-02-03 13:08:10.601581 |VbR7aK4ENK |2023-02-03 13:08:10.601581 |ONLINE
-- VbR7aK4ENK |2023-02-03 13:08:10.601581 |VbR7aK4ENK |2023-02-03 13:08:10.619467 |ONLINE
-- VbR7aK4ENK |2023-02-03 13:08:10.601581 |VbR7aK4ENK |2023-02-03 13:08:25.252219 |NETWORK
-- VbR7aK4ENK |2023-02-03 13:08:10.601581 |VbR7aK4ENK |2023-02-03 13:08:38.480098 |ONLINE
-- VbR7aK4ENK |2023-02-03 13:08:10.601581 |VbR7aK4ENK |2023-02-06 10:21:39.077173 |ONLINE
-- fpnct7eR5o |2023-02-03 13:09:03.907468 |fpnct7eR5o |2023-02-03 13:09:03.907468 |NETWORK
-- fpnct7eR5o |2023-02-03 13:09:03.907468 |fpnct7eR5o |2023-02-03 13:09:03.932700 |ONLINE
-- fpnct7eR5o |2023-02-03 13:09:03.907468 |fpnct7eR5o |2023-02-03 13:09:18.252934 |ONLINE
-- fpnct7eR5o |2023-02-03 13:09:03.907468 |fpnct7eR5o |2023-02-03 13:13:31.717305 |ONLINE
-- fpnct7eR5o |2023-02-03 13:09:03.907468 |fpnct7eR5o |2023-02-06 10:20:52.256900 |ONLINE
-- BcdMhdQf5L |2023-02-06 14:13:45.697587 |BcdMhdQf5L |2023-02-06 14:13:45.697587 |ONLINE
-- BcdMhdQf5L |2023-02-06 14:13:45.697587 |BcdMhdQf5L |2023-02-06 14:35:56.836783 |NETWORK
DROP TABLE netlog;
Note: The "ORDER BY" in the sub-query is technically not needed, and I've used "SELECT * " to show what's going on, but you may obviously only use the last 3 columns (or whatever is preferred) in the real query.
(9.1) By Calab (Cal_ab) on 2023-02-07 15:29:08 edited from 9.0 in reply to 8.1 [link] [source]
SELECT *
FROM (SELECT ID, MIN(Started) AS minStart FROM netlog GROUP BY ID ORDER BY MIN(Started)) AS LS
JOIN netlog AS NL ON NL.ID = LS.ID
ORDER BY LS.minStart, NL.Started
;
Thank you very much Ryan!!!
This was what I was looking for... I just could not get my head around the whole problem at once.
(4) By Gunter Hick (gunter_hick) on 2023-02-07 06:46:58 in reply to 1 [link] [source]
Maybe something like WITH ids (id, min) AS (SELECT ID,min(started) AS min FROM data GROUP BY ID ORDER BY min), grp (id, started, result) AS (SELECT ID,started,result FROM data ORDER BY 1,2) SELECT grp.* FROM ids CROSS JOIN grp ON (ids.id = grp.id);