SQLite Forum

SELECT question
Login

SELECT question

(1) By Trudge on 2021-01-09 17:51:28 [link] [source]

I have 2 tables from which I'm trying to extract data, but having some problems with the query.

CREATE TABLE IF NOT EXISTS "docs" (
id integer not null primary key autoincrement,
authorid integer,
docid integer,
title text,
source text,
pubyear integer,
notes text,
author text,
filename text);

CREATE TABLE keywords ( id integer not null primary key autoincrement, keyword text, authorid integer, docid integer);

What I would like is to select the most recent 5 titles, along with ANY/ALL keywords attached to them. The closest I can get is:

select docs.id, title,author,source,docs.authorid,docs.docid,keyword,keywords.authorid,keywords.docid from docs,keywords where docs.authorid=keywords.authorid and docs.docid=keywords.docid  order by docs.id desc limit 5;
49|Redeeming Charles Babbage's Mechanical Computer|Swade,Doron D.|Scientific American 1993-02|1009|1|Computer Technology|1009|1
49|Redeeming Charles Babbage's Mechanical Computer|Swade,Doron D.|Scientific American 1993-02|1009|1|Difference Engine|1009|1
49|Redeeming Charles Babbage's Mechanical Computer|Swade,Doron D.|Scientific American 1993-02|1009|1|Engineering|1009|1
48|A Partly True Story|Stewart,Ian|Scientific American 1993-02|1008|1|Fuzzy Logic|1008|1
48|A Partly True Story|Stewart,Ian|Scientific American 1993-02|1008|1|Mathematics|1008|1
However, the 'limit by' seems applied to the title only, not the final result as above. I don't know ahead of time how many keywords each title may have - most are > 1.

In any case I would like the most recent 5 titles plus their attached keywords.

I'm sure this can be done in a single SQL query but so far it has escaped me. Any advice / suggestions greatly appreciated.

(2) By Ryan Smith (cuz) on 2021-01-09 20:00:01 in reply to 1 [link] [source]

There are two ways of doing this, both of which rely on essentially the same principle.

The first step is to phrase the problem in it's simplest form in common English, then translate the English to SQL.

The simplest way I can describe your request is: Give me the first 5 recent-most titles To each of those, join all their keywords.

The simplest way to say that first statement in SQL is:

SELECT id, title, author, source
  FROM docs
  ORDER BY id DESC
  LIMIT 5

Now that should work as expected, showing exactly 5 rows of unique recent-most docs.

The second part of the problem then is to link the keywords to the above selection, so we can make the above into a sub-query or CTE (I will show both ways below) and join the keywords to it. Note in both cases that I re-use the above query verbatim.

Example 1. - Sub-Query:

SELECT D.id, D.title, D.author, D.source, K.keyword, K.authorid
  FROM (SELECT id, title, author, source
          FROM docs
         ORDER BY id DESC
         LIMIT 5
        ) AS D
  JOIN keywords AS K ON K.docid = D.id

That should do it. By the way, your additional filters and be transposed like this: "and K.authorid=D.authorid etc." but they make no sense unless a single document can have multiple authors AND you link keywords by Author. which I doubt. It looks like the kind of nonsense an MSSQL query editor would produce.

Example 2. - Same solution using CTE:

WITH D AS (
    SELECT id, title, author, source
      FROM docs
     ORDER BY id DESC
     LIMIT 5
)
SELECT D.id, D.title, D.author, D.source, K.keyword, K.authorid
  FROM D,keywords AS K
 WHERE K.docid = D.id

There's no major advantage to using either way, just two tools in your toolbox, though I find the CTE's to generally be more human-readable and they can be re-used multiple times in a long query whereas a sub-query has to be re-stated, but that's an opinion/preference based distinction - the query engine treats them mostly the same.

One more example, this time getting only 5 lines of output with the keywords concatenated together into a comma-separated list:

WITH D AS (
    SELECT id, title, author, source
      FROM docs
      ORDER BY id DESC
     LIMIT 5
)
SELECT D.id, MAX(D.title) AS title, MAX(D.author) AS author,
       MAX(D.source) AS source, GROUP_CONCAT(K.keyword)
  FROM D,keywords AS K
 WHERE K.docid = D.id
 GROUP BY D.id

If keywords repeat or appear many time, you can use: GROUP_CONCAT(DISTINCT K.keyword)

Hope that helps!

PS: The above is all typed from memory and not tested, but it should work. If not, feel free to ask again, someone here will correct my mistake!

(3) By Trudge on 2021-01-09 21:53:49 in reply to 2 [link] [source]

Much appreciated. I was thinking a similar direction in that I knew I had to somehow combine the 2 queries. I tried your suggestions in various ways but did not get a reasonable answer. Sometimes no result at all.

I think it is the 'group by' that may be in the wrong place.

(4) By Ryan Smith (cuz) on 2021-01-09 22:07:16 in reply to 3 [link] [source]

I double checked it, but am failing to see an obvious error. The Group-by is definitely in the correct place, if you use the grouped query at all - but I may be missing something obvious.

Do you mind uploading/posting your DB somewhere? Feel free to remove any tables and data that is private/sensitive, just enough so we could test the queries easily.

(5) By Trudge on 2021-01-09 22:13:22 in reply to 4 [link] [source]

My bad - I had been using a 'group by' initially but it seemed to mess things up, so I removed it. Along the lines of answering the 2 queries:

sqlite> select id,authorid,docid,title,source from docs order by id desc limit 5;
49|1009|1|Redeeming Charles Babbage's Mechanical Computer|Scientific American 1993-02
48|1008|1|A Partly True Story|Scientific American 1993-02
47|1007|1|Alphabet's Now Building Cities|Wired 2018-01-02
46|1006|1|The Talented Luthier|Scotland Magazine 2021-01
45|1005|1|Sean Connery|Scotland Magazine 2021-01

sqlite> select authorid,docid,keyword from keywords where authorid in (1005,1006,1007,1008,1009) order by id desc; 1009|1|Difference Engine 1009|1|Engineering 1009|1|Computer Technology 1008|1|Fuzzy Logic 1008|1|Mathematics 1007|1|Urban Design 1007|1|Smart City 1006|1|Art 1006|1|Luthier 1006|1|Music - Violin 1005|1|Scottish Ancestry 1005|1|Biography

I can u/l the DB (~20KB) and let you know where it is.

(6) By Trudge on 2021-01-09 22:29:26 in reply to 5 [link] [source]

I couldn't find an email for you so posting the .sql here.

PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE IF NOT EXISTS "docs" (
id integer not null primary key autoincrement,
authorid integer,
docid integer,
title text,
source text,
pubyear integer,
notes text,
author text,
filename text);
INSERT INTO docs VALUES(1,501,1,'Ocean Viruses','Quanta Magazine',2019,NULL,'Lambert,Jonathan','Ocean Viruses.pdf');
INSERT INTO docs VALUES(2,649,1,'20 Things You Didn''t Know About Glaciers','Discover',2020,NULL,'Keats,Jonathan','20 Things You Didn''t Know About Glaciers.pdf');
INSERT INTO docs VALUES(3,650,1,'Crystal Colony','Discover',2020,NULL,'Walter,Jennifer','Crystal Colony.pdf');
INSERT INTO docs VALUES(4,651,1,'Smarter Cities','MIT Technology Review',2020,NULL,'Yang,Lynne Pesko','Smarter Cities.pdf');
INSERT INTO docs VALUES(5,652,1,'Scientist in Toyland','Discover',2020,NULL,'Ornes,Stephen','Scientist in Toyland.pdf');
INSERT INTO docs VALUES(6,502,1,'The Sixth Extinction','Canadian Geographic',2020,NULL,'Anthony,Leslie;Haasdyk,Mary','The Sixth Extinction.pdf');
INSERT INTO docs VALUES(20,653,1,'Engineering A Culture of Privacy','IEEE Consumer Electronics',2020,'','Kosmala,Peter B.','Engineering A Culture of Privacy.pdf');
INSERT INTO docs VALUES(26,654,1,'All Puffed Up','www.admin-magazine.com',2020,NULL,'Hein,Matthias','All Puffed Up.pdf');
INSERT INTO docs VALUES(27,503,1,'Move','New Scientist',2020,NULL,'Williams,Caroline','Move.pdf');
INSERT INTO docs VALUES(28,656,1,'Covid-19''s AI Revolution','New Scientist',2020,NULL,'Ong,Sandy','Covid-19''s AI Revolution.pdf');
INSERT INTO docs VALUES(29,657,1,'DDoS State Of Affairs','IEEE Computer',2020,NULL,'Osterweil,Eric;Stavrou,Angelos;Zhang,Lixia','DDoS State Of Affairs.pdf');
INSERT INTO docs VALUES(30,671,1,'Double-edged Sword','www.admin-magazine.com',2020,NULL,'Gerling,Rainer W.','Double-edged Sword.pdf');
INSERT INTO docs VALUES(31,999,1,'Evolving Evolution','New Scientist',2020,NULL,'Various','Evolving Evolution.pdf');
INSERT INTO docs VALUES(32,660,1,'Feeling Queasy','New Scientist',2020,NULL,'Thompson,Helen','Feeling Queasy.pdf');
INSERT INTO docs VALUES(33,654,2,'Fleet of Foot','www.admin-magazine.com',2020,NULL,'Hein,Matthias','Fleet of Foot.pdf');
INSERT INTO docs VALUES(34,504,1,'Just A Second','New Scientist',2020,NULL,'Nuwer,Rachel','Just A Second.pdf');
INSERT INTO docs VALUES(35,507,1,'New Top in Town','www.admin-magazine.com',2020,NULL,'Layton,Jeff','New Top in Town.pdf');
INSERT INTO docs VALUES(36,672,1,'Pass The Test','www.admin-magazine.com',2020,NULL,'Wubbeling,Matthias','Pass The Test.pdf');
INSERT INTO docs VALUES(37,658,1,'Relationship Status','www.admin-magazine.com',2020,NULL,'Dodlek,Zeljko','Relationship Status.pdf');
INSERT INTO docs VALUES(38,659,1,'Relationships','www.admin-magazine.com',2020,NULL,'Brendel,Jens-Christoph','Relationships.pdf');
INSERT INTO docs VALUES(39,670,1,'The Ultimate Battery','New Scientist',2020,NULL,'Hambling,David','The Ultimate Battery.pdf');
INSERT INTO docs VALUES(40,1000,1,'The Universe According to Emmy Noether','Discover 2017-06',2017,'','Nadis,Steve','The Universe According to Emmy Noether.pdf');
INSERT INTO docs VALUES(41,1001,1,'A Profound Plan To Save the Seas','Science 2020-10',2020,NULL,'Hannibal,Mary Ellen','A Profound Plan.pdf');
INSERT INTO docs VALUES(42,1002,1,'A Technology of Kinetic Art','Scientific American 1993-02',1993,NULL,'Rickey,George','A Technology of Kinetic Art.pdf');
INSERT INTO docs VALUES(43,1003,1,'Context Is Key to Olfaction','Science 2020-07',2020,NULL,'McGann,John P.','Context Is Key to Olfaction.pdf');
INSERT INTO docs VALUES(44,1004,1,'Lizard Man','Science 2020-07',2020,NULL,'Pennisi,Elizabeth','Lizard Man.pdf');
INSERT INTO docs VALUES(45,1005,1,'Sean Connery','Scotland Magazine 2021-01',2021,NULL,'Rowe,Jenny;Adolph,Anthony','Sean Connery.pdf');
INSERT INTO docs VALUES(46,1006,1,'The Talented Luthier','Scotland Magazine 2021-01',2021,NULL,'Gilchrist,Jim','The Talented Luthier.pdf');
INSERT INTO docs VALUES(47,1007,1,'Alphabet''s Now Building Cities','Wired 2018-01-02',2018,NULL,'Wright,Emily','Alphabet Cities.pdf');
INSERT INTO docs VALUES(48,1008,1,'A Partly True Story','Scientific American 1993-02',1993,NULL,'Stewart,Ian','Partly True Story.pdf');
INSERT INTO docs VALUES(49,1009,1,'Redeeming Charles Babbage''s Mechanical Computer','Scientific American 1993-02',1993,NULL,'Swade,Doron D.','Babbage.pdf');
CREATE TABLE authors (
id integer not null primary key autoincrement,
author text,
authorid integer,
docid integer);
INSERT INTO authors VALUES(1,'Lambert,Jonathan',501,1);
INSERT INTO authors VALUES(2,'Keats,Jonathan',649,1);
INSERT INTO authors VALUES(3,'Walter,Jennifer',650,1);
INSERT INTO authors VALUES(4,'Yang,Lynne Pesko',651,1);
INSERT INTO authors VALUES(5,'Ornes,Stephen',652,1);
INSERT INTO authors VALUES(6,'Anthony,Leslie;Haasdyk,Mary',502,1);
INSERT INTO authors VALUES(20,'Kosmala,Peter B.',653,NULL);
INSERT INTO authors VALUES(26,'Hein,Matthias',654,1);
INSERT INTO authors VALUES(28,'Ong,Sandy',656,1);
INSERT INTO authors VALUES(29,'Osterweil,Eric;Stavrou,Angelos;Zhang,Lixia',657,1);
INSERT INTO authors VALUES(30,'Gerling,Rainer W.',671,1);
INSERT INTO authors VALUES(32,'Thompson,Helen',660,1);
INSERT INTO authors VALUES(34,'Nuwer,Rachel',504,1);
INSERT INTO authors VALUES(35,'Layton,Jeff',507,1);
INSERT INTO authors VALUES(36,'Wubbeling,Matthias',672,1);
INSERT INTO authors VALUES(37,'Dodlek,Zeljko',658,1);
INSERT INTO authors VALUES(38,'Brendel,Jens-Christoph',659,1);
INSERT INTO authors VALUES(39,'Hambling,David',670,1);
INSERT INTO authors VALUES(40,'Williams,Caroline',503,1);
INSERT INTO authors VALUES(42,'Various',999,1);
INSERT INTO authors VALUES(43,'Nadis,Steve',1000,NULL);
INSERT INTO authors VALUES(44,'Hannibal,Mary Ellen',1001,1);
INSERT INTO authors VALUES(45,'Rickey,George',1002,1);
INSERT INTO authors VALUES(46,'McGann,John P.',1003,1);
INSERT INTO authors VALUES(47,'Pennisi,Elizabeth',1004,1);
INSERT INTO authors VALUES(48,'Rowe,Jenny;Adolph,Anthony',1005,1);
INSERT INTO authors VALUES(49,'Gilchrist,Jim',1006,1);
INSERT INTO authors VALUES(50,'Wright,Emily',1007,1);
INSERT INTO authors VALUES(51,'Stewart,Ian',1008,1);
INSERT INTO authors VALUES(52,'Swade,Doron D.',1009,1);
CREATE TABLE keywords (
id integer not null primary key autoincrement,
keyword text,
authorid integer,
docid integer);
INSERT INTO keywords VALUES(1,'Marine Virus',501,1);
INSERT INTO keywords VALUES(2,'Oceanography',501,1);
INSERT INTO keywords VALUES(3,'Biodiversity',649,1);
INSERT INTO keywords VALUES(4,'Glaciers',649,1);
INSERT INTO keywords VALUES(5,'Geology',649,1);
INSERT INTO keywords VALUES(6,'Climate',649,1);
INSERT INTO keywords VALUES(7,'Bees',650,1);
INSERT INTO keywords VALUES(8,'Mathematics',650,1);
INSERT INTO keywords VALUES(9,'Computer Networks',651,1);
INSERT INTO keywords VALUES(10,'Urban Development',651,1);
INSERT INTO keywords VALUES(11,'Geometry',652,1);
INSERT INTO keywords VALUES(12,'Architecture',652,1);
INSERT INTO keywords VALUES(13,'Kinematic Sculpture',652,1);
INSERT INTO keywords VALUES(14,'Species Conservation',502,1);
INSERT INTO keywords VALUES(27,'Computer Networks',657,1);
INSERT INTO keywords VALUES(28,'DNS Encryption',671,1);
INSERT INTO keywords VALUES(29,'Network Security',654,1);
INSERT INTO keywords VALUES(30,'Network Privacy',653,1);
INSERT INTO keywords VALUES(31,'Computer Networks',671,1);
INSERT INTO keywords VALUES(32,'Computer Networks',654,1);
INSERT INTO keywords VALUES(33,'Network Security',657,1);
INSERT INTO keywords VALUES(34,'Artificial Intelligence',656,1);
INSERT INTO keywords VALUES(35,'Coronavirus Pandemic',656,1);
INSERT INTO keywords VALUES(36,'Human Vestibular System',503,1);
INSERT INTO keywords VALUES(37,'Human Cognitive Behavior',503,1);
INSERT INTO keywords VALUES(38,'TCP/IP Maintenance',654,1);
INSERT INTO keywords VALUES(39,'TCP/IP Bufferbloat',654,1);
INSERT INTO keywords VALUES(40,'DNS Maintenance',671,1);
INSERT INTO keywords VALUES(41,'Evolution',999,1);
INSERT INTO keywords VALUES(42,'Battery Power',670,1);
INSERT INTO keywords VALUES(43,'Energy Conservation',504,1);
INSERT INTO keywords VALUES(44,'Nuclear Power',670,1);
INSERT INTO keywords VALUES(45,'Regular Expressions',672,1);
INSERT INTO keywords VALUES(46,'Programming',672,1);
INSERT INTO keywords VALUES(47,'Shell Programming',507,1);
INSERT INTO keywords VALUES(48,'Bash Shell',507,1);
INSERT INTO keywords VALUES(49,'Horology',504,1);
INSERT INTO keywords VALUES(50,'Climate',502,1);
INSERT INTO keywords VALUES(51,'Biodiversity',502,1);
INSERT INTO keywords VALUES(52,'Wi-Fi Networking',654,2);
INSERT INTO keywords VALUES(53,'Database Design',658,1);
INSERT INTO keywords VALUES(54,'Graph Database',658,1);
INSERT INTO keywords VALUES(55,'Database Design',659,1);
INSERT INTO keywords VALUES(56,'Graph Database',659,1);
INSERT INTO keywords VALUES(57,'Mathematics',1000,1);
INSERT INTO keywords VALUES(58,'Physics',1000,1);
INSERT INTO keywords VALUES(59,'Marine Conservation',1001,1);
INSERT INTO keywords VALUES(60,'Oceanography',1001,1);
INSERT INTO keywords VALUES(61,'Art',1002,1);
INSERT INTO keywords VALUES(62,'Kinematics',1002,1);
INSERT INTO keywords VALUES(63,'Physics',1002,1);
INSERT INTO keywords VALUES(64,'Olfaction',1003,1);
INSERT INTO keywords VALUES(65,'Neuroscience',1003,1);
INSERT INTO keywords VALUES(66,'Smell',1003,1);
INSERT INTO keywords VALUES(67,'Evolution',1004,1);
INSERT INTO keywords VALUES(68,'Lizards',1004,1);
INSERT INTO keywords VALUES(69,'Sean Connery',1005,1);
INSERT INTO keywords VALUES(70,'Biography',1005,1);
INSERT INTO keywords VALUES(71,'Scottish Ancestry',1005,1);
INSERT INTO keywords VALUES(72,'Music - Violin',1006,1);
INSERT INTO keywords VALUES(73,'Luthier',1006,1);
INSERT INTO keywords VALUES(74,'Art',1006,1);
INSERT INTO keywords VALUES(75,'Human Vestibular System',660,1);
INSERT INTO keywords VALUES(76,'Human Balance',660,1);
INSERT INTO keywords VALUES(77,'Motion Sickness',660,1);
INSERT INTO keywords VALUES(78,'Smart City',1007,1);
INSERT INTO keywords VALUES(79,'Urban Design',1007,1);
INSERT INTO keywords VALUES(80,'Mathematics',1008,1);
INSERT INTO keywords VALUES(81,'Fuzzy Logic',1008,1);
INSERT INTO keywords VALUES(82,'Computer Technology',1009,1);
INSERT INTO keywords VALUES(83,'Engineering',1009,1);
INSERT INTO keywords VALUES(84,'Difference Engine',1009,1);
DELETE FROM sqlite_sequence;
INSERT INTO sqlite_sequence VALUES('docs',49);
INSERT INTO sqlite_sequence VALUES('authors',52);
INSERT INTO sqlite_sequence VALUES('keywords',84);
COMMIT;

(7) By Ryan Smith (cuz) on 2021-01-10 00:58:33 in reply to 6 [source]

Thank you, this helps.

My mistake was actually less of an SQL fault and more a misunderstanding of your original schema. WHat use is "docid" - it seems to just always be 1. (I mistakenly thought it was an FK into "id" of the "document" table).

Anyway, armed with this new information, I had made the changes and it seems to work fine. I include here the entire script in case anyone wishes to reproduce it.

The added queries are at the end - I include both examples of getting the fully expanded list, and also the grouped list with concatenated keywords.

  -- SQLite version 3.30.1  [ Release: 2019-10-10 ]  on  SQLitespeed version 2.1.3.11.
===============================================================================

CREATE TABLE IF NOT EXISTS "docs" (
  id integer not null primary key autoincrement,
  authorid integer,
  docid integer,
  title text,
  source text,
  pubyear integer,
  notes text,
  author text,
  filename text
);

INSERT INTO docs VALUES
(1,501,1,'Ocean Viruses','Quanta Magazine',2019,NULL,'Lambert,Jonathan','Ocean Viruses.pdf'),
(2,649,1,'20 Things You Didn''t Know About Glaciers','Discover',2020,NULL,'Keats,Jonathan','20 Things You Didn''t Know About Glaciers.pdf'),
(3,650,1,'Crystal Colony','Discover',2020,NULL,'Walter,Jennifer','Crystal Colony.pdf'),
(4,651,1,'Smarter Cities','MIT Technology Review',2020,NULL,'Yang,Lynne Pesko','Smarter Cities.pdf'),
(5,652,1,'Scientist in Toyland','Discover',2020,NULL,'Ornes,Stephen','Scientist in Toyland.pdf'),
(6,502,1,'The Sixth Extinction','Canadian Geographic',2020,NULL,'Anthony,Leslie;Haasdyk,Mary','The Sixth Extinction.pdf'),
(20,653,1,'Engineering A Culture of Privacy','IEEE Consumer Electronics',2020,'','Kosmala,Peter B.','Engineering A Culture of Privacy.pdf'),
(26,654,1,'All Puffed Up','www.admin-magazine.com',2020,NULL,'Hein,Matthias','All Puffed Up.pdf'),
(27,503,1,'Move','New Scientist',2020,NULL,'Williams,Caroline','Move.pdf'),
(28,656,1,'Covid-19''s AI Revolution','New Scientist',2020,NULL,'Ong,Sandy','Covid-19''s AI Revolution.pdf'),
(29,657,1,'DDoS State Of Affairs','IEEE Computer',2020,NULL,'Osterweil,Eric;Stavrou,Angelos;Zhang,Lixia','DDoS State Of Affairs.pdf'),
(30,671,1,'Double-edged Sword','www.admin-magazine.com',2020,NULL,'Gerling,Rainer W.','Double-edged Sword.pdf'),
(31,999,1,'Evolving Evolution','New Scientist',2020,NULL,'Various','Evolving Evolution.pdf'),
(32,660,1,'Feeling Queasy','New Scientist',2020,NULL,'Thompson,Helen','Feeling Queasy.pdf'),
(33,654,2,'Fleet of Foot','www.admin-magazine.com',2020,NULL,'Hein,Matthias','Fleet of Foot.pdf'),
(34,504,1,'Just A Second','New Scientist',2020,NULL,'Nuwer,Rachel','Just A Second.pdf'),
(35,507,1,'New Top in Town','www.admin-magazine.com',2020,NULL,'Layton,Jeff','New Top in Town.pdf'),
(36,672,1,'Pass The Test','www.admin-magazine.com',2020,NULL,'Wubbeling,Matthias','Pass The Test.pdf'),
(37,658,1,'Relationship Status','www.admin-magazine.com',2020,NULL,'Dodlek,Zeljko','Relationship Status.pdf'),
(38,659,1,'Relationships','www.admin-magazine.com',2020,NULL,'Brendel,Jens-Christoph','Relationships.pdf'),
(39,670,1,'The Ultimate Battery','New Scientist',2020,NULL,'Hambling,David','The Ultimate Battery.pdf'),
(40,1000,1,'The Universe According to Emmy Noether','Discover 2017-06',2017,'','Nadis,Steve','The Universe According to Emmy Noether.pdf'),
(41,1001,1,'A Profound Plan To Save the Seas','Science 2020-10',2020,NULL,'Hannibal,Mary Ellen','A Profound Plan.pdf'),
(42,1002,1,'A Technology of Kinetic Art','Scientific American 1993-02',1993,NULL,'Rickey,George','A Technology of Kinetic Art.pdf'),
(43,1003,1,'Context Is Key to Olfaction','Science 2020-07',2020,NULL,'McGann,John P.','Context Is Key to Olfaction.pdf'),
(44,1004,1,'Lizard Man','Science 2020-07',2020,NULL,'Pennisi,Elizabeth','Lizard Man.pdf'),
(45,1005,1,'Sean Connery','Scotland Magazine 2021-01',2021,NULL,'Rowe,Jenny;Adolph,Anthony','Sean Connery.pdf'),
(46,1006,1,'The Talented Luthier','Scotland Magazine 2021-01',2021,NULL,'Gilchrist,Jim','The Talented Luthier.pdf'),
(47,1007,1,'Alphabet''s Now Building Cities','Wired 2018-01-02',2018,NULL,'Wright,Emily','Alphabet Cities.pdf'),
(48,1008,1,'A Partly True Story','Scientific American 1993-02',1993,NULL,'Stewart,Ian','Partly True Story.pdf'),
(49,1009,1,'Redeeming Charles Babbage''s Mechanical Computer','Scientific American 1993-02',1993,NULL,'Swade,Doron D.','Babbage.pdf');

CREATE TABLE authors (
  id integer not null primary key autoincrement,
  author text,
  authorid integer,
  docid integer
);

INSERT INTO authors VALUES
(1,'Lambert,Jonathan',501,1),
(2,'Keats,Jonathan',649,1),
(3,'Walter,Jennifer',650,1),
(4,'Yang,Lynne Pesko',651,1),
(5,'Ornes,Stephen',652,1),
(6,'Anthony,Leslie;Haasdyk,Mary',502,1),
(20,'Kosmala,Peter B.',653,NULL),
(26,'Hein,Matthias',654,1),
(28,'Ong,Sandy',656,1),
(29,'Osterweil,Eric;Stavrou,Angelos;Zhang,Lixia',657,1),
(30,'Gerling,Rainer W.',671,1),
(32,'Thompson,Helen',660,1),
(34,'Nuwer,Rachel',504,1),
(35,'Layton,Jeff',507,1),
(36,'Wubbeling,Matthias',672,1),
(37,'Dodlek,Zeljko',658,1),
(38,'Brendel,Jens-Christoph',659,1),
(39,'Hambling,David',670,1),
(40,'Williams,Caroline',503,1),
(42,'Various',999,1),
(43,'Nadis,Steve',1000,NULL),
(44,'Hannibal,Mary Ellen',1001,1),
(45,'Rickey,George',1002,1),
(46,'McGann,John P.',1003,1),
(47,'Pennisi,Elizabeth',1004,1),
(48,'Rowe,Jenny;Adolph,Anthony',1005,1),
(49,'Gilchrist,Jim',1006,1),
(50,'Wright,Emily',1007,1),
(51,'Stewart,Ian',1008,1),
(52,'Swade,Doron D.',1009,1);

CREATE TABLE keywords (
  id integer not null primary key autoincrement,
  keyword text,
  authorid integer,
  docid integer
);

INSERT INTO keywords VALUES
(1,'Marine Virus',501,1),
(2,'Oceanography',501,1),
(3,'Biodiversity',649,1),
(4,'Glaciers',649,1),
(5,'Geology',649,1),
(6,'Climate',649,1),
(7,'Bees',650,1),
(8,'Mathematics',650,1),
(9,'Computer Networks',651,1),
(10,'Urban Development',651,1),
(11,'Geometry',652,1),
(12,'Architecture',652,1),
(13,'Kinematic Sculpture',652,1),
(14,'Species Conservation',502,1),
(27,'Computer Networks',657,1),
(28,'DNS Encryption',671,1),
(29,'Network Security',654,1),
(30,'Network Privacy',653,1),
(31,'Computer Networks',671,1),
(32,'Computer Networks',654,1),
(33,'Network Security',657,1),
(34,'Artificial Intelligence',656,1),
(35,'Coronavirus Pandemic',656,1),
(36,'Human Vestibular System',503,1),
(37,'Human Cognitive Behavior',503,1),
(38,'TCP/IP Maintenance',654,1),
(39,'TCP/IP Bufferbloat',654,1),
(40,'DNS Maintenance',671,1),
(41,'Evolution',999,1),
(42,'Battery Power',670,1),
(43,'Energy Conservation',504,1),
(44,'Nuclear Power',670,1),
(45,'Regular Expressions',672,1),
(46,'Programming',672,1),
(47,'Shell Programming',507,1),
(48,'Bash Shell',507,1),
(49,'Horology',504,1),
(50,'Climate',502,1),
(51,'Biodiversity',502,1),
(52,'Wi-Fi Networking',654,2),
(53,'Database Design',658,1),
(54,'Graph Database',658,1),
(55,'Database Design',659,1),
(56,'Graph Database',659,1),
(57,'Mathematics',1000,1),
(58,'Physics',1000,1),
(59,'Marine Conservation',1001,1),
(60,'Oceanography',1001,1),
(61,'Art',1002,1),
(62,'Kinematics',1002,1),
(63,'Physics',1002,1),
(64,'Olfaction',1003,1),
(65,'Neuroscience',1003,1),
(66,'Smell',1003,1),
(67,'Evolution',1004,1),
(68,'Lizards',1004,1),
(69,'Sean Connery',1005,1),
(70,'Biography',1005,1),
(71,'Scottish Ancestry',1005,1),
(72,'Music - Violin',1006,1),
(73,'Luthier',1006,1),
(74,'Art',1006,1),
(75,'Human Vestibular System',660,1),
(76,'Human Balance',660,1),
(77,'Motion Sickness',660,1),
(78,'Smart City',1007,1),
(79,'Urban Design',1007,1),
(80,'Mathematics',1008,1),
(81,'Fuzzy Logic',1008,1),
(82,'Computer Technology',1009,1),
(83,'Engineering',1009,1),
(84,'Difference Engine',1009,1);

DELETE FROM sqlite_sequence;
INSERT INTO sqlite_sequence VALUES('docs',49);
INSERT INTO sqlite_sequence VALUES('authors',52);
INSERT INTO sqlite_sequence VALUES('keywords',84);


SELECT id,authorid,docid,title,source
  FROM docs
  ORDER BY id DESC
  LIMIT 5
;

  --  id |authorid|docid|title                                            |source                       
  -- ----|--------|-----|-------------------------------------------------|-----------------------------
  --  49 |  1009  |  1  |Redeeming Charles Babbage's Mechanical Computer  |Scientific American 1993-02  
  --  48 |  1008  |  1  |A Partly True Story                              |Scientific American 1993-02  
  --  47 |  1007  |  1  |Alphabet's Now Building Cities                   |Wired 2018-01-02             
  --  46 |  1006  |  1  |The Talented Luthier                             |Scotland Magazine 2021-01    
  --  45 |  1005  |  1  |Sean Connery                                     |Scotland Magazine 2021-01    


SELECT authorid,docid,keyword
  FROM keywords
 WHERE authorid IN (1005,1006,1007,1008,1009)
 ORDER BY id DESC
;

  -- authorid|docid|keyword              
  -- --------|-----|---------------------
  --   1009  |  1  |Difference Engine    
  --   1009  |  1  |Engineering          
  --   1009  |  1  |Computer Technology  
  --   1008  |  1  |Fuzzy Logic          
  --   1008  |  1  |Mathematics          
  --   1007  |  1  |Urban Design         
  --   1007  |  1  |Smart City           
  --   1006  |  1  |Art                  
  --   1006  |  1  |Luthier              
  --   1006  |  1  |Music - Violin       
  --   1005  |  1  |Scottish Ancestry    
  --   1005  |  1  |Biography            
  --   1005  |  1  |Sean Connery         


WITH D AS (
    SELECT id, authorid, docid, title, source
      FROM docs
     ORDER BY id DESC
     LIMIT 5
)
SELECT D.id, D.authorid, D.docid, D.title, A.author, D.source, K.keyword
  FROM D
  JOIN authors AS A ON A.docid = D.docid AND A.authorid = D.authorid
  JOIN keywords AS K ON K.docid = D.docid AND K.authorid = D.authorid
;

  --  id |authorid|docid|title                                            |author                     |source                       |keyword              
  -- ----|--------|-----|-------------------------------------------------|---------------------------|-----------------------------|---------------------
  --  49 |  1009  |  1  |Redeeming Charles Babbage's Mechanical Computer  |Swade,Doron D.             |Scientific American 1993-02  |Computer Technology  
  --  49 |  1009  |  1  |Redeeming Charles Babbage's Mechanical Computer  |Swade,Doron D.             |Scientific American 1993-02  |Difference Engine    
  --  49 |  1009  |  1  |Redeeming Charles Babbage's Mechanical Computer  |Swade,Doron D.             |Scientific American 1993-02  |Engineering          
  --  48 |  1008  |  1  |A Partly True Story                              |Stewart,Ian                |Scientific American 1993-02  |Fuzzy Logic          
  --  48 |  1008  |  1  |A Partly True Story                              |Stewart,Ian                |Scientific American 1993-02  |Mathematics          
  --  47 |  1007  |  1  |Alphabet's Now Building Cities                   |Wright,Emily               |Wired 2018-01-02             |Smart City           
  --  47 |  1007  |  1  |Alphabet's Now Building Cities                   |Wright,Emily               |Wired 2018-01-02             |Urban Design         
  --  46 |  1006  |  1  |The Talented Luthier                             |Gilchrist,Jim              |Scotland Magazine 2021-01    |Art                  
  --  46 |  1006  |  1  |The Talented Luthier                             |Gilchrist,Jim              |Scotland Magazine 2021-01    |Luthier              
  --  46 |  1006  |  1  |The Talented Luthier                             |Gilchrist,Jim              |Scotland Magazine 2021-01    |Music - Violin       
  --  45 |  1005  |  1  |Sean Connery                                     |Rowe,Jenny;Adolph,Anthony  |Scotland Magazine 2021-01    |Biography            
  --  45 |  1005  |  1  |Sean Connery                                     |Rowe,Jenny;Adolph,Anthony  |Scotland Magazine 2021-01    |Scottish Ancestry    
  --  45 |  1005  |  1  |Sean Connery                                     |Rowe,Jenny;Adolph,Anthony  |Scotland Magazine 2021-01    |Sean Connery         


WITH D AS (
    SELECT id, authorid, docid, title, source
      FROM docs
     ORDER BY id DESC
     LIMIT 5
)
SELECT D.id, MAX(D.title) AS Title, MAX(A.author) AS Author, MAX(D.source) AS Source,
       GROUP_CONCAT(K.keyword) AS Keywords
  FROM D
  JOIN authors AS A ON A.docid = D.docid AND A.authorid = D.authorid
  JOIN keywords AS K ON K.docid = D.docid AND K.authorid = D.authorid
 GROUP BY D.id
 ORDER BY D.id DESC
;

  --  id |Title                                            |Author                     |Source                       |Keywords                                           
  -- ----|-------------------------------------------------|---------------------------|-----------------------------|---------------------------------------------------
  --  49 |Redeeming Charles Babbage's Mechanical Computer  |Swade,Doron D.             |Scientific American 1993-02  |Computer Technology,Difference Engine,Engineering  
  --  48 |A Partly True Story                              |Stewart,Ian                |Scientific American 1993-02  |Fuzzy Logic,Mathematics                            
  --  47 |Alphabet's Now Building Cities                   |Wright,Emily               |Wired 2018-01-02             |Smart City,Urban Design                            
  --  46 |The Talented Luthier                             |Gilchrist,Jim              |Scotland Magazine 2021-01    |Art,Luthier,Music - Violin                         
  --  45 |Sean Connery                                     |Rowe,Jenny;Adolph,Anthony  |Scotland Magazine 2021-01    |Biography,Scottish Ancestry,Sean Connery           

  --   Script Stats: Total Script Execution Time:     0d 00h 00m and 00.044s
  --                 Total Script Query Time:         0d 00h 00m and 00.003s
  --                 Total Database Rows Changed:     141
  --                 Total Virtual-Machine Steps:     6145
  --                 Last executed Item Index:        14
  --                 Last Script Error:               
  -- -------------------------------------------------------------------------------

  -- 2021-01-10 02:45:37.922  |  [Success]    Script Success.
  -- 2021-01-10 02:45:37.928  |  [Success]    Transaction Rolled back.
===============================================================================

Hope that clears things up - but feel free to ask if any of it needs further explaining.

Good luck!

(8) By Trudge on 2021-01-10 01:38:27 in reply to 7 [link] [source]

Wow! Thank you so much for your time and efforts. That query worked first time and shows me all the keywords attached to each title. Exactly what I need. Now my Perl script can capture everything and format it the way I want (this is actually a web project I'm in the middle of making).

sqlite> WITH D AS (
   ...>     SELECT id, authorid, docid, title, source
   ...>       FROM docs
   ...>      ORDER BY id DESC
   ...>      LIMIT 5
   ...> )
   ...> SELECT D.id, MAX(D.title) AS Title, MAX(A.author) AS Author, MAX(D.source) AS Source,
   ...>        GROUP_CONCAT(K.keyword) AS Keywords
   ...>   FROM D
   ...>   JOIN authors AS A ON A.docid = D.docid AND A.authorid = D.authorid
   ...>   JOIN keywords AS K ON K.docid = D.docid AND K.authorid = D.authorid
   ...>  GROUP BY D.id
   ...>  ORDER BY D.id DESC
   ...> ;
49|Redeeming Charles Babbage's Mechanical Computer|Swade,Doron D.|Scientific American 1993-02|Computer Technology,Difference Engine,Engineering
48|A Partly True Story|Stewart,Ian|Scientific American 1993-02|Fuzzy Logic,Mathematics
47|Alphabet's Now Building Cities|Wright,Emily|Wired 2018-01-02|Smart City,Urban Design
46|The Talented Luthier|Gilchrist,Jim|Scotland Magazine 2021-01|Art,Luthier,Music - Violin
45|Sean Connery|Rowe,Jenny;Adolph,Anthony|Scotland Magazine 2021-01|Biography,Scottish Ancestry,Sean Connery

The 'docid' is used to identify which document a particular author has written. As you might notice most of the authors only have 1 title, but there happened to be an author with several titles - not an uncommon situation. I've thought of using authorid and docid as a primary key but that is another fire to put out.

Again many thanks for all your work.

(9.1) By Keith Medcalf (kmedcalf) on 2021-01-10 01:43:43 edited from 9.0 in reply to 7 [link] [source]

You could also use a simple correlated subquery if all you wanted was the concat of the applicable keywords:

  select docs.id, 
         docs.title AS Title, 
         authors.author AS Author, 
         docs.source) AS Source,
         (
          select group_concat(keyword)
            from keywords
           where docid == D.docid
             and authorid == D.Authorid
         ) as Keywords
    from docs, authors
   where docs.authorid == authors.authorid
     and docs.docid == author.docid
order by docs.id desc
   limit 5
;

(10) By Trudge on 2021-01-10 02:46:19 in reply to 9.1 [link] [source]

On my system that threw an error:

sqlite> select docs.id, 
   ...>          docs.title AS Title, 
   ...>          authors.author AS Author, 
   ...>          docs.source) AS Source,
   ...>          (
   ...>           select group_concat(keyword)
   ...>             from keywords
   ...>            where docid == D.docid
   ...>              and authorid == D.Authorid
   ...>          ) as Keywords
   ...>     from docs, authors
   ...>    where docs.authorid == authors.authorid
   ...>      and docs.docid == author.docid
   ...> order by docs.id desc
   ...>    limit 5
   ...> ;
Error: near ")": syntax error

(11) By John Dennis (jdennis) on 2021-01-10 03:25:22 in reply to 10 [link] [source]

It looks to me that there's an extraneous right parenthesis on line 4 - I think "docs.source)" should be "docs.source" ??

(12.1) By Keith Medcalf (kmedcalf) on 2021-01-10 04:04:12 edited from 12.0 in reply to 11 [link] [source]

LOL. Quite right. That bracket is a "pasto" --- sort of like a typo, except that a typo comes from typing whereas a pasto comes from pasting :)

Line 13 also contains a typo where the table name authors is missing the "s".

(13) By Trudge on 2021-01-10 04:29:12 in reply to 12.1 [link] [source]

Ah yes. I also had to remove the reference to 'D'.

sqlite> select docs.id, docs.title AS Title, authors.author AS Author, docs.source AS Source,
   ...> (
   ...> select group_concat(keyword)
   ...> from keywords
   ...> where docid = docs.docid
   ...> and authorid = docs.Authorid
   ...> )
   ...> from docs, authors
   ...> where docs.authorid = authors.authorid
   ...> and docs.docid = authors.docid
   ...> order by docs.id desc
   ...> limit 5;
49|Redeeming Charles Babbage's Mechanical Computer|Swade,Doron D.|Scientific American 1993-02|Computer Technology,Engineering,Difference Engine
48|A Partly True Story|Stewart,Ian|Scientific American 1993-02|Mathematics,Fuzzy Logic
47|Alphabet's Now Building Cities|Wright,Emily|Wired 2018-01-02|Smart City,Urban Design
46|The Talented Luthier|Gilchrist,Jim|Scotland Magazine 2021-01|Music - Violin,Luthier,Art
45|Sean Connery|Rowe,Jenny;Adolph,Anthony|Scotland Magazine 2021-01|Sean Connery,Biography,Scottish Ancestry

(14) By Keith Medcalf (kmedcalf) on 2021-01-10 22:58:02 in reply to 13 [link] [source]

Note also that if you want your concat of the keywords to be alphabetical you would use this:

  select docs.id, docs.title AS Title, authors.author AS Author, docs.source AS Source,
         (
          select group_concat(keyword)
            from (
                    select keyword
                      from keywords
                     where docid = docs.docid
                       and authorid = docs.Authorid
                  order by keyword
                 )
         )
    from docs, authors
   where docs.authorid = authors.authorid
     and docs.docid = authors.docid
order by docs.id desc
   limit 5;

I would also point out that usually the relationship between "docs" and "authors" is N:M, as is the relationship between "docs" and "keywords". That is, an "author" can write many "docs" and a "doc" may have many authors. However the "Isaac Asimov" that wrote one "doc" may be the same author as wrote another "doc" -- you current scheme is subject to update anomalies because it requires separate "Isaac Asimov" in the authors table despite there only being one such author. Mutatis Mutandis keywords.

The usual way to do this in a Relational Database is to have a "linkage table" which enumerates the N:M linkages thus:

create table authors
(
  authorid integer primary key,
  name     text collate nocase unique
);
create table keywords
(
   keywordid integer primary key,
   keyword   text collate nocase unique
);
create table docs
(
  docid  integer primary key,
  title  text collate nocase unique
);
create table docauthors
(
   docid    integer not null references docs on delete cascade,
   authorid integer not null references authors on delete cascade,
   unique (docid, authorid),
   unique (authorid, docid)
);
create table dockeywords
(
   docid     integer not null references docs on delete cascade,
   keywordid integer not null references keywords on delete cascade,
   unique (docid, keywordid),
   unique (keywordid, docid)
);

In this way you can have multiple authors per docs, multiple docs per authors, multiple keywords per docs, and multiple docs per keywords.

You can also retrieve multiple keywords per authors or multiple authors per keywords.

Note that you could declare the linkage tables differently to conserve disk space by storing only 2 b-tree's per linkage rather than 3:

create table docauthors
(
   docid    integer not null references docs on delete cascade,
   authorid integer not null references authors on delete cascade,
   primary key (docid, authorid),
   unique (authorid, docid)
) without rowid;
create table dockeywords
(
   docid     integer not null references docs on delete cascade,
   keywordid integer not null references keywords on delete cascade,
   primary key (docid, keywordid),
   unique (keywordid, docid)
) without rowid;

(15) By Trudge on 2021-01-11 15:52:23 in reply to 14 [link] [source]

Appreciate your experience in this matter. It has also been on my radar to 'fix' that situation. The way I handle multiple docs by the same author is to increment the docid for that author. I do have an 'authors' table

CREATE TABLE authors (
id integer not null primary key autoincrement,
author text,
authorid integer,
docid integer);
so essentially the same as your suggestion without the linkage tables. That's what I'm using authorid and docid for. Some typical Perl code:
$sth=$dbh->prepare(qq{select title,docs.authorid,docs.docid,pubyear,source,filename,author,
(
select group_concat(keyword)
from keywords
where keywords.docid = docs.docid
and keywords.authorid = docs.authorid
)
from docs
order by docs.id desc limit 5});
Now that I have this situation solved I can work on 'normalizing' the rest of the DB and any tweaking. Thank you for your experience.

(16) By Keith Medcalf (kmedcalf) on 2021-01-11 16:32:53 in reply to 15 [link] [source]

If you were using the linkages such that docs:keywords and docs:authors are both N:M you could use a query like the following to get one row per docs instance:

  select docs.docid,
         docs.title,
         (
          select group_concat(name, '; ')
            from (
                    select name
                      from docauthors, authors
                     where docauthors.docid == docs.docid
                       and docauthors.authorid == authors.authorid
                  order by name
                 )
         ) as authors,
         (
          select group_concat(keyword, '; ')
            from (
                    select keyword
                      from dockeywords, keywords
                     where dockeywords.docid == docs.docid
                       and dockeywords.keywordid == keywords.keywordid
                  order by keyword
                 )
         ) as keywords
    from docs
order by docid desc
   limit 5
;

with all the applicable authors and keywords as single semi-colon separated lists ...

(17) By Trudge on 2021-01-12 15:08:34 in reply to 16 [link] [source]

So many options! The more I use Sqlite the more I see why it is so popular. Many thanks again.

(18) By Larry Brasfield (LarryBrasfield) on 2021-01-12 16:04:01 in reply to 17 [link] [source]

(Without diminishing Keith's suggestions:) The plethora of solution approaches is more due to the power of specifying set relationships via queries than SQLite's virtue alone. SQLite, among other DBMS's, accepts SQL for specifying the result sets of query operations, and compiles it to create a program targeting its own dedicated virtual machine which is specialized for querying and modifying the connected or attached DB(s). The flexibility you see arises from the flexibility of (and some redundancy in) SQL, combined with skilled use of it.

(19) By Trudge on 2021-01-13 15:21:55 in reply to 18 [link] [source]

"combined with skilled use of it." That's the part I'm missing. Working on it.