SQLite Forum

Is this query correctly formulated and giving right output?
Login

Is this query correctly formulated and giving right output?

(1) By Techcosec Limited (DigitalSplendid) on 2022-05-17 12:21:51 [link] [source]

My problem statement is to have a table of customers based on the popularity in terms of country. So the country with the most invoicelineid will rank first. I have formed this query but am not sure if I had created it the right way and gotten the right output.

WITH countrycounter AS (
SELECT Invoice.BillingCountry AS countryofbilling, COUNT(Invoice.BillingCountry) AS county FROM Customer JOIN Invoice ON Customer.CustomerId = Invoice.CustomerId JOIN InvoiceLine ON Invoice.InvoiceId = InvoiceLine.InvoiceId GROUP BY countryofbilling ORDER BY countryofbilling ) SELECT countrycounter.countryofbilling, countrycounter.county, dense_rank() OVER ( ORDER BY county DESC) AS rank FROM countrycounter

https://www.canva.com/design/DAFA9CAIUTQ/32tD8JLxYdJ6zTJ9xOSnAw/view?utm_content=DAFA9CAIUTQ&utm_campaign=designshare&utm_medium=link2&utm_source=sharebutton

(2) By Harald Hanche-Olsen (hanche) on 2022-05-17 13:08:34 in reply to 1 [link] [source]

Here, I reformatted your question so people can actually read it. It's a precondition for responding. (I made the link clickable, too.) You might wish to learn a little bit of markdown, just to format code blocks.

My problem statement is to have a table of customers based on the popularity in terms of country. So the country with the most invoicelineid will rank first. I have formed this query but am not sure if I had created it the right way and gotten the right output.

WITH countrycounter AS 
(   
      SELECT Invoice.BillingCountry AS countryofbilling, COUNT(Invoice.BillingCountry) AS county
   FROM
      Customer 
      JOIN
         Invoice 
         ON Customer.CustomerId = Invoice.CustomerId 
      JOIN
         InvoiceLine 
         ON Invoice.InvoiceId = InvoiceLine.InvoiceId 
		 GROUP BY countryofbilling
   ORDER BY
      countryofbilling
)
SELECT
   countrycounter.countryofbilling,
   countrycounter.county,
   dense_rank() OVER (
ORDER BY
   county DESC) AS rank 
FROM
   countrycounter

https://www.canva.com/design/DAFA9CAIUTQ/32tD8JLxYdJ6zTJ9xOSnAw/view

(3) By Techcosec Limited (DigitalSplendid) on 2022-05-17 13:14:58 in reply to 2 [link] [source]

Thanks for formatting. Any tutorial link to know how to correctly use Markdown?

(5) By Warren Young (wyoung) on 2022-05-17 15:24:20 in reply to 3 [link] [source]

When you reply here, you'll see a "Markup style" link. Click it, then click the "Markdown Formatting Rules" link, landing you here.

That page gives a brief summary only. Web searches will pull up broader tutorials, but beware that there are many flavors of Markdown. Fossil's dialect tends to follow the CommonMark spec, but it isn't perfectly-compliant.

That's why the forum forces you to go through Preview before submitting a post. Even long-time Fossil users like me sometimes have to edit a post a few times to get it to format as desired.

(6) By Harald Hanche-Olsen (hanche) on 2022-05-17 15:49:32 in reply to 3 [link] [source]

As Warren said.

Really, the most important bit to learn for the sake of posting in this forum, is code formatting using single or triple backquotes.

For icing on the cake, learn about quoting, boldface, and italics. You can safely ignore the rest.

(7) By Warren Young (wyoung) on 2022-05-17 16:21:28 in reply to 6 [link] [source]

You can safely ignore the rest.

Oh, surely not!
box "Oh," fit radius 0.2
arrow 0.2
box "surely" fit
arrow 0.2
box "not!" fit radius 0.2

(Click it.)

(8) By Adrian Ho (lexfiend) on 2022-05-18 04:52:08 in reply to 7 [link] [source]

(4) By Ryan Smith (cuz) on 2022-05-17 13:40:54 in reply to 1 [source]

Now that the question is legible (thanks Harald), I can say that on first glance-over, your query seems fine.

Is there some reason you are doubting the outcome? Does it perhaps not stroke with reality in some way?

It should be an easy check to get the count for a specific country, and see if it matches the query outcome. Usually in queries this simple (meaning devoid of many joins and inter-dependencies), when they break, they break spectacularly and noticeably. Still, it's a good idea to double-check the outcome.

At this point we can't help much - we just have to assume that what you put in the query is what you meant to put in the query and if it runs it works as intended. If you tell us THAT it is broken, and HOW/WHY you believe it is broken, then we can surely help.

(9) By Techcosec Limited (DigitalSplendid) on 2022-05-18 07:11:58 in reply to 4 [link] [source]

One thing that is not clear is that the invoice associated with invoice.billingcountry will not able to track no. of InvoiceLineId.

Suppose 4 InvoiceLineId from country A made it to an Invoice (say Invoice no. 1). This means 4 tracks are sold from country A.

Say 5 InvoiceLineId from country B made it to an Invoice (say Invoice No. 2). This means 5 tracks sold from country B.

To my understanding, COUNT(Invoice.BillingCountry) is just taking into account an invoice no. (or invoice id). So both Invoice no.1 and Invoice no. 2 are given a value of 1 or counted as 1.

Then the objective of counting track ids before ranking is not met.

(10) By Ryan Smith (cuz) on 2022-05-18 15:21:29 in reply to 9 [link] [source]

It's not immediately clear to me from your description what the misconception is, so I thought I would go over the workings of GROUP BY and COUNT() to make sure there is exact knowledge how it processes data - perhaps then either the misconception goes away, or we can make me understand the question better.

GROUP BY groups rows by a given value that must be unique for the items in the group. All other row-values that may or may not have different values, can be reduced to a single value by one or more "aggregate" functions.

To demonstrate - imagine I have 7 rows of Olympic medal info in a table like this:

CREATE TABLE medals(Country TEXT, Event TEXT, Class TEXT, Place INT, Medal TEXT);

INSERT INTO medals(Country, Event, Class, Place, Medal) VALUES
 ('USA',   '100m Hurdles', 'Female', 1, 'Gold')
,('USA',   '100m Hurdles', 'Male',   2, 'Silver')
,('USA',   '100m Dash',    'Female', 2, 'Silver')
,('UK',    '100m Dash',    'Female', 1, 'Gold')
,('UK',    '100m Hurdles', 'Male',   3, 'Bronze')
,('Sweden','100m Hurdles', 'Male',   1, 'Gold')
;

Now imagine all I wanted to know, is which countries are in the list - it requires no aggregate functions, just a single grouping:

SELECT Country FROM medals GROUP BY Country;
  -- Country 
  -- --------
  -- Sweden  
  -- UK      
  -- USA     

The natural next question is, how many medals did each of those countries win? This calls for an aggregate to count all the rows in each group:

SELECT Country, COUNT(*) FROM medals GROUP BY Country;
  -- Country |  COUNT(*)  
  -- --------|------------
  -- Sweden  |      1     
  -- UK      |      2     
  -- USA     |      3     

Perhaps I'm interested to ask the question another way, such as which events did each country get medals for? Another aggregate query helps with that:

SELECT Country, GROUP_CONCAT(Class||'-'||Event) AS 'Events in Top 3' FROM medals GROUP BY Country;
  -- Country |Events in Top 3                                         
  -- --------|--------------------------------------------------------
  -- Sweden  |Male-100m Hurdles                                       
  -- UK      |Female-100m Dash,Male-100m Hurdles                      
  -- USA     |Female-100m Hurdles,Male-100m Hurdles,Female-100m Dash  

-- or, what is the avergae Place achieved by country...
SELECT Country, AVG(Place) FROM medals GROUP BY Country;
  -- Country |          AVG(Place)
  -- --------|--------------------
  -- Sweden  |                 1.0
  -- UK      |                 2.0
  -- USA     |    1.66666666666667

-- or, which countries achieved medals in every major event?
SELECT Event, GROUP_CONCAT(Country) AS 'Countries with medals' FROM medals GROUP BY Event;
  -- Event         |Countries with medals                
  -- --------------|-------------------
  -- 100m Dash     |USA,UK             
  -- 100m Hurdles  |USA,USA,UK,Sweden  

It can get quite complex from here, but the base premise remains: Any aggregate function added to a GROUP BY query operates on ALL rows that get lumped together (grouped).

i.e. if in your query you simply did:

SELECT Invoice.BillingCountry AS countryofbilling, COUNT(*) AS county

you should be catching ALL the invoice lines - which I imagine is what you wanted. For the COUNT() aggregate function you can further specify which field to count - but unless you have NULLs in that field, all rows are to be counted anyway. For example, consider one of the above queries where we count "Class", of which there are only 2 - Male/Female, but the results are unaltered:

SELECT Country, COUNT(Class) FROM medals GROUP BY Country;
  -- Country |COUNT(Class)
  -- --------|------------
  -- Sweden  |      1     
  -- UK      |      2     
  -- USA     |      3     

You only get different counts if "Class" had some NULL values, in which case those rows would NOT be counted in the case of the above query. I don't see that to be a factor in your query.

Not sure if this cleared anything up, but from your question I'm guessing that it should. - or I am grossly misunderstanding the question.

Please let us know if this makes sense, or ask again if not clear.

(11) By Techcosec Limited (DigitalSplendid) on 2022-05-19 05:09:03 in reply to 10 [link] [source]

Thanks so much for explaining so elaborately that clears so many aspects.