Re: select top_countries and for each country, select top_cities in that country, in 1 query

From: John McKown <john(dot)archie(dot)mckown(at)gmail(dot)com>
To: Dorian Hoxha <dorian(dot)hoxha(at)gmail(dot)com>
Cc: PostgreSQL General <pgsql-general(at)postgresql(dot)org>
Subject: Re: select top_countries and for each country, select top_cities in that country, in 1 query
Date: 2014-08-18 15:52:25
Message-ID: CAAJSdjgRdi3Wn2AP6U1hftc1wpq32fY+swckF1P9q6hwu6a1Qg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mon, Aug 18, 2014 at 9:28 AM, Dorian Hoxha <dorian(dot)hoxha(at)gmail(dot)com>
wrote:

> I have CREATE TABLE t (country text, city text);
>
> I want to get with 1 query,
>
> select count(country),country GROUP BY country ORDER BY count(country) DESC
>
>
> And for each country, to get the same for cities.
>
> Is it possible ?
>
> Thanks
>

Just to be sure that I understand what you want. Will you have a given
country, city combination occur multiple times? And, if so, you want to
know how many times such a combination occurs (count city by country)? So
you want a result which has one row for each country, city combination
which lists the city name, country name, count(city within country),
count(total times country occurs,regardless of city). It also depends on
what you mean by "one query". I would hope you mean "sending a single
SELECT command to the database server and getting the result set'. I ask
because you could possibly do something like:

SELECT 'city',count(city),city,country FROM t GROUP BY city, country ORDER
BY 2 DESC
UNION
SELECT 'country',count(country), country, NULL FROM t GROUP BY country
ORDER BY 2 DESC

But I'm thinking that's not what you want.

Note: ORDER BY 2 means to order by the second output column, which is just
a shorter way of saying "count(....)". I'm lazy.

===
Now, assuming that what I think you want really is what you want, I have
some "hairy" SQL for you to try. The SQL query below seemed to work for me
on some test data that I just "faked up" on my own.

SELECT avg(b.countcountry)::int as "CountryCount", b.country, a.city,
count(a.city) as "CityCount"
FROM t AS a
INNER JOIN
(SELECT COUNT(country) AS countcountry, country FROM t GROUP BY country) AS
b
ON a.country = b.country
GROUP BY b.country, a.city
ORDER BY 1 DESC,4 DESC;

You may well wonder about that "avg(b.countrycount)::int" in the top line.
That is _funky_, I will admit. Why does it work? Because I group by
b.country. And for each given value of b.country, b.countrycount is the
same value (think about it). So if you add up "n" copies of a specific
integer number, then divide by "n", you get back the same number. This is
to satisfy the SQL requirement that the column either be in the GROUP BY
(which I don't want) or be in an aggregate funciton. So I picked avg() as
an aggregate function that "just happens" (by design) to have the original
value that I need. I'm sneaky! The ::int at the end casts the result from a
floating point number back to an integer. This, to me, is a nicer display
and is, again, guaranteed to be correct due to the way things are being
calculated. Sorry if I'm being a bit "pushy" on this point. This point is
what could be the most confusing to someone else, so I want to try to
explain my thought process as well as I can.

You can rearrange the columns in the SELECT, but if you do be sure to
change the values in the ORDER BY to still be the same column values.
Again, I'm too lazy to type in the column names.

I am not saying the above is the best possible SQL. But it did work for me
in my testing.

--
There is nothing more pleasant than traveling and meeting new people!
Genghis Khan

Maranatha! <><
John McKown

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Holger.Friedrich-Fa-Trivadis 2014-08-18 15:54:49 FW: logfile character encoding
Previous Message Holger.Friedrich-Fa-Trivadis 2014-08-18 15:50:48 Re: logfile character encoding