Re: Understanding aggregation and window functions.

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Pól Ua Laoínecháin <linehanp(at)tcd(dot)ie>
Cc: pgsql-novice <pgsql-novice(at)lists(dot)postgresql(dot)org>
Subject: Re: Understanding aggregation and window functions.
Date: 2021-01-06 16:25:41
Message-ID: CAKFQuwZE3RANDXgYkJqwdB5iWwDnaGsqQSdhRH2KQYTP3BtsCA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

On Wed, Jan 6, 2021 at 9:08 AM Pól Ua Laoínecháin <linehanp(at)tcd(dot)ie> wrote:

>
> SELECT
> th.theatre_name AS "Theatre name", s.show_name AS "Show name",
> SUM(t.price) AS "Total sales"
>
> Theatre name Show name Sales/show
> Theatre_1 show_10 2050
> Theatre_1 show_11 77
> Theatre_2 show_20 200
> Theatre_2 show_21 63
>
> Again, fine! But what I want is (theatre_1 2127) and (theatre_2, 263).
>
>
Then why are you grouping on "Show name" if the total you want is "Theatre
name"?

> SUM(t.price) OVER (PARTITION BY th.theatre_id ORDER BY th.theatre_id),
>
> and I receive the error:
>
> ERROR: column "t.price" must appear in the GROUP BY clause or be used
> in an aggregate function LINE 8: SUM(t.price) OVER (PARTITION BY
> th.theatre_id ORDER BY th....
>
>
sum(sum(t.price)) over (...)

When using window function in a aggregate query you need to write one
aggregate function calls to appease the group by (the inner sum) and one
aggregate function call to appease the "over" (the outer sum).

Ignoring the rest of the section, re-ask if you still have questions after
getting clarity for the above.

> ** BONUS QUESTION **
>
> I would also like to include the number of tickets sold by price point
> and the number of tickets sold per theatre and per show - this might
> involve a subselect in the SELECT clause?
>
> SELECT COUNT(t.ticket_show_id)
> FROM show s
> JOIN ticket t ON s.show_id = t.ticket_show_id
> GROUP BY s.show_id
>
> Result
>
> count
> 9
> 7
> 3
> 10
>
>
I try to never write a group by query that doesn't include a column for
each grouped field, then any aggregates.

> But when I tried to integrate it into another query as a SELECT in the
> SELECT statement, I received the error:
>
> ERROR: more than one row returned by a subquery used as an expression

>
Which I can understand, but can't figure out how to solve. For each
> row of my main query, the query above should only return 1 row -
> but...
>

Then you probably need to correlate the outer and inner queries by
referencing a value in the outer query inside of the subquery.

with subquery as (select id, val)
select mainquery.id, (select val from subquery where subquery.id =
mainquery.id)
from mainquery

Though, as can be seen from the above, usually a join works better than a
subquery in cases like this.

David J.

In response to

Browse pgsql-novice by date

  From Date Subject
Next Message Stefan Houtzager 2021-01-06 16:32:41 Re: coalesce(json_agg [..] filter where [..], '[]' in left join returning null
Previous Message David G. Johnston 2021-01-06 16:10:30 Re: coalesce(json_agg [..] filter where [..], '[]' in left join returning null