Re: Picking the first of an order in an aggregate query

From: Vincent Veyron <vv(dot)lists(at)wanadoo(dot)fr>
To: Robert James <srobertjames(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Picking the first of an order in an aggregate query
Date: 2013-01-08 17:03:17
Message-ID: 1357664597.2431.40.camel@asus-1001PX.home
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Le lundi 31 décembre 2012 à 20:55 -0500, Robert James a écrit :
> On 12/31/12, François Beausoleil <francois(at)teksol(dot)info> wrote:
> >
> > Le 2012-12-31 à 15:38, Robert James a écrit :
> >
> >> DISTINCT is a very simple solution!
> >> But I have one problem: In addition to the FIRST fields, I also do
> >> want some aggregate functions. More accurately, it would be:
> >>
> >> SELECT grouping_field, FIRST(field_a), FIRST(field_b), SUM(field_x),
> >> MAX(field_y)
> >> ...
> >>
> >> How should I do that? Should I do two queries with a join on the
> >> grouping field? Or is there a more direct way?
> >
> > WINDOW functions can help you:
> >
> > SELECT
> > grouping_field
> > , first_value(field_a) OVER (ORDER BY ...)
> > , first_value(field_b) OVER (ORDER BY ...)
> > , sum(field_x) OVER ()
> > , max(field_y) OVER ()
> > FROM ...
> >
> > The empty OVER clauses will make the sum / max work over the full result
> > set, and not a subset. I really recommend reading the window functions
> > section on the site.
> >
>
> I see. Will the optimizer know enough to not repeat the work for each
> first_value I do? Or am I better off using a JOIN of some sort?
>
>

You probably can check with explain analyze; if not, a CTE (common table
expression) might help; something like :

with t1 as (
select grouping_field, sum(field_x) as sum_x, max(field_y) as max_y
FROM ...
group by grouping_field
)
SELECT grouping_field,
first_value(field_a) OVER (Partition by grouping_field ORDER BY ...),
first_value(field_b) OVER (Partition by grouping_field ORDER BY ...),
t1.sum_x,
t1.max_y
FROM ... INNER JOIN t1 using (grouping_field)

'Partition by grouping_field' may or may not be necessary in your case,
depending on what you want; see :

http://www.postgresql.org/docs/current/static/tutorial-window.html

--
Vincent Veyron
http://marica.fr
Logiciel pour département juridique

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Stefan Keller 2013-01-08 17:24:46 Re: [postgis-users] Query with LIMIT but as random result set?
Previous Message James Cowell 2013-01-08 17:00:50 Re: [Solved] Corrupt indexes on slave when using pg_bulkload on master