Re: first_value/last_value

From: Alexey Bashtanov <bashtanov(at)imap(dot)cc>
To: Tom Smith <tomsmith1989sk(at)gmail(dot)com>, Melvin Davidson <melvin6925(at)gmail(dot)com>
Cc: Adam Brusselback <adambrusselback(at)gmail(dot)com>, PostgreSQL General <pgsql-general(at)postgresql(dot)org>
Subject: Re: first_value/last_value
Date: 2016-05-19 11:10:49
Message-ID: 573D9F39.4040608@imap.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello Tom,

I agree such functions are very useful, as they allow you to use
ordinary aggregation functions such as sum/max/avg
along with first/last ones (traditionally served by DISTINCT ON or
LIMIT) in the same group-by node
which improves performance and readability.

The first/last extension you mentioned
http://pgxn.org/dist/first_last_agg/ has a slight disadvantage:
it relies on postgres providing it sorted data if user asks for defined
ordering.
This makes first/last functions not supporting hashAgg, only groupAgg
which usually requires explicit sort.

If you find first_last_agg performance poor take a look at argm
extension http://pgxn.org/dist/argm/ .
It provides similar functionality but works faster as it does not sort
or make postgres core sort,
but only chooses the first row within each group.

As for the original question unfortunately the way from an extension to
postgres core is not too easy
and normally requires an extension to become popular and to be included
in postgres distribution as a contrib module first.

Regards,
Alexey Bashtanov

On 19/05/16 04:04, Tom Smith wrote:
> It would really save all the troubles for many people if postgresql
> has a built-in first/last function along with sum/avg.
> There is already a C extension and a wiki sample and implemented for
> window function.
> I am curious why these two functions were not added along their window
> implementation counter part,
> for completness and consistency
>
>
> On Wed, May 18, 2016 at 10:42 PM, Melvin Davidson
> <melvin6925(at)gmail(dot)com <mailto:melvin6925(at)gmail(dot)com>> wrote:
>
>
>
> On Wed, May 18, 2016 at 10:36 PM, Adam Brusselback
> <adambrusselback(at)gmail(dot)com <mailto:adambrusselback(at)gmail(dot)com>> wrote:
>
> Here is an example that works in a single query. Since you
> have two different orders you want the data back in, you need
> to use subqueries to get the proper data back, but it works,
> and is very fast.
>
> CREATE TEMPORARY TABLE foo AS
> SELECT generate_series as bar
> FROM generate_series(1, 1000000);
>
> CREATE INDEX idx_foo_bar ON foo (bar);
>
>
> SELECT *
> FROM (
> SELECT bar
> FROM foo
> ORDER BY bar asc
> LIMIT 1
> ) x
> UNION ALL
> SELECT *
> FROM (
> SELECT bar
> FROM foo
> ORDER BY bar desc
> LIMIT 1
> ) y;
>
> DROP TABLE foo;
>
>
> Seems to me SELECT min(<column>), max(<column>) FROM deja.vu ;
> would also work.
>
>
> --
> *Melvin Davidson*
> I reserve the right to fantasize. Whether or not you
> wish to share my fantasy is entirely up to you.
>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Leonardo M. Ramé 2016-05-19 11:43:00 Re: Londiste 3 pgq events_1_1 table huge
Previous Message Matija Lesar 2016-05-19 08:23:44 Re: first_value/last_value