Re: Idea: PostgreSQL equivalent to Oracle's KEEP clause

From: Alban Hertroys <haramrae(at)gmail(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ben Clements <benhasgonewalking(at)gmail(dot)com>, pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Idea: PostgreSQL equivalent to Oracle's KEEP clause
Date: 2023-03-14 08:01:28
Message-ID: 69BD603E-0CAF-40AD-AA83-D2484F126313@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


> On 7 Mar 2023, at 4:11, David G. Johnston <david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
>
> On Mon, Mar 6, 2023 at 7:51 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> On Tue, 7 Mar 2023 at 12:40, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >
> > Ben Clements <benhasgonewalking(at)gmail(dot)com> writes:
> > > As shown above, the following calculated column can bring in the city name,
> > > even though the city name isn't in the GROUP BY:
> > > max(city) keep (dense_rank first order by population desc)
> >
> > You haven't really explained what this does, let alone why it can't
> > be implemented with existing features such as FILTER and ORDER BY.
>
> (It wasn't clear to me until I watched the youtube video.)
>
> Likely KEEP is more flexible than just the given example but I think
> that something similar to the example given could be done by inventing
> a TOP() and BOTTOM() aggregate. Then you could write something like:
>
> select
> country,
> count(*),
> max(population),
> bottom(city, population)
> from
> cities
> group by
> country
> having
> count(*) > 1
>
> the transfn for bottom() would need to remember the city and the
> population for the highest yet seen value of the 2nd arg.
>
> BOTTOM() remembers the highest value?
>
> Where this wouldn't work would be if multiple columns were
> required to tiebreak the sort.
>
> TOP(city, ROW(population, land_area)) ?

What should be the expected behaviour on a tie though?

Say that we count the number of districts or airfields or train stations per city and query for the one(s) with the most or least of them? There could well be multiple cities with the same max number, and there will be many cities with the same minimum number (namely 0).

Should the result be just the first of the maximums (or minimums) through some selection criterium (such as their alphabetical order), should that give each of the tied results, or should there be a means to define that behaviour?

I suppose a combination with FIRST and LAST could solve that issue?

Regards,
Alban Hertroys
--
There is always an exception to always.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Thorsten Glaser 2023-03-14 09:27:12 Re: Uppercase version of ß desired
Previous Message Dominique Devienne 2023-03-14 07:49:19 Re: Seq Scan because of stats or because of cast?