From: | Vik Fearing <vik(at)postgresfriends(dot)org> |
---|---|
To: | Magnus Hagander <magnus(at)hagander(dot)net> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
Subject: | Re: Ordering behavior for aggregates |
Date: | 2022-12-13 13:05:10 |
Message-ID: | abb988eb-6a88-4b63-4703-36f9a7b97133@postgresfriends.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 12/13/22 13:55, Magnus Hagander wrote:
> On Tue, Dec 13, 2022 at 1:51 PM Vik Fearing <vik(at)postgresfriends(dot)org> wrote:
>
>> The standard only defines an ORDER BY clause inside of an aggregate for
>> ARRAY_AGG(). As an extension to the standard, we allow it for all
>> aggregates, which is very convenient for non-standard things like
>> string_agg().
>>
>> However, it is completely useless for things like AVG() or SUM(). If
>> you include it, the aggregate will do the sort even though it is neither
>> required nor desired.
>>
>> I am proposing something like pg_aggregate.aggordering which would be an
>> enum of behaviors such as f=Forbidden, a=Allowed, r=Required. Currently
>> all aggregates would have 'a' but I am thinking that a lot of them could
>> be switched to 'f'. In that case, if a user supplies an ordering, an
>> error is raised.
>>
>
> Should there perhaps also be an option for "ignored" where we'd allow the
> user to specify it, but not actually do the sort because we know it's
> pointless? Or maybe that should be the behaviour of "forbidden", which
> should then perhaps have a different name?
I did think about that but I can't think of any reason we would want to
silently ignore something the user has written. If the ordering doesn't
make sense, we should forbid it.
> My main motivation behind this is to be able to optimize aggregates that
>> could stop early such as ANY_VALUE(), but also to self-optimize queries
>> written in error (or ignorance).
>>
>> There is recurring demand for a first_agg() of some sort, and that one
>> (whether implemented in core or left to extensions) would use 'r' so
>> that an error is raised if the user does not supply an ordering.
>>
>> I have not started working on this because I envision quite a lot of
>> bikeshedding, but this is the approach I am aiming for.
>>
>> Thoughts?
>>
>
> For consistency, should we have a similar flag for DISITNCT? That could be
> interesting to forbid for something like first_agg() wouldn't it? I'm not
> sure what the usecase would be to require it, but maybe there is one?
I thought about that too, but decided it could be a separate patch
because far fewer aggregates would need it.
--
Vik Fearing
From | Date | Subject | |
---|---|---|---|
Next Message | houzj.fnst@fujitsu.com | 2022-12-13 13:07:06 | RE: Perform streaming logical transactions by background workers and parallel apply |
Previous Message | Magnus Hagander | 2022-12-13 12:55:27 | Re: Ordering behavior for aggregates |