From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
Cc: | David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Document aggregate functions better w.r.t. ORDER BY |
Date: | 2023-10-26 21:56:44 |
Message-ID: | ZTrgnM4GGNM9SniX@momjian.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Oct 25, 2023 at 10:34:10PM -0700, David G. Johnston wrote:
> I would reword the existing note to be something like:
>
> The SQL Standard defines specific aggregates and their properties, including
> which of DISTINCT and/or ORDER BY is allowed. Due to the extensible nature of
> PostgreSQL it accepts either or both clauses for any aggregate.
Uh, is this something in my patch or somewhere else? I don't think
PostgreSQL extensible is an example of syntax flexibility.
> From the most recent patch:
>
> <para>
> - If <literal>DISTINCT</literal> is specified in addition to an
> - <replaceable>order_by_clause</replaceable>, then all the <literal>ORDER BY
> </literal>
> - expressions must match regular arguments of the aggregate; that is,
> - you cannot sort on an expression that is not included in the
> - <literal>DISTINCT</literal> list.
> + If <literal>DISTINCT</literal> is specified with an
> + <replaceable>order_by_clause</replaceable>, <literal>ORDER
> + BY</literal> expressions can only reference columns in the
> + <literal>DISTINCT</literal> list. For example:
> +<programlisting>
> +WITH vals (v1, v2) AS ( VALUES (1,'Z'),(3,'D'),(4,'R'),(3,'A'),(2,'T') )
> +SELECT array_agg(DISTINCT v2 ORDER BY v2 DESC) FROM vals;
> + array_agg
> +-------------
> + {Z,T,R,D,A}
> +</programlisting>
>
> The change to a two-column vals was mostly to try and find corner-cases that
> might need to be addressed. If we don't intend to show the error case of
> DISTINCT v1 ORDER BY v2 then we should go back to the original example and just
> add ORDER BY v DESC. I'm fine with not using string_agg here.
>
> + For example:
> +<programlisting>
> +WITH vals (v) AS ( VALUES (1),(3),(4),(3),(2) )
> +SELECT array_agg(DISTINCT v ORDER BY v DESC) FROM vals;
> + array_agg
> +-----------
> + {4,3,2,1}
> +</programlisting>
Okay, good, switched in the attached patch.
> We get enough complaints regarding "apparent ordering" that I would like to
> add:
>
> As a reminder, while some DISTINCT processing algorithms produce sorted output
> as a side-effect, only by specifying ORDER BY is the output order guaranteed.
Well, we need to create a new email thread for this and look at all the
areas is applies to since this is a much larger issue.
--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com
Only you can decide what is important to you.
Attachment | Content-Type | Size |
---|---|---|
agg_order.diff | text/x-diff | 5.9 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Smith | 2023-10-26 21:57:57 | Re: [PoC] pg_upgrade: allow to upgrade publisher node |
Previous Message | David Rowley | 2023-10-26 21:53:29 | Re: Making aggregate deserialization (and WAL receive) functions slightly faster |