Re: Are multiple array_aggs going to be in the same order?

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Guyren Howe <guyren(at)gmail(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Are multiple array_aggs going to be in the same order?
Date: 2017-04-10 18:28:38
Message-ID: CAKFQuwah_nXYHuMHXgB=o4WECVFgOjzzP-q8OuXPT144dCVGPQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mon, Apr 10, 2017 at 11:02 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:

> On Mon, Apr 10, 2017 at 12:01 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> >> On Sun, Apr 9, 2017 at 4:27 PM, Guyren Howe <guyren(at)gmail(dot)com> wrote:
> >>> If I do a SELECT with a GROUP_BY and multiple ARRAY_AGGs, will the
> ARRAY_AGGs be guaranteed to have entries in the same (ie corresponding)
> order?
> >>>
> >>> eg
> >>>
> >>> SELECT
> >>> u.name,
> >>> ARRAY_AGG(o.order_date) AS order_dates,
> >>> ARRAY_AGG(o.order_total) AS order_totals
> >>> FROM
> >>> user u JOIN
> >>> orders o USING (user_id)
> >>> GROUP BY
> >>> u.user_id
> >
> >> It is unsafe to rely on aggregation order unless specified -- you can
> >> add ORDER BY to the aggregation clause.
> >
> > You definitely can't assume anything about the order in which the FROM
> > clause will deliver rows, but I think that's not quite what the question
> > was. If I read it right, the OP wants to be sure that the two aggregate
> > functions will see the data in the *same* unspecified order. I think
> > that's a pretty safe assumption. The server would have to go way
> > out of its way to do differently, and it doesn't.
>
> Sure, but isn't it fair to consider that an implementation artifact?
> If his code depends on that ordering being the same across aggregate
> functions, and the SQL standard doesn't specify that (I guess it
> might, but I'm skeptical), he ought to specify that for clarify at the
> very least.
>

So, the presence of ORDER BY in the aggregate function call is a PostgreSQL
extension...

It seems reasonable to declare that the order of the values in the
generated array match whatever order the FROM clause supplies the rows. If
that is not acceptable a PostgreSQL-specific ORDER BY modifier can be added
which will cause an additional sort-and-scan of the input relation to occur
(optimized across multiple column invocations when possible). Thus two
aggregate functions w/o an ORDER BY will always see the source rows in the
same order.

SELECT array_agg(v)
FROM (
SELECT * FROM (VALUES (3),(1),(2)) vals (v) ORDER BY 1
) AS src

The only real confusion is whether a query like the above is guaranteed to
supply rows to the outer select target list in order. If that is indeed
the case then the overall behavior seems quite reasonable to explicitly
define like above.

Dave

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2017-04-10 18:35:22 Re: Are multiple array_aggs going to be in the same order?
Previous Message Merlin Moncure 2017-04-10 18:02:23 Re: Are multiple array_aggs going to be in the same order?