Re: PoC: Grouped base relation

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Antonin Houska <ah(at)cybertec(dot)at>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PoC: Grouped base relation
Date: 2017-01-19 03:09:23
Message-ID: CAFjFpReh7pp-0xFNyXXL--ZCh12R5LHkdmk6kVs3DgHas0rmfw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 19, 2017 at 12:02 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Jan 17, 2017 at 11:33 PM, Ashutosh Bapat
> <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
>> I don't think aggcombinefn is missing because we couldn't write it
>> for array_agg() or string_agg(). I guess, it won't be efficient to
>> have those aggregates combined across parallel workers.
>
> I think there are many cases where it would work fine. I assume that
> David just didn't make it a priority to write those functions because
> it wasn't important to the queries he wanted to optimize. But
> somebody can submit a patch for it any time and I bet it will have
> practical use cases. There might be some performance problems shoving
> large numbers of lengthy values through a shm_mq, but we won't know
> that until somebody tries it.
>
>> Also, the point is naming that kind of function as aggtransmultifn
>> would mean that it's always supposed to multiply, which isn't true for
>> all aggregates.
>
> TransValue * integer = newTransValue is well-defined for any
> aggregate. It's the result of aggregating that TransValue with itself
> a number of times defined by the integer. And that might well be
> significantly faster than using aggcombinefn many times. On the other
> hand, how many queries just sit there and re-aggregate the same
> TransValues over and over again? I am having trouble wrapping my head
> around that part of this.

Not all aggregates have "TransValue * integer = newTransValue"
behaviour. For example, array_agg() and string_agg() have "TransValue
concatenated integer times" behaviour, and an aggregate that multiplies
values across rows has "TransValue raised to the power of integer"
behaviour. Labelling all of those as "multi" doesn't look good.
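
To make the distinction concrete, here is a minimal Python sketch (not
PostgreSQL code). The names repeat_state, sum_combine, concat_combine
and product_combine are hypothetical, invented only for illustration.
It models "aggregate a transition value with itself n times" as
repeated application of a combine function, and shows that the closed
form differs per aggregate: multiplication for sum(), repetition for
array_agg(), exponentiation for a product-style aggregate.

```python
from functools import reduce


def repeat_state(combine, state, n):
    """Combine `state` with itself so that it is counted n times (n >= 1)."""
    return reduce(combine, [state] * (n - 1), state)


# Hypothetical combine functions, one per kind of aggregate.
def sum_combine(a, b):
    # sum(): combining two partial sums adds them
    return a + b


def concat_combine(a, b):
    # array_agg(): combining two partial arrays concatenates them
    return a + b


def product_combine(a, b):
    # a "product" aggregate: combining two partial products multiplies them
    return a * b


sum_repeated = repeat_state(sum_combine, 5, 3)            # same as 5 * 3
array_repeated = repeat_state(concat_combine, [1, 2], 3)  # same as [1, 2] repeated 3 times
product_repeated = repeat_state(product_combine, 5, 3)    # same as 5 ** 3
```

Only the first case is a multiplication of the transition value by the
integer; the other two are repetition and exponentiation.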

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
