Re: Rethinking representation of partial-aggregate steps

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Rethinking representation of partial-aggregate steps
Date: 2016-06-23 17:25:01
Message-ID: 31354.1466702701@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Thu, Jun 23, 2016 at 12:26 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I'm inclined to think that we should aggressively simplify here, perhaps
>> replacing all three of these flags with a three-way enum, say
>>
>> PARTIALAGG_NORMAL /* simple aggregation */
>> PARTIALAGG_PARTIAL /* lower phase of partial aggregation */
>> PARTIALAGG_FINAL /* upper phase of partial aggregation */
>>
>> This embodies a couple of judgments that maybe someone would quibble with:
>> * we don't have any use for intermediate partial aggregation steps;
>> * we don't have any use for partial aggregation in which data transferred
>> up needn't be serialized.

> Hmm, well I guess I would have to disagree with the idea that those
> other modes aren't useful. I seem to recall that David had some
> performance results showing that pushing partial aggregate steps below
> an Append node resulted in a noticeable speed-up even in the absence
> of any parallelism, presumably because it avoids whatever projection
> the Append might do, and also improves icache and dcache locality.

I don't believe that for one second, because introducing another layer of
intermediate aggregation implies another projection step, plus all the
other overhead of a Plan node.

Also, if we are going to support intermediate partial aggregation, the
current representation is broken anyway, since it lacks any ability
to distinguish serialization for the input states from serialization
for the output states, which is something you'd certainly need to
distinguish in this type of plan structure.

> I do agree, however, that the three Boolean flags don't make the code
> entirely easy to read. What I might suggest is that we replace the
> three Boolean flags with integer flags, something like this:

Yeah, that's another way we could go. I had been considering a variant
of that, which was to assign specific code values to the enum constants
and then invent macros that did bit-anding tests on them. That ends up
being just about what you propose except that the compiler understands
the enum-ness of the behavioral alternatives, which seems like a good
thing.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2016-06-23 17:26:37 Re: Hash Indexes
Previous Message Robert Haas 2016-06-23 17:20:46 Re: Bug in to_timestamp().