Rethinking representation of partial-aggregate steps

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Rethinking representation of partial-aggregate steps
Date: 2016-06-23 16:26:00
Message-ID: 29309.1466699160@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I really dislike this aspect of the partial-aggregate patch:

typedef struct Agg
{
...
bool combineStates; /* input tuples contain transition states */
bool finalizeAggs; /* should we call the finalfn on agg states? */
bool serialStates; /* should agg states be (de)serialized? */
...
} Agg;

Representing the various partial-aggregate behaviors as three bools seems
like a bad idea to me. In the first place, this makes it look like there
are as many as eight possible behaviors, whereas only a subset of those
flag combinations are or ever will be supported (not that the comments
anywhere tell you that, much less which ones). In the second place,
it leads to unreadable code like this:

/* partial phase */
count_agg_clauses(root, (Node *) partial_grouping_target->exprs,
&agg_partial_costs, false, false, true);

/* final phase */
count_agg_clauses(root, (Node *) target->exprs, &agg_final_costs,
true, true, true);
count_agg_clauses(root, parse->havingQual, &agg_final_costs, true,
true, true);

Good luck telling whether those falses and trues actually do what they
should.

I'm inclined to think that we should aggressively simplify here, perhaps
replacing all three of these flags with a three-way enum, say

PARTIALAGG_NORMAL /* simple aggregation */
PARTIALAGG_PARTIAL /* lower phase of partial aggregation */
PARTIALAGG_FINAL /* upper phase of partial aggregation */

This embodies a couple of judgments that maybe someone would quibble with:
* we don't have any use for intermediate partial aggregation steps;
* we don't have any use for partial aggregation in which data transferred
up needn't be serialized.

But I can't really see how either of those would make sense for any
practical use-case, and I don't think we need to have a confusing and
bug-prone API in order to hold the door open to support them.

Comments?

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David G. Johnston 2016-06-23 16:37:22 Re: Bug in to_timestamp().
Previous Message Alex Ignatov 2016-06-23 16:16:30 Re: Bug in to_timestamp().