From: | Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org |
Cc: | Atri Sharma <atri(dot)jiit(at)gmail(dot)com> |
Subject: | Re: Aggregate function API versus grouping sets |
Date: | 2014-07-02 17:52:04 |
Message-ID: | 87egy3itnx.fsf@news-spur.riddles.org.uk |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>> Doing rollup via GroupAggregate by maintaining multiple transition
>> values at a time (one per grouping set) means that the transfn is
>> being called interleaved for transition values in different
>> contexts. So the question becomes: is it wrong for the transition
>> function to assume that aggcontext can be cached this way, or is
>> it necessary for the executor to use a separate flinfo for each
>> concurrent grouping set?
Tom> Hm. We could probably move gcontext into the per-group data.
Tom> I'm not sure though if there are any other dependencies there on
Tom> the groups being evaluated serially. More generally, I wonder
Tom> whether user-defined aggregates are likely to be making
Tom> assumptions that will be broken by this.
The thing is that almost everything _except_ aggcontext and
AggGetPerAggEContext that a transfn might want to hang off fn_extra
really is going to be constant across all groups.
The big question, as you say, is whether this is going to be an issue
for existing user-defined aggregates.
>> Since it seems that the cleanup callback is the sole reason for
>> this function to exist, is it acceptable to remove any implication
>> that the context returned is the overall per-output-tuple one, and
>> simply state that it is a context whose cleanup functions are
>> called at the appropriate time before aggcontext is reset?
Tom> Well, I think the implication is that it's the econtext in which
Tom> the result of the aggregation will be used.
In the rollup case, though, it does not seem reasonable to have
multiple result-tuple econtexts (that would significantly complicate
the projection of rows, the handling of rescans, etc.).
Tom> Another approach would be to remove AggGetPerAggEContext as such
Tom> from the API altogether, and instead offer an interface that
Tom> says "register an aggregate cleanup callback", leaving it to the
Tom> agg/window core code to figure out which context to hang that
Tom> on. I had thought that exposing this econtext and the
Tom> per-input-tuple one would provide useful generality, but maybe
Tom> we should rethink that.
Works for me.
--
Andrew (irc:RhodiumToad)
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Janes | 2014-07-02 17:52:53 | Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ] |
Previous Message | Robert Haas | 2014-07-02 17:08:55 | Re: /proc/self/oom_adj is deprecated in newer Linux kernels |