From: | Florian Pflug <fgp(at)phlo(dot)org> |
---|---|
To: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> |
Cc: | David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PATCH] Negative Transition Aggregate Functions (WIP) |
Date: | 2014-01-28 20:16:23 |
Message-ID: | 62144D8A-FB0D-420B-ABDC-C478C6A4DA79@phlo.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Jan27, 2014, at 23:28 , Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> strict transfn vs non-strict inv_transfn
> ----------------------------------------
>
> This case is explicitly forbidden, both in CREATE AGGREGATE and in the
> executor. To me, that seems overly restrictive --- if transfn is
> strict, then you know for sure that no NULL values are added to the
> aggregate state, so surely it doesn't matter whether or not
> inv_transfn is strict. It will never be asked to remove NULL values.
>
> I think there are definite use-cases where a user might want to use a
> pre-existing function as the inverse transition function, so it seems
> harsh to force them to wrap it in a strict function in this case.
>
> AFAICS, the code in advance/retreat_windowaggregate should just work
> if those checks are removed.
True. It didn't use to in earlier version of the patch because the advance
and retreat functions looked at the strict settings directly, but now that
there's an ignore_nulls flag in the executor node, the only requirement
should be that ignore_nulls is set if either of the transition functions
is strict.
I'm not sure that the likelihood of someone wanting to combine a strict
forward with a non-strict inverse function is very hight, but neither
> non-strict transfn vs strict inv_transfn
> ----------------------------------------
>
> At first this seems as though it must be an error --- the forwards
> transition function allows NULL values into the aggregate state, and
> the inverse transition function is strict, so it cannot remove them.
>
> But actually what the patch is doing in this case is treating the
> forwards transition function as partially strict --- it won't be
> passed NULL values (at least in a window context), but it may be
> passed a NULL state in order to build the initial state when the first
> non-NULL value arrives.
Exactly.
> It looks like this behaviour is intended to support aggregates that
> ignore NULL values, but cannot actually be made to have a strict
> transition function. I think typically this is because the aggregate's
> initial state is NULL and it's state type differs from the type of the
> values being aggregated, and so can't be automatically created from
> the first non-NULL value.
Yes. I added this because the alternative would haven been to count
non-NULL values in most of the existing SUM() aggregates.
> That all seems quite ugly though, because now you have a transition
> function that is not strict, which is passed NULL values when used in
> a normal aggregate context, and not passed NULL values when used in a
> window context (whether sliding or not), except for the NULL state for
> the first non-NULL value.
Ugh, true. Clearly, nodeAgg.c needs to follow what nodeWindowAgg.c does
and skip NULL inputs if the aggregate has a strict inverse transition
function. That fact that we never actually *call* the inverse doesn't
matter. Will fix that once we decided on the various strictness issues.
> I'm not sure if there is a better way to do it though. If we disallow
> this case, these aggregates would have to use non-strict functions for
> both the forward and inverse transition functions, which means they
> would have to implement their own non-NULL value counting.
> Alternatively, allowing strict transition functions for these
> aggregates would require that we provide some other way to initialise
> the state from the first non-NULL input, such as a new initfn.
I'm not sure an initfn would really help. It would allow us to return
to the initial requirement that the strict settings match - but you
already deem the check that the forward transition function can only
be strict if the inverse is also overly harsh, so would that really be
an improvement? It's also cost us an additional pg_proc entry per aggregate.
Another idea would be to have an explicit nulls=ignore|process option
for CREATE AGGREGATE. If nulls=process and either of the transition
functions are strict, we could either error out, or simply do what
normal functions calls do and pretend they return NULL for NULL inputs.
Not sure how the rule that forward transition functions may not return
NULL if there's an inverse transition function would fit in if we do
the latter, though.
The question is - is it worth it the effort to add that flag?
best regards,
Florian Pflug
From | Date | Subject | |
---|---|---|---|
Next Message | Stephen Frost | 2014-01-28 20:17:22 | Re: proposal: hide application_name from other users |
Previous Message | Stephen Frost | 2014-01-28 20:15:26 | Re: proposal: hide application_name from other users |