Re: Statistical aggregate functions are not working with PARTIAL aggregation

From: Andres Freund <andres(at)anarazel(dot)de>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Statistical aggregate functions are not working with PARTIAL aggregation
Date: 2019-05-20 01:20:21
Message-ID: 20190520012021.c2cz5tezxqda2kbp@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Thanks to all for reporting, helping to identify and finally patch the
problem!

On 2019-05-20 10:36:43 +1200, David Rowley wrote:
> On Mon, 20 May 2019 at 06:36, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
> > > index d01fc4f52e..b061162961 100644
> > > --- a/src/backend/executor/nodeAgg.c
> > > +++ b/src/backend/executor/nodeAgg.c
> > > @@ -2522,8 +2522,9 @@ ExecInitAgg(Agg *node, EState *estate, int eflags)
> > > int existing_aggno;
> > > int existing_transno;
> > > List *same_input_transnos;
> > > - Oid inputTypes[FUNC_MAX_ARGS];
> > > + Oid transFnInputTypes[FUNC_MAX_ARGS];
> > > int numArguments;
> > > + int numTransFnArgs;
> > > int numDirectArgs;
> > > HeapTuple aggTuple;
> > > Form_pg_aggregate aggform;
> > > @@ -2701,14 +2702,23 @@ ExecInitAgg(Agg *node, EState *estate, int eflags)
> > > * could be different from the agg's declared input types, when the
> > > * agg accepts ANY or a polymorphic type.
> > > */
> > > - numArguments = get_aggregate_argtypes(aggref, inputTypes);
> > > + numTransFnArgs = get_aggregate_argtypes(aggref, transFnInputTypes);
> >
> > Not sure I understand the distinction you're trying to make with the
> > variable renaming. The combine function is also a transition function,
> > no?
>
> I was trying to make it more clear what each variable is for. It's
> true that the combine function is used as a transition function in
> this case, but I'd hoped it would be more easy to understand that the
> input arguments listed in a variable named transFnInputTypes would be
> for the function mentioned in pg_aggregate.aggtransfn rather than the
> transfn we're using. If that's not any more clear then maybe another
> fix is better, or we can leave it... I had to make sense of all this
> code last night and I was just having a go at making it easier to
> follow for the next person who has to.

That's what I guessed, but I'm not sure it really achieves that. How
about we have something roughly like:

int numTransFnArgs = -1;
int numCombineFnArgs = -1;
Oid transFnInputTypes[FUNC_MAX_ARGS];
Oid combineFnInputTypes[2];

if (DO_AGGSPLIT_COMBINE(...)
numCombineFnArgs = 1;
combineFnInputTypes = list_make2(aggtranstype, aggtranstype);
else
numTransFnArgs = get_aggregate_argtypes(aggref, transFnInputTypes);

...

if (DO_AGGSPLIT_COMBINE(...))
build_pertrans_for_aggref(pertrans, aggstate, estate,
aggref, combinefn_oid, aggtranstype,
serialfn_oid, deserialfn_oid,
initValue, initValueIsNull,
combineFnInputTypes, numCombineFnArgs);
else
build_pertrans_for_aggref(pertrans, aggstate, estate,
aggref, transfn_oid, aggtranstype,
serialfn_oid, deserialfn_oid,
initValue, initValueIsNull,
transFnInputTypes, numTransFnArgs);

seems like that'd make the code clearer? I wonder if we shouldn't
strive to have *no* DO_AGGSPLIT_COMBINE specific logic in
build_pertrans_for_aggref (except perhaps for an error check or two).

Istm we shouldn't even need a separate build_aggregate_combinefn_expr()
from build_aggregate_transfn_expr().

> > > @@ -2781,7 +2791,7 @@ ExecInitAgg(Agg *node, EState *estate, int eflags)
> > > aggref, transfn_oid, aggtranstype,
> > > serialfn_oid, deserialfn_oid,
> > > initValue, initValueIsNull,
> > > - inputTypes, numArguments);
> > > + transFnInputTypes, numArguments);
> >
> > That means we pass in the wrong input types? Seems like it'd be better
> > to either pass an empty list, or just create the argument list here.
>
> What do you mean "here"? Did you mean to quote this fragment?
>
> @@ -2880,7 +2895,7 @@ build_pertrans_for_aggref(AggStatePerTrans pertrans,
> Oid aggtransfn, Oid aggtranstype,
> Oid aggserialfn, Oid aggdeserialfn,
> Datum initValue, bool initValueIsNull,
> - Oid *inputTypes, int numArguments)
> + Oid *transFnInputTypes, int numArguments)
>
> I had hoped the rename would make it more clear that these are the
> args for the function in pg_aggregate.aggtransfn. We could pass NULL
> instead when it's the combine func, but I didn't really see the
> advantage of it.

The advantage is that if somebody starts to use the the wrong list in
the wrong context, we'd be more likely to get an error than something
that works in the common cases, but not in the more complicated
situations.

> > I'm inclined to push a minimal fix now, and then a slightly more evolved
> > version fo this after beta1.
>
> Ok

Done that now.

> > > EXPLAIN (COSTS OFF)
> > > - SELECT variance(unique1::int4), sum(unique1::int8) FROM tenk1;
> > > + SELECT variance(unique1::int4), sum(unique1::int8),regr_count(unique1::float8, unique1::float8) FROM tenk1;
> > >
> > > -SELECT variance(unique1::int4), sum(unique1::int8) FROM tenk1;
> > > +SELECT variance(unique1::int4), sum(unique1::int8),regr_count(unique1::float8, unique1::float8) FROM tenk1;
> > >
> > > ROLLBACK;
> >
> > Does this actually cover the bug at issue here? The non-combine case
> > wasn't broken?
>
> The EXPLAIN shows the plan is:

Err, comes from only looking at the diff :(. Missed the previous SETs,
and the explain wasn't included in the context either...

Ugh, I just noticed - as you did before - that numInputs is declared at
the top-level in build_pertrans_for_aggref, and then *again* in the
!DO_AGGSPLIT_COMBINE branch. Why, oh, why (yes, I'm aware that that's in
one of my commmits :(). I've renamed your numTransInputs variable to
numTransArgs, as it seems confusing to have different values in
pertrans->numTransInputs and a local numTransInputs variable.

Btw, the zero input case appears to also be affected by this bug: We
quite reasonably don't emit a strict input check expression step for the
combine function when numTransInputs = 0. But the only zero length agg
is count(*), and while it has strict trans & combine functions, it does
have an initval of 0. So I don't think it's reachable with builtin
aggregates, and I can't imagine another zero argument aggregate.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2019-05-20 01:24:36 Re: Move regression.diffs of pg_upgrade test suite
Previous Message Fujii Masao 2019-05-20 01:17:31 Re: vacuumdb and new VACUUM options