Re: Properly mark NULL returns in numeric aggregates

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jesse Zhang <sbjesse(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Denis Smirnov <sd(at)arenadata(dot)io>, Soumyadeep Chakraborty <sochakraborty(at)pivotal(dot)io>
Subject: Re: Properly mark NULL returns in numeric aggregates
Date: 2020-04-13 18:13:40
Message-ID: 11524.1586801620@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Jesse Zhang <sbjesse(at)gmail(dot)com> writes:
> On Fri, Apr 10, 2020 at 3:59 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> They can't be strict because the initial iteration needs to produce
>> something from a null state and non-null input. nodeAgg's default
>> behavior won't work for those because nodeAgg doesn't know how to
>> copy a value of type "internal".

> Ah, I think I get it. A copy must happen because the input is likely in
> a shorter-lived memory context than the state, but nodeAgg's default
> behavior of copying a by-value datum won't really copy the object
> pointed to by the pointer wrapped in the datum of "internal" type, so we
> defer to the combine function. Am I right? Then it follows kinda
> naturally that those combine functions have been sloppy on arrival since
> commit 11c8669c0cc .

Yeah, they're relying exactly on the assumption that nodeAgg is not
going to try to copy a value declared "internal", and therefore they
can be loosey-goosey about whether the value pointer is null or not.
However, if you want to claim that that's wrong, you have to explain
why it's okay for some other code to be accessing a value that's
declared "internal". I'd say that the meaning of that is precisely
"keepa u hands off".

In the case at hand, the current situation is that we only expect the
values returned by these combine functions to be read by the associated
final functions, which are on board with the null-pointer representation
of an empty result. Your argument is essentially that it should be
possible to feed the values to the aggregate's associated serialization
function as well. But the core code never does that, so I'm not convinced
that we should add it to the requirements; we'd be unable to test it.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2020-04-13 18:13:46 Re: Parallel copy
Previous Message Kuntal Ghosh 2020-04-13 18:13:34 Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions