Re: Properly mark NULL returns in numeric aggregates

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jesse Zhang <sbjesse(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Denis Smirnov <sd(at)arenadata(dot)io>, Soumyadeep Chakraborty <sochakraborty(at)pivotal(dot)io>
Subject: Re: Properly mark NULL returns in numeric aggregates
Date: 2020-04-15 03:48:36
Message-ID: CAApHDvpfiiqfZehmsuSz1Z74gkxac+kv8PY=NT33PQYgfkmzKQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 15 Apr 2020 at 03:41, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> > For testing, can't we just have an Assert() in
> > advance_transition_function that verifies isnull matches the
> > nullability of the return value for INTERNAL returning transfns? i.e,
> > the attached
>
> FTR, I do not like this Assert one bit. nodeAgg.c has NO business
> inquiring into the contents of internal-type Datums. It has even
> less business enforcing a particular Datum value for a SQL null ---
> we have always, throughout the system, considered that if isnull
> is true then the contents of the Datum are unspecified. I think
> this is much more likely to cause problems than solve any.

OK. the latter case could be ignored by adding an OR condition to the
Assert to allow isnull == false cases to pass without any
consideration to the Datum value, but it sounds like you don't want to
insist that isnull == true returns NULL a pointer.

FWIW, I agree with Jesse that having numeric_combine() return a NULL
pointer without properly setting the isnull flag is pretty bad and it
should be fixed regardless. Not fixing it, even in the absence of
having a good way to test it just seems like we're leaving something
around that we're going to trip up on in the future. Serialization
functions crashing after receiving input from a combine function seems
pretty busted to me, regardless if there is a pathway for the
functions to be called in that order in core or not. I'm not a fan of
leaving it in just because testing for it might not be easy. One
problem with coming up with a way of testing from an SQL level will be
that we'll need to pick some aggregate functions that currently have
this issue and ensure they don't regress. There's not much we can do
to ensure any new aggregates we might create the future don't go and
break this rule. That's why I thought that the Assert might be more
useful.

I don't think it would be impossible to test this using an extension
and using the create_upper_paths_hook. I see that test_rls_hooks
which runs during make check-world does hook into the RLS hooks do
test some behaviour. I don't think it would be too tricky to have a
hook implement a 3-stage aggregate plan with the middle stage doing a
deserial/combine/serial before passing to the Finalize Aggregate node.
That would allow us to ensure serial functions can accept the results
from combine functions, to which nothing in core currently can do.

David

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2020-04-15 03:49:11 Re: documenting the backup manifest file format
Previous Message Amit Kapila 2020-04-15 03:42:37 Re: Vacuum o/p with (full 1, parallel 0) option throwing an error