From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Tomas Vondra <tomas(at)vondra(dot)me> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Partial aggregates pushdown |
Date: | 2024-08-22 18:56:55 |
Message-ID: | ZseJ9xsgWk4SRlMI@momjian.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Aug 21, 2024 at 05:41:02PM +0200, Tomas Vondra wrote:
> On 8/8/24 13:48, Jelte Fennema-Nio wrote:
> > SUMMARY OF THREAD
> >
> > The design of patch 0001 is agreed upon by everyone on the thread (so
> > far). This adds the PARTIAL_AGGREGATE label for aggregates, which will
> > cause the finalfunc not to run. It also starts using PARTIAL_AGGREGATE
> > for pushdown of aggregates in postgres_fdw. In 0001 PARTIAL_AGGREGATE
> > is only supported for aggregates with a non-internal/pseudo type as
> > the stype.
>
> I don't have a strong opinion on this, but I wonder if someone might
> object this essentially extends the syntax with something that is not
> (and never will be) in the SQL standard. I wonder if there's some
> precedent for encoding such explicit execution instructions into the
> query itself?
>
> That reminds me - the PARTIAL_AGGREGATE label is per aggregate, but I
> don't think it makes much sense to do this only for some aggregates,
> right? Do we have a way to make sure the query is "consistent"? I'm not
> sure if doing this on the source (before pushdown) is enough. Could
> there be a difference in what the remote instance supports?
>
> The only alternative that I can think of (and that I believe was already
> mentioned in this thread) is to set some GUC that forces the top-most
> query level to do this (all aggregates at that level). That's have the
> benefit of always affecting all aggregates.
You make a very good point above. Would there ever be cases where a
targetlist would have multiple aggregates, and some can be pushed down,
and some have to return all matching rows so the sender can compute the
aggregate? If so, how would we handle that? How does parallelism
handle that now?
> > The design for patch 0002 is still under debate. This would expand on
> > the functionality added by adding support for PARTIAL_AGGREGATE for
> > aggregates with an internal stype. This is done by returning a byte
> > array containing the bytes that the serialfunc of the aggregate
> > returns.
> >
> > A competing proposal for 0002 is to instead change aggregates to not
> > use an internal stype anymore, and create dedicated types. The main
> > downside here is that infunc and outfunc would need to be added for
> > text serialization, in addition to the binary serialization. An open
> > question is: Can we change the requirements for CREATE TYPE, so that
> > types can be created without infunc and outfunc.
> >
>
> I think it's +0.5 for the new dedicated data types from me.
>
> I admit I'm too lazy to read the whole thread from scratch, but I
> believe we did discuss the possibility to reuse the serial/deserial
> functions we already have, but the reason against that was the missing
> cross-version stability. Parallel queries always run within a single
> instance, hence there are no concerns about other versions. But this is
> meant to support the remote node having a wildly different version.
It is more than different versions. Different CPU architectures can
store data types differently in binary. It would be kind of interesting
to have a serial/deserial that was in text format. I guess that is
where my RECORD idea came from that I just emailed to the list.
What I would really like is something similar to pg_proc.proargtypes
(which is data type oidvector) for pg_aggregate where we can supply the
oids of the pg_aggregate.aggtranstype and construct a record on the fly
to return to the FDW caller, rather than having to create specialized
functions for every aggregate that needs to return several different
data types, e.g., AVG(interval), #4 in my previous email. I realize
this would require the creation of data types int128 and int128 array.
> I guess we might introduce another pair of serial/deserial functions,
> with this guarantee. I know we (me and Jelte) discussed that in person
> at some point, and there were arguments for doing the data types. But I
> managed to forget the details :-(
>
> > WHAT IS NEEDED?
> >
> > The things needed for this patch are that docs need to be added, and
> > detailed codereview needs to be done.
>
> Yeah, I think the docs are must-have for a proper review.
I think the docs are on hold until we decide on a transfer method.
--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com
Only you can decide what is important to you.
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2024-08-22 19:26:41 | Re: optimize hashjoin |
Previous Message | Andrei Lepikhov | 2024-08-22 18:46:01 | Consider the number of columns in the sort cost model |