Re: Partial aggregates pushdown

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>
Cc: "Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp" <Fujii(dot)Yuki(at)df(dot)mitsubishielectric(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, vignesh C <vignesh21(at)gmail(dot)com>, Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Subject: Re: Partial aggregates pushdown
Date: 2024-08-15 21:12:50
Message-ID: Zr5vUoXA4sCyZjsX@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 8, 2024 at 01:48:49PM +0200, Jelte Fennema-Nio wrote:
> SUMMARY OF THREAD
>
> The design of patch 0001 is agreed upon by everyone on the thread (so
> far). This adds the PARTIAL_AGGREGATE label for aggregates, which will
> cause the finalfunc not to run. It also starts using PARTIAL_AGGREGATE
> for pushdown of aggregates in postgres_fdw. In 0001 PARTIAL_AGGREGATE
> is only supported for aggregates with a non-internal/pseudo type as
> the stype.
>
> The design for patch 0002 is still under debate. This would expand on
> the functionality added by adding support for PARTIAL_AGGREGATE for
> aggregates with an internal stype. This is done by returning a byte
> array containing the bytes that the serialfunc of the aggregate
> returns.
>
> A competing proposal for 0002 is to instead change aggregates to not
> use an internal stype anymore, and create dedicated types. The main
> downside here is that infunc and outfunc would need to be added for
> text serialization, in addition to the binary serialization. An open
> question is: Can we change the requirements for CREATE TYPE, so that
> types can be created without infunc and outfunc.
>
> WHAT IS NEEDED?
>
> The things needed for this patch are that docs need to be added, and
> detailed codereview needs to be done.
>
> Feedback from more people on the two competing proposals for 0002
> would be very helpful in making a decision.

First, I am sorry to be replying so late --- I have been traveling for
the past four weeks. Second, I consider this feature a big part of
sharding, and I think sharding is Postgres's biggest missing feature. I
talk about this patch often when asked about what Postgres is working
on.

Third, I would like to show a more specific example to clarify what is
being considered above. If we look at MAX(), we can have FDWs return
the max for each FDW, and the coordinator can chose the highest value.
This is the patch 1 listed above. These can return the
pg_aggregate.aggtranstype data type using the pg_type.typoutput text
output.

The second case is for something like AVG(), which must return the SUM()
and COUNT(), and we currently have no way to return multiple text values
on the wire. For patch 0002, we have the option of creating functions
that can do this and record them in new pg_attribute columns, or we can
create a data type with these functions, and assign the data type to
pg_aggregate.aggtranstype.

Is that accurate?

--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com

Only you can decide what is important to you.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2024-08-15 21:13:29 Re: Restart pg_usleep when interrupted
Previous Message Peter Geoghegan 2024-08-15 21:10:01 Re: Showing primitive index scan count in EXPLAIN ANALYZE (for skip scan and SAOP scans)