From: | Jelte Fennema-Nio <postgres(at)jeltef(dot)nl> |
---|---|
To: | "Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp" <Fujii(dot)Yuki(at)df(dot)mitsubishielectric(dot)co(dot)jp> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, vignesh C <vignesh21(at)gmail(dot)com>, Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru> |
Subject: | Re: Partial aggregates pushdown |
Date: | 2024-06-24 09:08:50 |
Message-ID: | CAGECzQSKMO9C8-9Eo-feE3BJTF68EWKHXY4-xf_5BMKm8yAHPw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, 23 Jun 2024 at 10:24, Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp
<Fujii(dot)Yuki(at)df(dot)mitsubishielectric(dot)co(dot)jp> wrote:
> I attached the POC patch(the second one) which supports avg(int8) whose standard format is _numeric type.
Okay, very interesting. So instead of defining the
serialization/deserialization functions to text/binary, you serialize
the internal type to an existing non-internal type, which then in turn
gets serialized to text. In the specific case of avg(int8) this is
done to an array of numeric (with length 2).
> However, I do not agree that I modify the internal transtype to the native data type. The reasons are the following three.
> 1. Generality
> I believe we should develop a method that can theoretically apply to any aggregate function, even if we cannot implement it immediately. However, I find it exceptionally challenging to demonstrate that any internal transtype can be universally converted to a native data type for aggregate functions that are parallel-safe under the current parallel query feature. Specifically, proving this for user-defined aggregate functions presents a significant difficulty in my view.
> On the other hand, I think that the usage of export and import functions can theoretically apply to any aggregate functions.
The only thing required when doing CREATE TYPE is having an INPUT and
OUTPUT function for the type, which (de)serialize the type to text
format. As far as I can tell by definition that requirement should be
fine for any aggregates that we can do partial aggregation pushdown
for. To clarify I'm not suggesting we should change any of the
internal representation of the type for the current internal
aggregates. I'm suggesting we create new native types (i.e. do CREATE
TYPE) for those internal representations and then use the name of that
type instead of internal.
> 2. Amount of codes.
> It could need more codes.
I think it would be about the same as your proposal. Instead of
serializing to an intermediary existing type you serialize to string
straight away. I think it would probably be slightly less code
actually, since you don't have to add code to handle the new
aggpartialexportfn and aggpartialimportfn columns.
> 3. Concern about performance
> I'm concerned that altering the current internal data types could impact performance.
As explained above in my proposal all the aggregation code would
remain unchanged, only new native types will be added. Thus
performance won't be affected, because all aggregation code will be
the same. The only thing that's changed is that the internal type now
has a name and an INPUT and OUTPUT function.
> I know. In my proposal, the standard format is not seriarized data by serialfn, instead, is text or other native data type.
> Just to clarify, I'm writing this to avoid any potential misunderstanding.
Ah alright, that definitely clarifies the proposal. I was looking at
the latest patch file on the thread and that one was still using
serialfn. Your new one indeed doesn't, so this is fine.
> Basically responding to your advice,
> for now, I prepare two POC patches.
Great! I definitely think this makes the review/discussion easier.
> The first supports case a, currently covering only avg(int4) and other aggregate functions that do not require import or export functions, such as min, max, and count.
Not a full review but some initial notes:
1. Why does this patch introduce aggpartialpushdownsafe? I'd have
expected that any type with a non-pseudo/internal type as aggtranstype
would be safe to partially push down.
2. It seems the int4_avg_import function shouldn't be part of this
patch (but maybe of a future one).
3. I think splitting this patch in two pieces would make it even
easier to review: First adding support for the new PARTIAL_AGGREGATE
keyword (adds the new feature). Second, using PARTIAL_AGGREGATE in
postgres_fdw (starts using the new feature). Citus would only need the
first patch not the second one, so I think the PARTIAL_AGGREGATE
feature has merit to be added on its own, even without the
postgres_fdw usage.
4. Related to 3, I think it would be good to have some tests of
PARTIAL_AGGREGATE that don't involve postgres_fdw at all. I also
spotted some comments too that mention FDW, even though they apply to
the "pure" PARTIAL_AGGREGATE code.
5. This comment now seems incorrect:
- * Apply the agg's finalfn if one is provided, else return transValue.
+ * If the agg's finalfn is provided and PARTIAL_AGGREGATE keyword is
+ * not specified, apply the agg's finalfn.
+ * If PARTIAL_AGGREGATE keyword is specified and the transValue type
+ * is internal, apply the agg's serialfn. In this case the agg's
+ * serialfn must not be invalid. Otherwise return transValue.
6. These errors are not on purpose afaict (if they are a comment in
the test would be good to explain why)
+SELECT b, avg(a), max(a), count(*) FROM pagg_tab GROUP BY b ORDER BY 1;
+ERROR: could not connect to server "loopback"
+DETAIL: invalid connection option "partial_aggregate_support"
> The second supports case b and commonly used functions like sum and avg. Currently, it only includes avg(int8).
From | Date | Subject | |
---|---|---|---|
Next Message | Matthias van de Meent | 2024-06-24 09:11:00 | Re: Improve EXPLAIN output for multicolumn B-Tree Index |
Previous Message | Dave Page | 2024-06-24 08:54:51 | Re: Meson far from ready on Windows |