Re: Eager aggregation, take 3

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Richard Guo <guofenglinux(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tender Wang <tndrwang(at)gmail(dot)com>, Paul George <p(dot)a(dot)george19(at)gmail(dot)com>, Andy Fan <zhihuifan1213(at)163(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Eager aggregation, take 3
Date: 2025-01-21 16:36:16
Message-ID: CA+TgmoYa-zexdbc5nO_D6oxPMZYs06hkYwZK5Dufq+4Hhe6uNQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jan 21, 2025 at 3:33 AM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
> I've been thinking about this proposal, and it's quite appealing. It
> would significantly reduce both the planning effort and implementation
> complexity, while still yielding reasonable planning results.
>
> One concern I have with this proposal is that, as we climb up higher
> and higher in the join tree, the assumption that a path with smaller
> row count and higher cost is better than one with larger row count and
> lower cost may gradually no longer hold. It's true that a path with a
> smaller row count is generally better for upper join nodes, as it
> feeds fewer rows to upper join nodes. However, as there are fewer and
> fewer upper join nodes left, the efficiency gained from the smaller
> row count could likely no longer justify the high cost of that path
> itself.
>
> Here's an example I found that can help illustrate what I mean.

Thanks for the example. What seems to be happening here is that each
of the three joins increases the number of rows by a multiple of
either 166 or 333. Aggregating reduces the number of rows to 3. I am
not sure that we should be too concerned about this kind of case,
because I don't think it will be common to have multiple joins that
dramatically increase the row count. If you did have that, you would
probably want to aggregate multiple times. We don't have the code for an
IntermediateAggregate or CombineAggregate node right now, I believe,
but in this query it would likely make sense to apply such a step
after every join; then you'd never have more than three rows.
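
Roughly, and assuming each combining step collapses the result back to
the three final groups (a guess on my part, based only on the final row
count in your example):

    3 rows --join (x166 or x333)--> at most ~1000 rows --combine agg--> 3 rows

repeated at each of the three join levels, so no intermediate result
would ever get much beyond a thousand rows.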

Honestly, I'm not sure how much we should worry about a case like
this. I think that if a user is writing queries that use joins to
vastly inflate the row count and then aggregate the result, perhaps
they need to think about rewriting the queries. In this instance, it
feels a bit like the user is emulating multiplication using an
iterated SUM(), which is probably never going to work out all that
well.
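
To illustrate what I mean by that, with a made-up schema rather than
the one in your example: if fact has one row per id and each of those
rows matches exactly 166 rows of multi, then

    SELECT f.id, SUM(f.v)
      FROM fact f
      JOIN multi m ON m.id = f.id
     GROUP BY f.id;

is just computing 166 * f.v for each id; the join exists only to repeat
each value 166 times before the SUM.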

But I bet it's possible to construct an example using only
row-reducing joins. Let's say we start with 10k rows that aggregate to
10 rows; after performing a join, we end up with 9k rows that
aggregate to 9 rows. So if we partially aggregate first, we have to
aggregate 1000 rows that the join would have eliminated, but if we
join first, we have to push 1000 rows through the join only for it to
throw them away. I don't think we can say a priori which of those will
be cheaper, but my idea would make the path that partially aggregates
after the join win unconditionally.
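
Spelled out with the same numbers (and assuming the partial aggregation
collapses its input directly to the final groups):

    aggregate first: aggregate 10,000 rows -> 10 groups, then feed 10 rows into the join
    join first:      join 10,000 rows -> 9,000 rows, then aggregate 9,000 rows -> 9 groups

Neither ordering dominates: one spends extra effort aggregating rows
the join would have discarded, the other spends extra effort joining
rows that produce nothing.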

> Yeah, you're right that the join search process for grouped paths
> basically mirrors what we do for non-grouped paths, which indeed
> involves a lot of planner effort. I've been exploring potential
> heuristics to limit the search space for grouped paths, but so far, I
> haven't found any effective solutions. Currently, the heuristic used
> in the patch is to only consider grouped paths that dramatically
> reduce the number of rows. All others are just discarded. The
> rationale is that if a grouped path does not reduce the number of rows
> enough, it is highly unlikely to result in a competitive final plan
> during the upper planning stages, so it doesn't make much sense to
> consider it. The current threshold is set to 50%, meaning that if the
> number of rows returned by PartialAgg(t1 JOIN t2) is not less than 50%
> of the rows returned by (t1 JOIN t2), no Aggregate paths will be
> generated on top of the t1/t2 join. If we notice significant
> regressions in planning time, we might consider further increasing
> this threshold, say, to 80%, so that only grouped paths that reduce
> the rows by more than 80% will be considered. This heuristic also
> ensures that, once a plan with eager aggregation is chosen, it is
> highly likely to result in performance improvements, due to the
> significant data reduction before joins.

To be honest, I was quite surprised this was a percentage like 50% or
80% and not a multiple like 2 or 5. And I had thought the multiplier
might even be larger, like 10 or more. The thing is, 50% means we only
have to form 2-item groups in order to justify aggregating twice.
Maybe SUM() is cheap enough to justify that treatment, but a more
expensive aggregate might not be, especially things like string_agg()
or array_agg() where aggregation creates bigger objects.
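
For reference, the percentage cutoffs and the equivalent group-size
multiples line up like this (just arithmetic):

    required row reduction    implied average group size
    > 50%                     > 2
    > 80%                     > 5
    > 90%                     > 10

so the current setting admits partial aggregation as soon as groups
average barely more than two rows.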

Another thing to consider is that when the number of groups is small
enough that we don't need to do a Sort+GroupAggregate, it doesn't seem
so bad to perform marginally-useful partial aggregation, but sometimes
that won't be the case. For example, imagine that the user wants to
join orders to order_lines and then compute SUM(order_lines.quantity)
for each orders.customer_id. If the size of the order_lines table is
large relative to work_mem, we're going to need to sort it in order
to partially aggregate, which is expensive. If it turns out that the
orders table is also quite big, then maybe we'll end up performing a
merge join and the same sort order can be used for both operations,
but if not, we could've just done a hash join with orders as the build
table. In that kind of case, partial aggregation has to save quite a
lot to justify itself.
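
Concretely, I'm imagining something like this (the order_id join key is
my assumption; the rest is as described above):

    SELECT o.customer_id, SUM(ol.quantity)
      FROM orders o
      JOIN order_lines ol ON ol.order_id = o.order_id
     GROUP BY o.customer_id;

Here eager aggregation would partially aggregate order_lines by
order_id before the join, and if order_lines doesn't fit in work_mem,
that partial aggregation is what forces the big sort that a plain hash
join plan would never have needed.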

Now, maybe we shouldn't worry about that when applying this heuristic
cutoff; after all, it's the job of the cost model to understand that
sorting is expensive, and this cutoff should just be there to make
sure we don't even try the cost model in cases where it's clearly
unpromising. But I do suspect that in queries where the average group
size is 2, this will often be a marginal technique. In addition to the
problems already mentioned, it could be that the average group size is
2 but a lot of groups are actually of size 1 and then there are some
larger groups. In such cases I'm even less sure that the partial
aggregation technique will be a winner. Building many 1-element groups
sounds inefficient.

--
Robert Haas
EDB: http://www.enterprisedb.com
