Re: Eager aggregation, take 3

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tender Wang <tndrwang(at)gmail(dot)com>, Paul George <p(dot)a(dot)george19(at)gmail(dot)com>, Andy Fan <zhihuifan1213(at)163(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Eager aggregation, take 3
Date: 2024-09-27 03:53:43
Message-ID: CAMbWs48YGAEsHXR4s2q11QGOgTfBtxr9LjaocxGwY42KE89_HQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 25, 2024 at 11:20 AM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
> Look at the Nested Loop node:
>
> -> Nested Loop (cost=1672.00..1840.60 rows=1000000 width=12)
>
> How can a 10-row outer path joining a 1000-row inner path generate
> 1000000 rows? This is because we are using the plan of the first path
> described above, and the rowcount estimate of the second path. What a
> kluge!
>
> To address this issue, one solution I’m considering is to recalculate
> the row count estimate for a grouped join path using its outer and
> inner paths. While this may seem expensive, it might not be that bad
> since we will cache the results of the selectivity calculation. In
> fact, this is already the approach we take for parameterized join
> paths (see get_parameterized_joinrel_size).

Here is an updated version of this patch that fixes the rowcount
estimate issue using the approach described above (see
set_joinpath_size).
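To illustrate the idea (a simplified toy model in Python with invented
names, not the actual set_joinpath_size code): the grouped join path's
rowcount is recomputed from its own outer and inner paths and the join
clauses' selectivity, mirroring what get_parameterized_joinrel_size
does for parameterized join paths.

```python
def estimate_joinpath_rows(outer_rows, inner_rows, join_selectivity):
    """Recompute a join path's rowcount from its actual input paths
    (hypothetical simplification of the patch's approach)."""
    # The raw estimate is the cross product scaled by the join clauses'
    # selectivity; clamp to at least one row, as the planner does.
    rows = outer_rows * inner_rows * join_selectivity
    return max(1, round(rows))

# With the 10-row outer path, the 1000-row inner path, and an assumed
# join selectivity of 0.1, we get 1000 rows rather than reusing the
# 1000000-row estimate that belongs to a different path.
estimate_joinpath_rows(10, 1000, 0.1)
```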

Now the Nested Loop node looks like:

-> Nested Loop (cost=1672.00..1840.60 rows=1000 width=12)
               (actual time=119.685..122.841 rows=1000 loops=1)

Its rowcount estimate looks much more sane now.

But wait, why are we using nestloop here? My experience suggests that
hashjoin typically outperforms nestloop with input paths of this size
on this type of dataset.

The thing is, the first path (the join-then-aggregate one) of the
t1/t2 grouped join relation has a much smaller rowcount but a higher
cost:

:path.rows 10
:path.disabled_nodes 0
:path.startup_cost 1672
:path.total_cost 1672.1

And the second path (the aggregate-then-join one) has a lower cost but
more rows:

:jpath.path.rows 10000
:jpath.path.disabled_nodes 0
:jpath.path.startup_cost 25.75
:jpath.path.total_cost 156.75

Both paths have survived the add_path() tournament for this relation,
and the second one is selected as the cheapest path by set_cheapest(),
which uses mainly costs, and then pathkeys, as the selection criteria.
The rowcount estimate is not taken into account, which is reasonable
because unparameterized paths for the same relation usually share the
same rowcount estimate. And when creating hashjoins, we consider only
the cheapest input paths. This is why we are unable to generate a
hashjoin on top of the first path.
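To make this concrete (a toy Python model using the costs quoted
above; the dict fields are invented, not the planner's actual data
structures): when selection is driven purely by total cost, the 10-row
path can never become a hashjoin input.

```python
# The two surviving paths for the t1/t2 grouped relation, with the
# rowcounts and total costs shown above.
paths = [
    {"name": "join-then-aggregate", "rows": 10, "total_cost": 1672.1},
    {"name": "aggregate-then-join", "rows": 10000, "total_cost": 156.75},
]

# set_cheapest-style selection: cost decides; rowcounts are ignored,
# so the 10000-row path wins and feeds the hashjoin.
cheapest_total = min(paths, key=lambda p: p["total_cost"])
print(cheapest_total["name"])
```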

However, the situation changes with grouped relations, as different
paths of a grouped relation can have very different rowcounts. To
cope with this, I modified set_cheapest() to also find the fewest-row
unparameterized path if the relation is a grouped relation, and to
include it in the cheapest_parameterized_paths list. It could be
argued that this will increase the overall planning time considerably,
since it adds one more path to cheapest_parameterized_paths. But in
many cases the fewest-row path is the same path as
cheapest_total_path, in which case we do not need to add it again.
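The tweak can be sketched like this (again a toy Python model with
invented names, not the actual set_cheapest() change): for a grouped
relation we also keep the fewest-row unparameterized path, skipping
the duplicate when it coincides with the cheapest one.

```python
def keep_paths(paths, is_grouped):
    """Keep the cheapest-total-cost path, plus the fewest-row path
    for grouped relations (hypothetical sketch of the proposal)."""
    cheapest = min(paths, key=lambda p: p["total_cost"])
    kept = [cheapest]
    if is_grouped:
        fewest_rows = min(paths, key=lambda p: p["rows"])
        if fewest_rows is not cheapest:  # often the same path; add nothing then
            kept.append(fewest_rows)
    return kept

paths = [
    {"name": "join-then-aggregate", "rows": 10, "total_cost": 1672.1},
    {"name": "aggregate-then-join", "rows": 10000, "total_cost": 156.75},
]

# Both paths are now kept for the grouped relation, so a hashjoin can
# be built on top of the 10-row path as well.
kept = keep_paths(paths, is_grouped=True)
```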

And now the plan becomes:

explain (costs on)
select sum(t2.c) from t t1 join t t2 on t1.a = t2.a
  join t t3 on t2.b = t3.b group by t3.a;
                                          QUERY PLAN
---------------------------------------------------------------------------------------------
 Finalize HashAggregate  (cost=1706.97..1707.07 rows=10 width=12)
   Group Key: t3.a
   ->  Hash Join  (cost=1672.22..1701.97 rows=1000 width=12)
         Hash Cond: (t3.b = t2.b)
         ->  Seq Scan on t t3  (cost=0.00..16.00 rows=1000 width=8)
         ->  Hash  (cost=1672.10..1672.10 rows=10 width=12)
               ->  Partial HashAggregate  (cost=1672.00..1672.10 rows=10 width=12)
                     Group Key: t2.b
                     ->  Hash Join  (cost=28.50..1172.00 rows=100000 width=8)
                           Hash Cond: (t1.a = t2.a)
                           ->  Seq Scan on t t1  (cost=0.00..16.00 rows=1000 width=4)
                           ->  Hash  (cost=16.00..16.00 rows=1000 width=12)
                                 ->  Seq Scan on t t2  (cost=0.00..16.00 rows=1000 width=12)
(13 rows)

I believe this is the optimal plan we can find for this query on this
dataset.

I also made some changes to how grouped relations are stored in this
version of the patch.

Thanks
Richard

Attachment Content-Type Size
v12-0001-Implement-Eager-Aggregation.patch application/octet-stream 172.9 KB
