Quick Links

Re: Aggregates push-down to partitions

From:	Maksim Milyutin <milyutinma(at)gmail(dot)com>
To:	Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Aggregates push-down to partitions
Date:	2017-11-09 18:50:04
Message-ID:	73ca1970-d182-0889-4d63-ec369a68d788@gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi Konstantin!

09.11.17 20:14, Konstantin Knizhnik wrote:
> It is still far from ideal plan because each worker is working with
> all partitions, instead of spitting partitions between workers and
> calculate partial aggregates for each partition.
>
> But if we add FDW as a child of parent table, then parallel scan can
> not be used and we get the worst possible plan:
>
> postgres=# create foreign table derived_fdw() inherits(base) server
> pg_fdw options (table_name 'derived1');CREATE FOREIGN TABLE
> postgres=# explain select sum(x) from base;
>                                     QUERY PLAN
> ----------------------------------------------------------------------------------
>
> Aggregate (cost=34055.07..34055.08 rows=1 width=8)
>    -> Append (cost=0.00..29047.75 rows=2002926 width=4)
>          -> Seq Scan on base (cost=0.00..0.00 rows=1 width=4)
>          -> Seq Scan on derived1 (cost=0.00..14425.00 rows=1000000
> width=4)
>          -> Seq Scan on derived2 (cost=0.00..14425.00 rows=1000000
> width=4)
>          -> Foreign Scan on derived_fdw (cost=100.00..197.75
> rows=2925 width=4)
> (6 rows)
>
> So we sequentially pull all data to this node and compute aggregates
> locally.
> Ideal plan will calculate in parallel partial aggregates at all nodes
> and then combine partial results.
> It requires two changes:
> 1. Replace Aggregate->Append with
> Finalize_Aggregate->Append->Partial_Aggregate
> 2. Concurrent execution of Append. It also can be done in two
> different ways: we can try to use existed parallel workers
> infrastructure and
> replace Append with Gather. It seems to be the best approach for local
> partitioning. In case of remote (FDW) partitions, it is enough
> to split starting of execution (PQsendQuery in postgres_fdw) and
> getting results. So it requires some changes in FDW protocol.
>
>
> I wonder if somebody already investigate this problem or working in
> this direction.
> May be there are already some patches proposed?
> I have searched hackers archive, but didn't find something relevant...
> Are there any suggestions about the best approach to implement this
> feature?
>

Maybe in this thread[1] your described problem are solved through
introducing Parallel Append node?

1.
https://www.postgresql.org/message-id/CAJ3gD9dy0K_E8r727heqXoBmWZ83HwLFwdcaSSmBQ1%2BS%2BvRuUQ%40mail.gmail.com

--
Regards,
Maksim Milyutin

In response to

Aggregates push-down to partitions at 2017-11-09 17:14:31 from Konstantin Knizhnik

Responses

Re: Aggregates push-down to partitions at 2017-11-10 09:15:31 from Ashutosh Bapat

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Stephen Frost	2017-11-09 18:52:08	Re: Simplify ACL handling for large objects and removal of superuser() checks
Previous Message	Robert Haas	2017-11-09 18:39:34	Re: Simplify ACL handling for large objects and removal of superuser() checks