From: | Maksim Milyutin <milyutinma(at)gmail(dot)com> |
---|---|
To: | Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Aggregates push-down to partitions |
Date: | 2017-11-09 18:50:04 |
Message-ID: | 73ca1970-d182-0889-4d63-ec369a68d788@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Konstantin!
09.11.17 20:14, Konstantin Knizhnik wrote:
> It is still far from ideal plan because each worker is working with
> all partitions, instead of spitting partitions between workers and
> calculate partial aggregates for each partition.
>
> But if we add FDW as a child of parent table, then parallel scan can
> not be used and we get the worst possible plan:
>
> postgres=# create foreign table derived_fdw() inherits(base) server
> pg_fdw options (table_name 'derived1');CREATE FOREIGN TABLE
> postgres=# explain select sum(x) from base;
> QUERY PLAN
> ----------------------------------------------------------------------------------
>
> Aggregate (cost=34055.07..34055.08 rows=1 width=8)
> -> Append (cost=0.00..29047.75 rows=2002926 width=4)
> -> Seq Scan on base (cost=0.00..0.00 rows=1 width=4)
> -> Seq Scan on derived1 (cost=0.00..14425.00 rows=1000000
> width=4)
> -> Seq Scan on derived2 (cost=0.00..14425.00 rows=1000000
> width=4)
> -> Foreign Scan on derived_fdw (cost=100.00..197.75
> rows=2925 width=4)
> (6 rows)
>
> So we sequentially pull all data to this node and compute aggregates
> locally.
> Ideal plan will calculate in parallel partial aggregates at all nodes
> and then combine partial results.
> It requires two changes:
> 1. Replace Aggregate->Append with
> Finalize_Aggregate->Append->Partial_Aggregate
> 2. Concurrent execution of Append. It also can be done in two
> different ways: we can try to use existed parallel workers
> infrastructure and
> replace Append with Gather. It seems to be the best approach for local
> partitioning. In case of remote (FDW) partitions, it is enough
> to split starting of execution (PQsendQuery in postgres_fdw) and
> getting results. So it requires some changes in FDW protocol.
>
>
> I wonder if somebody already investigate this problem or working in
> this direction.
> May be there are already some patches proposed?
> I have searched hackers archive, but didn't find something relevant...
> Are there any suggestions about the best approach to implement this
> feature?
>
Maybe in this thread[1] your described problem are solved through
introducing Parallel Append node?
--
Regards,
Maksim Milyutin
From | Date | Subject | |
---|---|---|---|
Next Message | Stephen Frost | 2017-11-09 18:52:08 | Re: Simplify ACL handling for large objects and removal of superuser() checks |
Previous Message | Robert Haas | 2017-11-09 18:39:34 | Re: Simplify ACL handling for large objects and removal of superuser() checks |