From: | Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Aggregate Push Down - Performing aggregation on foreign server |
Date: | 2016-10-24 10:46:02 |
Message-ID: | CAM2+6=Xky1e6T9Wsj10Xv0m4Mmosxcetnn8JWU2HGZrEhJEsjw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, Oct 22, 2016 at 9:09 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> brolga is still not terribly happy with this patch: it's choosing not to
> push down the aggregates in one of the queries. While I failed to
> duplicate that result locally, investigation suggests that brolga's result
> is perfectly sane; in fact it's not very clear why we're not getting that
> from multiple critters, because the plan brolga is choosing is not
> inferior to the expected one.
>
> The core of the problem is this subquery:
>
> contrib_regression=# explain verbose select min(13), avg(ft1.c1),
> sum(ft2.c1) from ft1 right join ft2 on (ft1.c1 = ft2.c1) where ft1.c1 = 12;
>
> QUERY PLAN
> ------------------------------------------------------------
> ------------------------------------------------------------
> ---------------------------------
> Foreign Scan (cost=108.61..108.64 rows=1 width=44)
> Output: (min(13)), (avg(ft1.c1)), (sum(ft2.c1))
> Relations: Aggregate on ((public.ft1) INNER JOIN (public.ft2))
> Remote SQL: SELECT min(13), avg(r1."C 1"), sum(r2."C 1") FROM ("S 1"."T
> 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r2."C 1" = 12)) AND ((r1."C 1" =
> 12))))
> (4 rows)
>
> If you look at the estimate to just fetch the data, it's:
>
> contrib_regression=# explain verbose select ft1.c1, ft2.c1 from ft1 right
> join ft2 on (ft1.c1 = ft2.c1) where ft1.c1 = 12;
> QUERY PLAN
> ------------------------------------------------------------
> --------------------------------------------------------------------------
> Foreign Scan (cost=100.55..108.62 rows=1 width=8)
> Output: ft1.c1, ft2.c1
> Relations: (public.ft1) INNER JOIN (public.ft2)
> Remote SQL: SELECT r1."C 1", r2."C 1" FROM ("S 1"."T 1" r1 INNER JOIN
> "S 1"."T 1" r2 ON (((r2."C 1" = 12)) AND ((r1."C 1" = 12))))
> (4 rows)
>
> Note we're expecting only one row out of the join. Now the cost of doing
> three aggregates on a single row of input is not a lot. Comparing these
> local queries:
>
> regression=# explain select min(13),avg(q1),sum(q2) from int8_tbl where
> q2=456;
> QUERY PLAN
> ---------------------------------------------------------------
> Aggregate (cost=1.07..1.08 rows=1 width=68)
> -> Seq Scan on int8_tbl (cost=0.00..1.06 rows=1 width=16)
> Filter: (q2 = 456)
> (3 rows)
>
> regression=# explain select (q1),(q2) from int8_tbl where q2=456;
> QUERY PLAN
> ---------------------------------------------------------
> Seq Scan on int8_tbl (cost=0.00..1.06 rows=1 width=16)
> Filter: (q2 = 456)
> (2 rows)
>
> we seem to have startup = input cost + .01 and then another .01
> for total. So the estimate to do the above remote scan and then
> aggregate locally should have been 108.63 startup and 108.64 total,
> give or take. The estimate for aggregating remotely is a hair better,
> but it's not nearly better enough to guarantee that the planner won't
> see it as fuzzily the same cost.
>
> In short: the problem with this test case is that it's considering
> aggregation over only a single row, which is a situation in which
> pushing down the aggregate actually doesn't save us anything, because
> we're retrieving one row from the remote either way. So it's not at all
> surprising that we don't get a stable plan choice. The test query needs
> to be adjusted so that the aggregation is done over multiple rows,
> allowing fdw_tuple_cost to kick in and provide some daylight between
> the cost estimates.
>
Attached patch which performs aggrgation over 1000 rows as suggested by Tom.
I believe it will have a stable output/plan now.
Thanks
>
> regards, tom lane
>
--
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Attachment | Content-Type | Size |
---|---|---|
agg_push_down_fix_testcase.patch | binary/octet-stream | 5.4 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Victor Wagner | 2016-10-24 11:03:22 | Re: Patch: Implement failover on libpq connect level. |
Previous Message | Thom Brown | 2016-10-24 10:36:31 | Re: Patch: Implement failover on libpq connect level. |