From: | Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com> |
---|---|
To: | Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Wrong cost estimation for foreign tables join with use_remote_estimate disabled |
Date: | 2018-06-22 10:36:27 |
Message-ID: | CAFjFpRcFVSg7_F27SwVWHnckd6iXFVKLO7qnnFK5vzrA1W1k-Q@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Sorry here's preview link [1]
On Fri, Jun 22, 2018 at 4:00 PM, Ashutosh Bapat
<ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
> On Fri, Jun 22, 2018 at 11:56 AM, Konstantin Knizhnik
> <k(dot)knizhnik(at)postgrespro(dot)ru> wrote:
>>
>>
>> On 21.06.2018 20:08, Tom Lane wrote:
>>>
>>> Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru> writes:
>>>>
>>>> The following very simple test reduce the problem with wrong cost
>>>> estimation:
>>>> create foreign table t1_fdw(x integer, y integer) server pg_fdw options
>>>> (table_name 't1', use_remote_estimate 'false');
>>>> create foreign table t2_fdw(x integer) server pg_fdw options (table_name
>>>> 't2', use_remote_estimate 'false');
>>>> It is possible to force Postgres to use correct plan by setting
>>>> "fdw_startup_cost" to some very large value (100000000 for example).
>>>> ...
>>>> Also correct plan is used when use_remote_estimate is true.
>>>
>>> If you are unhappy about the results with use_remote_estimate off, don't
>>> run it that way. The optimizer does not have a crystal ball.
>>
>>
>> As I wrote, use_remote_estimate can not be used because in this case query
>> compilation time is unacceptable (10 seconds, while time of query execution
>> itself is ~200msec).
>> So the problem can be addressed in two ways:
>>
>> 1. Try to reduce time of remote estimation. I wonder why postgres_fdw sends
>> so much queries to remote server. For join of two tables there are 7
>> queries.
>> I suspect that for ~20 joined tables in the original query number of calls
>> is more than hundred, so on wonder that it takes so much time.
>> 2. Try to make optimizer make better estimation of join cost based on local
>> statistic (please notice that ANALYZE is explicitly called for all foreign
>> tables and number of rows in the result was correctly calculated).
>>
>
> I think estimate_path_cost_size() is too pessimistic about how many
> times the join conditions are evaluated (Sorry, I have written that
> code when I was worked on join pushdown for postgres_fdw.)
>
> /* Estimate of number of rows in cross product */
> nrows = fpinfo_i->rows * fpinfo_o->rows;
>
> and somewhere down in the code
> run_cost += nrows * join_cost.per_tuple;
>
> It assumes that the join conditions are run on the cross-product of
> the joining tables. In reality that never happens for large tables. In
> such cases the optimizer will choose either hash or merge join, which
> will apply join conditions only on a small portion of cross-product.
> But the reason it was written that way was the local server can not
> estimate the fraction of cross product on which the join conditions
> will be applied. May be we could assume that the join conditions will
> be applied to only 1% of the cross product, i.e. run_cost +=
> clamp_rows(nrows/100) * join_cost.per_tuple. With this change I think
> the cost of remote plan will be less than local plan.
>
> Here's a preview of blog, I am planning to publish soon, about this
> issue at [1]. It has a bit more details.
>
> [1] https://www.blogger.com/blogger.g?blogID=5253679863234367862#editor/target=post;postID=4019325618679658571;onPublishedMenu=allposts;onClosedMenu=allposts;postNum=0;src=postname
>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
From | Date | Subject | |
---|---|---|---|
Next Message | Konstantin Knizhnik | 2018-06-22 11:09:35 | Re: Wrong cost estimation for foreign tables join with use_remote_estimate disabled |
Previous Message | Ashutosh Bapat | 2018-06-22 10:30:16 | Re: Wrong cost estimation for foreign tables join with use_remote_estimate disabled |