Re: postgres_fdw bug in 9.6

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: postgres_fdw bug in 9.6
Date: 2016-12-28 06:54:01
Message-ID: CAFjFpRf2hmbuUa5q1Yo3A9jA3L9ezSCA=vN+ZPGWRp9+Za2R5A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Dec 28, 2016 at 9:20 AM, Etsuro Fujita
<fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> On 2016/12/27 22:03, Ashutosh Bapat wrote:
>>
>> If mergejoin_allowed is true and mergeclauselist is non-NIL but
>> hashclauselist is NIL (that's rare but there can be types has merge
>> operators but not hash operators), we will end up returning NULL. I
>> think we want to create a merge join in that case. I think the order
>> of conditions should be 1. check hashclause_list then create hash join
>> 2. else check if merge allowed, create merge join. It looks like that
>> would cover all the cases, if there aren't any hash clauses, and also
>> no merge clauses, we won't be able to implement a FULL join, so it
>> will get rejected during path creation itself.
>
>
> Right, maybe we can do that by doing similar things as in match_unsort_outer
> and/or sort_inner_and_outer. But as you mentioned, the case is rare, so the
> problem would be whether it's worth complicating the code (and if it's
> worth, whether we should do that at the first version of the function).

All I am requesting is changing the order of conditions. That doesn't
complicate the code.

>
>>>> I
>>>> am
>>>> wondering whether the easy and possibly correct solution here is to not
>>>> replace
>>>> a ForeignPath with fdw_outerpath in GetExistingLocalJoinPath()? If we
>>>> don't do
>>>> that, there won't be error building merge join plan and
>>>> postgresRecheckForeignScan() would correctly route the EPQ checks to the
>>>> local
>>>> plan available as outer plan.
>
>
>>> That might be fine for PG9.6, but I'd like to propose replacing
>>> GetExistingLocalJoinPath with CreateLocalJoinPath for PG10, because (1)
>>> GetExistingLocalJoinPath might choose an overkill, merge or hash join
>>> path
>>> for INNER/LEFT/SEMI/ANTI, not a nestloop join path, which might cause an
>>> overhead at EPQ rechecks, and
>
>
>> The reason we chose to pick up an existing path was that the
>> discussion in thread [1] concluded the efficiency of the local plan
>> wasn't a concern for EPQ. Are we now saying something otherwise?
>
>
> No, I won't. Usually, the overhead would be negligible, but in some cases
> where there are many concurrent updates, the overhead might not be
> negligible due to many EPQ rechecks. So it would be better to have an
> efficient local plan.
>

All that the EPQ rechecks do is apply the join and other quals again
on the base relation rows. Will choice of plan affect the efficiency?

>>> (2) choosing a local join path randomly from
>>> the rel's pathlist wouldn't be easy to understand.
>
>
>> Easy to understand for whom? Can you please elaborate?
>
>
> Users. I think the ease of understanding for users is important.

I doubt users care much about whether an existing path is returned or
a new one created as long as they get one to stuff in fdw_outerpath.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2016-12-28 06:54:06 Re: Parallel Index-only scan
Previous Message Rafia Sabih 2016-12-28 06:48:48 Re: Parallel Index-only scan