Re: BUG #15613: Bug in PG Planner for Foreign Data Wrappers

From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: srinivasan(dot)sa(at)zohocorp(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15613: Bug in PG Planner for Foreign Data Wrappers
Date: 2019-02-06 11:50:51
Message-ID: 5C5ACA1B.3000207@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

(2019/02/06 13:15), Etsuro Fujita wrote:
> (2019/02/01 3:06), Tom Lane wrote:
>> Etsuro Fujita<fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> writes:
>>> (2019/01/31 2:48), Tom Lane wrote:
>>>> So I see two alternatives for fixing this aspect of the problem:
>>>>
>>>> 1. Just change file_fdw and postgres_fdw as above, and hope that
>>>> authors of extension FDWs get the word.
>>>>
>>>> 2. Modify create_foreignscan_path so that it doesn't simply trust
>>>> required_outer to be correct, but forcibly adds rel->lateral_relids
>>>> into it. This would fix the problem without requiring FDWs to be
>>>> on board, but it seems kinda grotty, and it penalizes FDWs that
>>>> have gone to the trouble of computing the right required_outer relids
>>>> in the first place. (It's somewhat amusing that postgres_fdw
>>>> appears to get this right when it generates parameterized paths,
>>>> but not in the base case.)
>>
>>> #2 seems like a good idea, as it would make FDW authors' life easy.
>>
>> I started to fix this, and soon noticed what seems a worse problem:
>> postgres_fdw is using create_foreignscan_path to construct Paths for
>> join relations and upperrels. This is utterly broken. That function
>> was only designed to produce paths for baserels, which is why it uses
>> get_baserel_parampathinfo. At the very least we're getting wrong
>> rowcount estimates for parameterized joinrels (compare
>> get_joinrel_parampathinfo), and it seems possible that we're actually
>> getting wrong plans with the wrong set of movable join clauses being
>> applied. And I have no idea what might go wrong for upperrels, though
>> I think those are never parameterized so it might accidentally not fail.
>
> Ah, you are right. I also noticed that when I proposed parameterized
> foreign joins for postgres_fdw two years ago, but I forgot that.
>
>> We could either split the function into two or three functions, or add
>> still more overhead to it to notice what kind of relation has been
>> passed and adjust its behavior for that. I'm not really thrilled with
>> the latter: the fact that it's called create_foreignSCAN_path means,
>> to me, that it's not supposed to be used for anything but baserel
>> cases.
>
> I don't have any strong opinion on that.

On second thoughts, I think it would be a good idea to split that
function, because we can minimize the parameters list passed to each
function, making it easy to call that function; as you mentioned,
'required_outer' isn't required for upperrels, and 'fdw_outerpath' isn't
required for baserels and upperrels. Not sure we should do that for PG12.

>> I think one big question here is how many external FDWs may have
>> copied postgres_fdw's remote-join handling. If we just have to
>> fix postgres_fdw, my thoughts about what to do are probably
>> different than if we have to try to avoid making third-party callers
>> more broken than they are already.
>
> As far as I know oracle_fdw supports join pushdown the same way as
> postgres_fdw [1], but other than that, I guess there are few if any.

Maybe I'm missing something, but even if we just fix postgres_fdw (and
oracle_fdw), I'm not sure we can fix the foreign-join case fully without
extending the fdw_outerpath infrastructure so that we can create
*parameterized* local join paths.

Best regards,
Etsuro Fujita

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message David Rowley 2019-02-06 12:17:23 Re: BUG #15572: Misleading message reported by "Drop function operation" on DB with functions having same name
Previous Message Amit Langote 2019-02-06 07:35:09 Re: 'update returning *' returns 0 columns instead of empty row with 2 columns when (i) no rows updated and (ii) when applied to a partitioned table with sub-partition