From: | Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com> |
Subject: | Re: Foreign join pushdown vs EvalPlanQual |
Date: | 2015-11-12 09:53:48 |
Message-ID: | 564461AC.6050400@lab.ntt.co.jp |
Lists: | pgsql-hackers |
Robert and Kaigai-san,
Sorry, I sent in an unfinished email.
On 2015/11/12 15:30, Kouhei Kaigai wrote:
>> On 2015/11/12 2:53, Robert Haas wrote:
>>> On Sun, Nov 8, 2015 at 11:13 PM, Etsuro Fujita
>>> <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>>>> To test this change, I think we should update the postgres_fdw patch so as
>>>> to add the RecheckForeignScan.
>>>>
>>>> Having said that, as I said previously, I don't see much value in adding the
>>>> callback routine, to be honest. I know KaiGai-san considers that that would
>>>> be useful for custom joins, but I don't think that that would be useful even
>>>> for foreign joins, because I think that in case of foreign joins, the
>>>> practical implementation of that routine in FDWs would be to create a
>>>> secondary plan and execute that plan by performing ExecProcNode, as my patch
>>>> does [1]. Maybe I'm missing something, though.
>>> I really don't see why you're fighting on this point. Making this a
>>> generic feature will require only a few extra lines of code for FDW
>>> authors. If this were going to cause some great inconvenience for FDW
>>> authors, then I'd agree it isn't worth it. But I see zero evidence
>>> that this is actually the case.
>> Really? I think there would be not a little burden on an FDW author;
>> when postgres_fdw delegates to the subplan to the remote server, for
>> example, it would need to create a remote join query by looking at
>> tuples possibly fetched and stored in estate->es_epqTuple[], send the
>> query and receive the result during the callback routine.
> I cannot understand why it is the only solution.
I didn't say that.
>> Furthermore,
>> what I'm most concerned about is that wouldn't be efficient. So, my
> You have to add "because ..." sentence here because I and Robert
> think a little inefficiency is not a problem.
Sorry, my explanation was insufficient. The reason is that, in the
above postgres_fdw case for example, the overhead of sending the
query to the remote end and transferring the result back to the local
end would not be negligible. Yeah, we might be able to apply special
handling to improve efficiency when using early row locking, but
can we do the same thing otherwise?
> Please don't start the sentence from "I think ...". We all knows
> your opinion, but what I've wanted to see is "the reason why my
> approach is valuable is ...".
I didn't say that my approach is *valuable* either. What I think is
that I see zero evidence that there is a good use-case for an FDW to do
anything other than call ExecProcNode in the callback routine, as I
said below. So I don't see the need to add such a routine, given that
writing it would put a burden on FDW authors that may not be large, but
is not small either.
> Nobody prohibits postgres_fdw performs a secondary join here.
> All you need to do is, picking up a sub-plan tree from FDW's private
> field then call ExecProcNode() inside the callback.
>> As I said before, I know that KaiGai-san considers that
>> that approach would be useful for custom joins. But I see zero evidence
>> that there is a good use-case for an FDW.
>>> From my point of view I'm now
>>> thinking this solution has two parts:
>>>
>>> (1) Let foreign scans have inner and outer subplans. For this
>>> purpose, we only need one, but it's no more work to enable both, so we
>>> may as well. If we had some reason, we could add a list of subplans
>>> of arbitrary length, but there doesn't seem to be an urgent need for
>>> that.
I did the same thing in an earlier version of the patch I posted.
Although I agreed with Robert's comment "The Plan tree and the PlanState
tree should be mirror images of each other; breaking that equivalence
will cause confusion, at least.", I think that that would make the code
much simpler, especially the code for setting chgParam for inner/outer
subplans. But one thing I'm concerned about is enabling both inner and
outer plans, because I think that that would make the planner
postprocessing complicated, depending on what the foreign scans do with
the inner/outer subplans. Is it worth doing so? Maybe I'm missing
something, though.
>>> (2) Add a recheck callback.
>>>
>>> If the foreign data wrapper wants to adopt the solution you're
>>> proposing, the recheck callback can call
>>> ExecProcNode(outerPlanState(node)). I don't think this should end up
>>> being more than a few lines of code, although of course we should
>>> verify that.
Yeah, I think FDWs would probably need to create a subplan accordingly
at planning time, and then initialize/close the plan at execution
time. I think we could facilitate subplan creation by providing helper
functions for that, though.
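To make the shape of such a callback concrete, here is a rough,
self-contained sketch of the "call ExecProcNode on the outer subplan"
pattern discussed above. It is only an illustration under assumptions:
the types below are simplified stand-ins, not the real executor
structures, and mock_local_join and recheck_foreign_scan are
hypothetical names. Returning false when no subplan exists is also an
assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a tuple slot; NOT the real PostgreSQL type. */
typedef struct TupleSlot
{
    bool empty;   /* true when the slot holds no tuple */
    int  value;   /* payload standing in for a joined tuple */
} TupleSlot;

/* Simplified stand-in for a plan state node; NOT the real PostgreSQL type. */
typedef struct PlanState PlanState;
struct PlanState
{
    TupleSlot *(*exec) (PlanState *self);   /* per-node execution routine */
    PlanState  *lefttree;                   /* outer subplan, or NULL */
    TupleSlot   result;                     /* slot returned by this node */
    int         remaining;                  /* tuples the mock join can still return */
};

/* Stand-ins mirroring the names used in the discussion. */
#define outerPlanState(node) ((node)->lefttree)

static TupleSlot *
ExecProcNode(PlanState *node)
{
    return node->exec(node);
}

/* Hypothetical local join: returns one tuple per call until exhausted. */
static TupleSlot *
mock_local_join(PlanState *self)
{
    if (self->remaining > 0)
    {
        self->remaining--;
        self->result.empty = false;
        self->result.value = 42;
    }
    else
        self->result.empty = true;
    return &self->result;
}

/*
 * Hypothetical recheck callback: run the secondary (local) plan and
 * copy its tuple into the caller's slot. Returns false when the
 * secondary plan produces nothing, or when there is no secondary plan.
 */
static bool
recheck_foreign_scan(PlanState *node, TupleSlot *slot)
{
    assert(node != NULL && slot != NULL);

    PlanState *outer = outerPlanState(node);

    if (outer == NULL)
        return false;       /* assumption: nothing to recheck against */

    TupleSlot *res = ExecProcNode(outer);

    if (res->empty)
        return false;       /* join conditions no longer satisfied */

    *slot = *res;           /* hand the rebuilt join tuple back */
    return true;
}
```

In this sketch the callback body itself is only a handful of lines; the
remaining work (building the subplan at planning time and
initializing/closing it at execution time) is not shown.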
Best regards,
Etsuro Fujita