From: | Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: postgres_fdw bug in 9.6 |
Date: | 2016-12-15 09:40:03 |
Message-ID: | b06b113e-73a1-fa5e-91b6-28bda2551a08@lab.ntt.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2016/12/13 23:13, Ashutosh Bapat wrote:
> On Tue, Dec 13, 2016 at 4:15 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I believe there are probably more problems here, or at least if there
>> aren't, it's not clear why not. Because of GetExistingLocalJoinPath's
>> lack of curiosity about what's underneath the join pathnode it picks,
>> it seems to me that it's possible for it to return a path tree that
>> *isn't* all local joins. If we're looking at, say, a hash path for
>> a 4-way join, whose left input is a hash path for a 3-way join, whose
>> left input is a 2-way foreign join, what's stopping that from being
>> returned as a "local" path tree?
> Right, the so called "local join tree" can contain ForeignPath
> representing joins between foreign relations. AFAIU what protects us
> from getting into problem is that postgresRecheckForeignScan calls
> ExecProcNode() on the outer plan. So, even if there are ForeignPaths
> in fdw_outerpath tree, corresponding plans would use a local join plan
> to apply EPQ recheck. The code in GetExistingLocalJoinPath() avoids
> the redirection at least for the inner or outer ForeignPaths.
I think Ashutosh is right.
> Any
> comment claiming that path tree under fdw_outerpath is entirely
> "local" should be removed, if it survives the fix.
+1
>> Likewise, it seems like the code is trying to reject any custom-path join
>> types, or at least this barely-intelligible comment seems to imply that:
>>
>> * Just skip anything else. We don't know if corresponding
>> * plan would build the output row from whole-row references
>> * of base relations and execute the EPQ checks.
>>
>> But this coding fails to notice any such join type that's below the
>> level of the immediate two join inputs.
I wrote the first version of GetExistingLocalJoinPath (and
postgresRecheckForeignScan), to verify that the changes to core for
pushing down foreign/custom joins in 9.5 would work well for EPQ
rechecks. And I rejected custom paths in that version because I
intended to use that function not only for foreign joins but custom
joins. (IIRC, for 9.6, I proposed to put GetExistingLocalJoinPath in a
more common area such as src/backend/optimizer/path/joinpath.c, but that
was ignored...)
>> I kind of wonder why this infrastructure exists at all; it's not the way
>> I'd have foreseen handling EPQ for remote joins. However, while "throw
>> it away and start again" might be good advice going forward, I suppose
>> it won't be very popular for applying to 9.6.
>>
>> One way that we could make things better is to rely on the knowledge
>> that EPQ isn't asked to evaluate joins for more than one row per input
>> relation, and therefore using merge or hash join technology is really
>> overkill. We could make a tree of simple nestloop joins, which aren't
>> going to care about sort order, if we could identify the correct join
>> clauses to apply. At least some of that could be laid off on the FDW,
>> which if it's gotten this far at all, ought to know what join clauses
>> need to be enforced by the foreign join. So I'm thinking a little bit
>> in terms of "just collect the foreign scans for all the base rels
>> included in the join and then build a cross-product nestloop join tree,
>> applying all the join clauses at the top". This would have the signal
>> value that it's guaranteed to succeed and so can be left for later,
>> rather than having to expensively redo it at each level of join planning.
> I am not able to understand how this strategy of applying all join
> clauses on top of cross product would work if there are OUTER joins
> involved. Wouldn't nullable sides change the result?
I'm not able to understand that, either.
>> (Hm, that does sound a lot like "throw it away and start again", doesn't
>> it. But what we've got here is busted enough that I'm not sure there
>> are good alternatives. Maybe for 9.6 we could just rewrite
>> GetExistingLocalJoinPath, and limp along doing a lot of redundant
>> computation during planning.)
An alternative I was thinking was (1) to create an equivalent nestloop
join path for each foreign/custom join path and store it in
fdw_outerpath as-is, except when that join path implements a full join,
in which case a merge or hash join path is created, and (2) to apply
postgresRecheckForeignScan to the outer subplan created from the
fdw_outerpath when executing EPQ rechecks. So, I was thinking to
provide a helper function that creates the equivalent local join path
for a given foreign/custom join path in #1.
> A possible short-term fix may be this: instead of picking any random
> path to stick into fdw_outerpath, we choose a path which covers the
> pathkeys of ForeignPath. If postgres_fdw was able to come up with a
> ForeignPath with those pathkeys, there is high chance that there
> exists a local path with those pathkeys. Since we are yet to call
> add_path() with the ForeignPath being built, that local path should
> exist in the path list. Stick that path in fdw_outerpath. If such a
> path doesn't exist, return NULL from GetExistingLocalJoinPath().
> postgres_fdw wouldn't push down the join in that case and we will
> loose some optimization, but that might be better than throwing an
> error.
Seems reasonable.
Best regards,
Etsuro Fujita
From | Date | Subject | |
---|---|---|---|
Next Message | Heikki Linnakangas | 2016-12-15 09:57:41 | Re: pg_authid.rolpassword format (was Re: Password identifiers, protocol aging and SCRAM protocol) |
Previous Message | Alexander Law | 2016-12-15 09:30:17 | Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default |