From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Noah Misch <noah(at)leadboat(dot)com>, Amit Kapila <amit(dot)kapila(at)enterprisedb(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: fixing consider_parallel for upper planner rels |
Date: | 2016-06-30 22:33:30 |
Message-ID: | CA+Tgmoavc_WGZR3srg8STAk2oNGP4EGmC9503a0PX7_AK9+cAw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Jun 30, 2016 at 5:54 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> That's pretty much right, but there are two conceptually separate
>> things. The first is whether or not we actually attempt to spawn
>> workers, and the second is whether or not we enter parallel mode.
>> Entering parallel mode enables various restrictions that make spawning
>> workers safe, so we cannot spawn workers unless we enter parallel
>> mode. We can, however, enter parallel mode without spawning any
>> workers, and force_parallel_mode=on does so. The point of that is
>> that even if the plan doesn't end up being run inside of a worker, it
>> will be run with most of the same restrictions on what it can do that
>> would be in place if a truly parallel plan were chosen.
>
> And the point of that is what, exactly? If the only change is that
> "some restrictions get enforced", I am not clear on why we need such
> a test mode in cases where the planner is afraid to put a top Gather on
> the plan. In particular, given the coding as you now have it, it seems
> like the only case where there's any difference is where we set
> glob->parallelModeOK but nonetheless end up with a not-parallel-safe
> topmost path (that doesn't have a Gather within it). It's not clear
> to me why having the executor switch into parallel mode makes sense at
> all with such a plan.
Suppose you create a PL/pgsql function that does an UPDATE and mark it
PARALLEL RESTRICTED. You wonder whether you've marked it correctly.
You can set force_parallel_mode=on and SELECT myfunc(). The
subsequent ERROR tells you that you've mismarked it.
>>> * I'm still not real happy with the has_parallel_hazard test added to
>>> apply_projection_to_path, and here is why not: if the target path is
>>> not projection-capable, we'll never get to that test. We just pass
>>> the problem off to create_projection_path, which looks no further
>>> than rel->consider_parallel. It seems like we have to add a
>>> has_parallel_hazard test there as well, which is a tad annoying,
>>> not least because all the direct callers of create_projection_path
>>> seem to have made the test already.
>
>> Thanks for noticing that problem; I've fixed it by inserting a test
>> into create_projection_path. As far as the annoyance factor is
>> concerned, you obviously know that there was a flag there to reduce
>> the cost of that, but you insisted on removing it. Maybe you should
>> consider putting it back.
>
> No, that flag was on apply_projection_to_path, and I didn't care for it
> because most of the callers didn't appear to want to go to the trouble of
> setting it correctly. Adding such an argument to create_projection_path
> as well doesn't seem to make that better.
I'm open to suggestions. As I've noted a few times already, though
maybe less clearly than I should have done, I think it's quite likely
to be a good idea to try to avoid the overhead of running
has_parallel_hazard repeatedly on the same tlists, or for that matter,
running it on tlists at all. I don't have any evidence that's
expensive enough to cost, just a hunch. Exactly how to do that best,
I'm not sure.
>> Well, the point of consider_parallel is that there are some things
>> that are specific to the individual path, and there are some that are
>> properties of the RelOptInfo. It seems highly desirable to check
>> things that fall into the latter category exactly once, rather than
>> once per path. You seem to be fighting hard against that idea, and
>> I'm pretty baffled as to why.
>
> What I'm not happy about is that as you've got things constituted,
> the GetForeignUpperPaths hook is broken so far as injecting parallel paths
> is concerned, because the consider_parallel flags for the upperrels
> aren't set yet when it gets called. If we keep consider_parallel with
> its present usage, we're going to have to refactor things to fix that.
I see. It seems to me, and I may be failing to understand something,
that the placement of create_upper_paths_hook is substantially better
than the placement of GetForeignUpperPaths. If the latter were moved
to where the former now is, then consider_parallel would be set
sufficiently early and everything would be fine. We could
alternatively fish out the code to set consider_parallel for the upper
rels and do all of that work before calling that hook. That's a bit
hairy, because we'd basically go set all of the consider_parallel
flags, then call that hook, then circle back around for the core path
generation, but I don't see any intrinsic obstacle to that line of
attack. I'm not very sure that one call for all upper rels is going
to be convenient, though.
>> Updated patch attached. (Official status update: I'll commit this,
>> possibly over the long weekend or in any event no later than Tuesday,
>> unless there are further review comments before then, in which case I
>> will post a further official status update no later than Tuesday.)
>
> Don't have time to re-read this right now, but maybe tomorrow or
> Saturday.
OK, thanks.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2016-06-30 22:46:45 | Re: parallel workers and client encoding |
Previous Message | Robert Haas | 2016-06-30 22:17:18 | Re: fixing consider_parallel for upper planner rels |