Re: fixing consider_parallel for upper planner rels

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Noah Misch <noah(at)leadboat(dot)com>, Amit Kapila <amit(dot)kapila(at)enterprisedb(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fixing consider_parallel for upper planner rels
Date: 2016-06-27 22:04:23
Message-ID: 8562.1467065063@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Mon, Jun 27, 2016 at 5:28 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Seems to me that it should generally be the case that consider_parallel
>> would already be clear on the parent rel if the tlist isn't parallel safe,
>> and if it isn't we probably have a bug elsewhere. If it makes you feel
>> better, maybe you could add Assert(!has_parallel_hazard(...)) here?

> I don't see that this is true. If someone does SELECT
> pg_backend_pid() FROM pgbench_accounts, there's only one RelOptInfo
> and nothing to clear consider_parallel for it anywhere else.

Huh? The final tlist would go with the final_rel, ISTM, not the scan
relation. Maybe we have some rejiggering to do to make that true, though.

> I've really been wondering whether there ought to be a separate
> UPPERREL_FOO whose only purpose is to handle applying scanjoin_target
> to the topmost scan/join rel.

That's another way of thinking about it, I guess.

> Effectively, the current code is trying
> to do that transformation in place. We start with a scan/join rel
> that emits whatever it naturally emits and then frob all of the paths
> and partial paths to emit the scanjoin_target. But this seems quite
> cumbersome.

It'd be no less cumbersome, because you'd end up with all those same paths
surviving into the FOO upperrel. But it might be logically cleaner.

(Thinks for a bit...) Actually, scratch that. It was very intentional on
my part that different Paths for the same relation might produce different
tlists. Without that, there's no way that we're going to get a solution
that allows extracting index expression values from index-only scans in
nontrivial plans. Otherwise I wouldn't have introduced PathTarget to
begin with, because we already had a perfectly good way of representing a
one-size-fits-all result tlist for each RelOptInfo.

So it seems like we should not introduce a dependency here that assumes
that all Paths for a given rel have equivalent parallel_safe settings.
Maybe it'd be okay for the special case of index expressions, because
they are almost certainly going to be parallel safe, but in general it's
a restriction we don't want.

You could still save something by writing code along the line of
if (path->parallel_safe &&
has_parallel_hazard(...))
path->parallel_safe = false;
so as not to run has_parallel_hazard in the case where we already know
we lost.

Doing more than this would probably involve caching parallel-safety
status in PathTarget itself, which is certainly doable but we should
measure first to see if it's worth the trouble.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2016-06-27 22:05:41 Re: Documentation fixes for pg_visibility
Previous Message Robert Haas 2016-06-27 21:57:29 Re: Documentation fixes for pg_visibility