Re: WIP: Upper planner pathification

From: Greg Stark <stark(at)mit(dot)edu>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: Upper planner pathification
Date: 2016-03-07 16:34:52
Message-ID: CAM-w4HP=2z1NwbOaaGKVEnDTph20OtkC6nJC9wrrm9XX7SO=3g@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 7, 2016 at 3:37 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> The currently-committed code generates paths where nested loops and
> hash joins get pushed beneath the Gather node, but does not generate
> paths where merge joins have been pushed beneath the Gather node. And
> the reason I didn't try to generate those paths is because I believe
> they will almost always suck. As of now, what we know how to do is
> build a partial path for a join by joining a partial path for the
> outer input rel against an ordinary path for the inner rel. That
> means that the work of generating the inner rel has to be redone in
> each worker. That's not a problem if we've got something like a
> nested loop with a parameterized inner index scan, because that sort
> of plan redoes all the work for every row anyway. It is a problem for
> a hash join, but it's not too hard for it to be worthwhile anyway if
> the build table is small. For a merge join, though, it seems rather
> unpromising. It's really doubtful that we want each worker to
> independently sort the inner rel and then have them join their own
> subset of the outer rel against their own copy of the sort. *Maybe*
> it could win if the inner path is an index scan, but I wasn't really
> sure that would come up and be a win often enough to be worth the cost
> of generating the path. We tend to only use merge joins when both of
> the relations involved are large, and index-scanning a large relation
> tends to lose to sorting it. So it just seemed like a dead end.

This is the first message on this subthread that actually made me feel
I understood the issue under discussion. It explains the distinction
between plans that are merely parallel-safe and plans that would
actually do something different under a parallel worker.
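
To make the quoted argument concrete, here is a toy model of it. This is
only a sketch: it is not PostgreSQL's cost model, the function names
(sort_cost, speedup) are hypothetical, and every number is made up for
illustration. The one thing it takes from the quoted text is that the
outer side is a partial path whose work is divided among the workers,
while the inner side is an ordinary path whose work is redone in full by
each worker.

```python
import math


def sort_cost(rows):
    # Assumed n*log2(n) comparison cost for sorting the inner rel.
    return rows * math.log2(rows)


def speedup(outer_cost, inner_cost, join_cost, workers):
    """Serial time divided by per-worker time in the toy model.

    The outer scan and the join work are split across workers (partial
    path); the inner work is repeated by every worker (ordinary path).
    """
    serial = outer_cost + inner_cost + join_cost
    per_worker = outer_cost / workers + inner_cost + join_cost / workers
    return serial / per_worker


WORKERS = 4
OUTER_ROWS = 1_000_000

# Hash join with a small build table: the repeated inner work is cheap.
hash_small = speedup(OUTER_ROWS, 1_000, OUTER_ROWS, WORKERS)

# Merge join where each worker sorts its own copy of a large inner rel:
# the repeated sort dominates the per-worker time.
merge_large = speedup(OUTER_ROWS, sort_cost(1_000_000), OUTER_ROWS, WORKERS)

print(f"hash join, small build table: {hash_small:.2f}x")
print(f"merge join, large inner sort: {merge_large:.2f}x")
```

With these invented numbers the hash join gets close to a 4x speedup
from four workers, while the merge join gets almost nothing, because
the part of the work that is repeated in every worker is also the
expensive part.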

--
greg
