Re: why not parallel seq scan for slow functions

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: why not parallel seq scan for slow functions
Date: 2017-09-06 19:32:29
Message-ID: CA+TgmoYxwHY3mU0KC+YrEC6d63Kt+GTBg=Q0qE-MMsrin7R-Rg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 6, 2017 at 3:18 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Wed, Sep 6, 2017 at 1:47 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> If somebody's applying apply_projection_to_path to a path that's already
>>> been add_path'd, that's a violation of the documented restriction.
>
>> /me is confused. Isn't that exactly what grouping_planner() is doing,
>> and has done ever since your original pathification commit
>> (3fc6e2d7f5b652b417fa6937c34de2438d60fa9f)? It's iterating over
>> current_rel->pathlist, so surely everything in there has been
>> add_path()'d.
>
> I think the assumption there is that we no longer care about validity of
> the input Relation, since we won't be looking at it any more (and
> certainly not adding more paths to it). If there's some reason why
> that's not true, then maybe grouping_planner has a bug there.

Right, that's sorta what I assumed. But I think that reasoning is
flawed in the face of parallel query, because
apply_projection_to_path() pushes target list projection down below
Gather when possible. In particular, as Jeff and Amit point out, it
may well be that (a) before apply_projection_to_path(), the cheapest
plan is non-parallel and (b) after apply_projection_to_path(), the
cheapest plan would be a Gather plan, except that it's too late
because we've already thrown that path out.

What we ought to do, I think, is avoid generating gather paths until
after we've applied the target list (and the associated costing
changes) to both the regular path list and the partial path list.
Then the cost comparison is apples-to-apples. The use of
apply_projection_to_path() on every path in the pathlist would be fine
if it were adjusting all the costs by a uniform amount, but it isn't.
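
To make the ordering problem concrete, here is a minimal sketch of a toy cost model. It is not PostgreSQL code: the Path class, the helper functions, and every number in it (row counts, costs, worker count, Gather overhead) are hypothetical and chosen only to illustrate why the target list cost is not a uniform adjustment once projection can run below Gather.

```python
from dataclasses import dataclass, replace

# Toy model only: names and numbers are assumptions, not planner internals.
@dataclass(frozen=True)
class Path:
    name: str
    rows: float   # rows this node emits (per worker for a partial path)
    cost: float   # total cost accumulated so far

def apply_target(path, per_row_cost):
    """Charge the target list evaluation once per row this path emits."""
    return replace(path, cost=path.cost + path.rows * per_row_cost)

def gather(partial, workers, overhead):
    """Wrap a partial path in a Gather node with a flat overhead."""
    return Path(f"Gather({partial.name})", partial.rows * workers,
                partial.cost + overhead)

def cheapest(paths):
    return min(paths, key=lambda p: p.cost)

SLOW_FN_COST = 1.0        # assumed per-row cost of the slow function
WORKERS = 4               # assumed number of workers
GATHER_OVERHEAD = 1000.0  # assumed setup plus tuple transfer cost

serial = Path("SeqScan", rows=10000, cost=1000.0)
partial = Path("ParallelSeqScan", rows=2500, cost=250.0)

# Order A: build Gather and pick a winner first, then apply the target
# list to whatever survived. The Gather path has already been discarded.
survivor = cheapest([serial, gather(partial, WORKERS, GATHER_OVERHEAD)])
early_choice = apply_target(survivor, SLOW_FN_COST)

# Order B: apply the target list to both the regular path and the
# partial path, and only then build Gather and compare.
late_choice = cheapest([
    apply_target(serial, SLOW_FN_COST),
    gather(apply_target(partial, SLOW_FN_COST), WORKERS, GATHER_OVERHEAD),
])

print(early_choice)
print(late_choice)
```

In this toy model, Order A ends up with the serial scan at a total cost of 11000, while Order B selects the Gather path at 3750, because each worker pays the per-row function cost for only its share of the rows.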

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
