Re: Parallel Queries and PostGIS

From: Paul Ramsey <pramsey(at)cleverelephant(dot)ca>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Queries and PostGIS
Date: 2016-03-31 19:19:32
Message-ID: CACowWR2xM-f3V1k-4pCB8GgqTykbZYMTL6YhEw4SNWPaoQzqSg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 29, 2016 at 12:51 PM, Paul Ramsey <pramsey(at)cleverelephant(dot)ca> wrote:
> On Tue, Mar 29, 2016 at 12:48 PM, Paul Ramsey <pramsey(at)cleverelephant(dot)ca> wrote:
>
>>> On the join case, I wonder if it's possible that _st_intersects is not
>>> marked parallel-safe? If that's not the problem, I don't have a
>>> second guess, but the thing to do would be to figure out whether
>>> consider_parallel is false for the RelOptInfo corresponding to either
>>> of pd and pts, or whether it's true for both but false for the
>>> joinrel's RelOptInfo, or whether it's true for all three of them but
>>> you don't get the desired path anyway.
>>
>> _st_intersects is definitely marked parallel safe, and in fact will
>> generate a parallel plan if used alone (without the operator though,
>> it's impossibly slow). It's the && operator that is the issue... and I
>> just noticed that the PROCEDURE bound to the && operator
>> (geometry_overlaps) is *not* marked parallel safe: could be the
>> problem?
>
> Asked and answered: marking the geometry_overlaps as parallel safe
> gets me a parallel plan! Now to play with costs and see how it behaves
> when force_parallel_mode is not set.

For the record I can get a non-forced parallel join plan, *only* if I
reduce the parallel_join_cost by a factor of 10, from 0.1 to 0.01.

http://blog.cleverelephant.ca/2016/03/parallel-postgis-joins.html

This seems non-optimal. No amount of cranking up the underlying
function COST seems to change this, perhaps because the join cost is
entirely based on the number of expected tuples in the join relation?

In general it seems like function COST values have been considered a
relatively unimportant input to planning in the past, but with
parallel processing it seems like they are now a lot more
determinative about what makes a good plan.

P.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2016-03-31 19:45:56 Re: [PATCH v9] GSSAPI encryption support
Previous Message Tom Lane 2016-03-31 19:18:23 Re: improving GROUP BY estimation