From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: parallelize queries containing subplans |
Date: | 2017-02-14 09:24:46 |
Message-ID: | CAA4eK1LQA-avGD=H3nNfVC8nMahozgRzEHb3KPpeppCTQ-gayg@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Jan 3, 2017 at 4:19 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Wed, Dec 28, 2016 at 11:47 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>
>> Now, we can further extend this to parallelize queries containing
>> correlated subplans like below:
>>
>> explain select * from t1 where t1.i in (select t2.i from t2 where t2.i=t1.i);
>>                          QUERY PLAN
>> -------------------------------------------------------------
>>  Seq Scan on t1  (cost=0.00..831049.09 rows=8395 width=12)
>>    Filter: (SubPlan 1)
>>    SubPlan 1
>>      ->  Seq Scan on t2  (cost=0.00..97.73 rows=493 width=4)
>>            Filter: (i = t1.i)
>> (5 rows)
>>
>> In the above query, the Filter on t2 (i = t1.i) generates a Param node,
>> which is parallel-restricted, so such queries won't be able to use
>> parallelism even with the patch. I think we can mark Params that
>> refer to the same level as parallel-safe, and I think we have this
>> information (node->varlevelsup/phlevelsup/agglevelsup) available
>> when we replace correlation vars (SS_replace_correlation_vars).
>>
>
> I have implemented the above idea, which allows PARAMs from the same
> or the immediate outer level to be treated as parallel_safe. The
> results of the above query after the patch:
>
> postgres=# explain select * from t1 where t1.i in (select t2.i from t2
> where t2.i=t1.i);
>                                 QUERY PLAN
> --------------------------------------------------------------------------
>  Gather  (cost=0.00..488889.88 rows=8395 width=12)
>    Workers Planned: 1
>    ->  Parallel Seq Scan on t1  (cost=0.00..488889.88 rows=4938 width=12)
>          Filter: (SubPlan 1)
>          SubPlan 1
>            ->  Seq Scan on t2  (cost=0.00..97.73 rows=493 width=4)
>                  Filter: (i = t1.i)
> (7 rows)
>
On further evaluation, this patch has one big problem: it allows the
planner to form parallel plans that can't be supported with the current
infrastructure. For example, marking immediate-outer-level params as
parallel-safe can generate a plan like the one below:

 Seq Scan on t1
   Filter: (SubPlan 1)
   SubPlan 1
     ->  Gather
           Workers Planned: 1
           ->  Result
                 One-Time Filter: (t1.k = 0)
                 ->  Parallel Seq Scan on t2

In this plan, we can't evaluate the one-time filter (which contains a
correlated param) unless we have the capability to pass all kinds of
PARAM_EXEC params to workers. I don't want to invest too much time in
this patch unless somebody sees a way to implement correlated subplans
using the current parallel infrastructure.
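
To make the problem concrete, here is a toy Python model. It is not
PostgreSQL code: the Node class, its uses/provides fields, and both
helper functions are made up for illustration. It treats each plan node
as reading some params and setting others for its descendants, and
reports any Gather whose subtree reads a param that is set above the
Gather, which is the case workers can't handle today.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical stand-in for a plan node; not a PostgreSQL structure.
    label: str
    uses: frozenset = frozenset()      # params read by this node's expressions
    provides: frozenset = frozenset()  # params this node sets for its descendants
    children: list = field(default_factory=list)


def unmet_params(node):
    """Params read in this subtree that no node inside the subtree sets."""
    from_children = set()
    for child in node.children:
        from_children |= unmet_params(child)
    return set(node.uses) | (from_children - node.provides)


def gathers_needing_outer_params(node):
    """List (label, params) for each Gather whose workers need outer params."""
    found = []
    if node.label == "Gather":
        crossing = set()
        for child in node.children:
            crossing |= unmet_params(child)
        if crossing:
            found.append((node.label, crossing))
    for child in node.children:
        found.extend(gathers_needing_outer_params(child))
    return found


# Gather inside the subplan: t1.k is set by the outer scan, above the Gather.
bad_plan = Node("Seq Scan on t1", provides=frozenset({"t1.k"}), children=[
    Node("Gather", children=[
        Node("Result", uses=frozenset({"t1.k"}), children=[
            Node("Parallel Seq Scan on t2"),
        ]),
    ]),
])

# Gather on top: t1.i is both set and read below the Gather.
good_plan = Node("Gather", children=[
    Node("Parallel Seq Scan on t1", provides=frozenset({"t1.i"}), children=[
        Node("Seq Scan on t2", uses=frozenset({"t1.i"})),
    ]),
])

print(gathers_needing_outer_params(bad_plan))
print(gathers_needing_outer_params(good_plan))
```

For the plan with the Gather inside the subplan, the model reports the
Gather together with t1.k. For the earlier plan with the Gather on top,
it reports nothing, since t1.i is set and read inside the same worker.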

Note that the other patch [1] in this thread, which implements
parallelism for uncorrelated subplans, still holds good.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com