From: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
---|---|
To: | Zhihong Yu <zyu(at)yugabyte(dot)com> |
Cc: | Dilip Kumar <dilipbalaut(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Luc Vlaming <luc(at)swarm64(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Parallel Inserts in CREATE TABLE AS |
Date: | 2020-12-28 12:14:16 |
Message-ID: | CALj2ACX+i-vtmdrS3kzLVJx=3TQLXryoF3zWWgZ-8Qvo3RJyFw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Dec 28, 2020 at 1:16 AM Zhihong Yu <zyu(at)yugabyte(dot)com> wrote:
> For v16-0002-Tuple-Cost-Adjustment-for-Parallel-Inserts-in-CTAS.patch:
>
> + if (ignore &&
> + (root->parse->CTASParallelInsInfo &
> + CTAS_PARALLEL_INS_TUP_COST_CAN_IGN))
>
> I wonder why CTAS_PARALLEL_INS_TUP_COST_CAN_IGN is checked again in the above if since when ignore_parallel_tuple_cost returns true, CTAS_PARALLEL_INS_TUP_COST_CAN_IGN is set already.
Sometimes, we may set the flag CTAS_PARALLEL_INS_TUP_COST_CAN_IGN
before generate_useful_gather_paths, but the
generate_useful_gather_paths can return without reaching cost_gather
where we reset. The generate_useful_gather_paths can return without
reaching cost_gather, in following case
if (rel->partial_pathlist == NIL)
return;
So, for such cases, I'm resetting it here.
> + * In this function we only care Append and Gather nodes.
>
> 'care' -> 'care about'
Done.
> + for (int i = 0; i < aps->as_nplans; i++)
> + {
> + parallel |= PushDownCTASParallelInsertState(dest,
> + aps->appendplans[i],
> + gather_exists);
>
> It seems the loop termination condition can include parallel since we can come out of the loop once parallel is true.
No, we can not come out of the for loop if parallel is true, because
our intention there is to look for all the child/sub plans under
Append, and push the inserts to the Gather nodes wherever possible.
> + if (!allow && tuple_cost_flags && gather_exists)
>
> As the above code shows, gather_exists is only checked when allow is false.
Yes, if at least one gather node exists under the Append for which the
planner would have ignored the tuple cost, and now if we don't allow
parallel inserts, we should assert that the parallelism is not picked
because of wrong parallel tuple cost enforcement.
> + * We set the flag for two cases when there is no parent path will
> + * be created(such as : limit,sort,distinct...):
>
> Please correct the grammar : there are two verbs following 'when'
Done.
> For set_append_rel_size:
>
> + {
> + root->parse->CTASParallelInsInfo |=
> + CTAS_PARALLEL_INS_IGN_TUP_COST_APPEND;
> + }
> + }
> +
> + if (root->parse->CTASParallelInsInfo &
> + CTAS_PARALLEL_INS_IGN_TUP_COST_APPEND)
> + {
> + root->parse->CTASParallelInsInfo &=
> + ~CTAS_PARALLEL_INS_IGN_TUP_COST_APPEND;
>
> In the if block for childrel->rtekind == RTE_SUBQUERY, CTAS_PARALLEL_INS_IGN_TUP_COST_APPEND maybe set. Why is it cleared immediately after ?
Thanks for pointing that out. It's a miss, intention is to reset it
after set_rel_size(). Corrected in the v17 patch.
> + /* Set to this in case tuple cost needs to be ignored for Append cases. */
> + CTAS_PARALLEL_INS_IGN_TUP_COST_APPEND = 1 << 3
>
> Since each CTAS_PARALLEL_INS_ flag is a bit, maybe it's better to use 'turn on' or similar term in the comment. Because 'set to' normally means assignment.
Done.
All the above comments are addressed in the v17 patch set posted
upthread. Please have a look.
With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Bharath Rupireddy | 2020-12-28 12:15:20 | Re: Parallel Inserts in CREATE TABLE AS |
Previous Message | Bharath Rupireddy | 2020-12-28 12:12:25 | Re: Parallel Inserts in CREATE TABLE AS |