From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Greg Nancarrow <gregn4422(at)gmail(dot)com> |
Cc: | Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Parallel INSERT (INTO ... SELECT ...) |
Date: | 2020-10-09 12:56:27 |
Message-ID: | CAA4eK1LSu3_dA777+90CQ8bYKSzxjHgj3yiKb-DR_2GE8psNfA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Oct 9, 2020 at 3:51 PM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
>
> On Fri, Oct 9, 2020 at 8:09 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > 0001-InsertParallelSelect
> > 1.
> > ParallelContext *pcxt;
> >
> > + /*
> > + * We need to avoid an attempt on INSERT to assign a
> > + * FullTransactionId whilst in parallel mode (which is in
> > + * effect due to the underlying parallel query) - so the
> > + * FullTransactionId is assigned here. Parallel mode must
> > + * be temporarily escaped in order for this to be possible.
> > + * The FullTransactionId will be included in the transaction
> > + * state that is serialized in the parallel DSM.
> > + */
> > + if (estate->es_plannedstmt->commandType == CMD_INSERT)
> > + {
> > + Assert(IsInParallelMode());
> > + ExitParallelMode();
> > + GetCurrentFullTransactionId();
> > + EnterParallelMode();
> > + }
> > +
> >
> > This looks like a hack to me. I think you are doing this to avoid the
> > parallel mode checks in GetNewTransactionId(), right?
>
> Yes, agreed, is a hack to avoid that (mind you, it's not exactly great
> that ExecutePlan() sets parallel-mode for the entire plan execution).
> Also, did not expect that to necessarily remain in a final patch.
>
> >If so, I have
> > already mentioned above [1] that we can change it so that we disallow
> > assigning xids for parallel workers only. The same is true for the
> > check in ExecGatherMerge. Do you see any problem with that suggestion?
> >
>
> No, should be OK I guess, but will update and test to be sure.
>
> > 2.
> > @@ -337,7 +337,7 @@ standard_planner(Query *parse, const char
> > *query_string, int cursorOptions,
> > */
> > if ((cursorOptions & CURSOR_OPT_PARALLEL_OK) != 0 &&
> > IsUnderPostmaster &&
> > - parse->commandType == CMD_SELECT &&
> > + (parse->commandType == CMD_SELECT || parse->commandType == CMD_INSERT) &&
> > !parse->hasModifyingCTE &&
> > max_parallel_workers_per_gather > 0 &&
> > !IsParallelWorker())
> >
> > I think the comments above this need to be updated especially the part
> > where we says:"Note that we do allow CREATE TABLE AS, SELECT INTO, and
> > CREATE MATERIALIZED VIEW to use parallel plans, but as of now, only
> > the leader backend writes into a completely new table.". Don't we need
> > to include Insert also?
>
> Yes, Insert needs to be mentioned somewhere there.
>
> >
> > 3.
> > @@ -371,6 +371,7 @@ standard_planner(Query *parse, const char
> > *query_string, int cursorOptions,
> > * parallel-unsafe, or else the query planner itself has a bug.
> > */
> > glob->parallelModeNeeded = glob->parallelModeOK &&
> > + (parse->commandType == CMD_SELECT) &&
> > (force_parallel_mode != FORCE_PARALLEL_OFF);
> >
> > Why do you need this change? The comments above this code should be
> > updated to reflect this change. I think for the same reason the below
> > code seems to be modified but I don't understand the reason for the
> > below change as well, also it is better to update the comments for
> > this as well.
> >
>
> OK, I will update the comments for this.
> Basically, up to now, the "force_parallel_mode" has only ever operated
> on a SELECT.
> But since we are now allowing CMD_INSERT to be assessed for parallel
> mode too, we need to prevent the force_parallel_mode logic from
> sticking a Gather node over the top of arbitrary INSERTs and causing
> them to be run in parallel. Not all INSERTs are suitable for parallel
> operation, and also there are further considerations for
> parallel-safety for INSERTs compared to SELECT. INSERTs can also
> trigger UPDATEs.
>
Sure but in that case 'top_plan->parallel_safe' should be false and it
should stick Gather node atop Insert node. For the purpose of this
patch, the scan beneath Insert should be considered as parallel_safe.
--
With Regards,
Amit Kapila.
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2020-10-09 12:58:08 | Re: Parallel INSERT (INTO ... SELECT ...) |
Previous Message | Amit Kapila | 2020-10-09 12:51:22 | Re: Parallel INSERT (INTO ... SELECT ...) |