Re: Parallel INSERT (INTO ... SELECT ...)

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2021-03-13 03:00:26
Message-ID: CAA4eK1JtFzOnQWTCSEn2KGR3Ps1=UvyJTx603S0G6aJD6JdWnQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Mar 12, 2021 at 9:33 AM houzj(dot)fnst(at)fujitsu(dot)com
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> > On Thu, Mar 11, 2021 at 01:01:42PM +0000, houzj(dot)fnst(at)fujitsu(dot)com wrote:
> > > > I guess to have the finer granularity we'd have to go with
> > > > enable_parallel_insert, which then would mean possibly having to
> > > > later add enable_parallel_update, should parallel update have
> > > > similar potential overhead in the parallel-safety checks (which to
> > > > me, looks like it could, and parallel delete may not ...)
> > > >
> > > > It's a shame there is no "set" type for GUC options.
> > > > e.g.
> > > > enable_parallel_dml='insert,update'
> > > > Maybe that's going too far.
> >
> > Isn't that just GUC_LIST_INPUT ?
> > I'm not sure why it'd be going too far?
> >
> > The GUC-setting assign hook can parse the enable_parallel_dml_list value set
> > by the user, and set an internal int/bits enable_parallel_dml variable with some
> > define/enum values like:
> >
> > GUC_PARALLEL_DML_INSERT 0x01
> > GUC_PARALLEL_DML_DELETE 0x02
> > GUC_PARALLEL_DML_UPDATE 0x04
> >
>
> I think this idea works, but we still need to consider the reloption.
> After looking into the reloption code, I think postgres does not have a list-like type for reloptions.
> And I think it's better that the guc and the reloption are consistent.
>

I also think it is better to be consistent here.

> Besides, a list-type guc option that only supports one valid value, 'insert', seems a little weird to me (we only support parallel insert for now).
>
> So, I tend to keep the current style of guc option.
>

+1. I feel at this stage it might not be prudent to predict the
overhead for parallel updates or deletes, especially when there doesn't
appear to be an easy way to provide a future-proof guc/reloption and we
don't have any patch on the table that can prove or disprove that
theory. The only thing we can say is that even if parallel
updates/deletes have overhead, it might not be due to similar reasons.
Also, I guess parallel copy might need somewhat similar
parallel-safety checking w.r.t. partitioned tables, and I feel the
currently proposed guc/reloption can be used for that as well, as it is
quite a similar operation.
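
For reference, the list-style alternative quoted above (an assign hook that
turns a value such as 'insert,update' into a bitmask) could look roughly like
the standalone sketch below. This is only an illustration under stated
assumptions: the three flag values are the ones from the quoted message, while
parse_parallel_dml_list() is a hypothetical helper name, and the code
deliberately avoids the real GUC machinery so that it compiles on its own.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Flag values as proposed in the quoted message. */
#define GUC_PARALLEL_DML_INSERT 0x01
#define GUC_PARALLEL_DML_DELETE 0x02
#define GUC_PARALLEL_DML_UPDATE 0x04

/*
 * Hypothetical helper (not a PostgreSQL function): parse a comma-separated,
 * case-insensitive list such as "insert,update" into a bitmask.
 *
 * Returns true and stores the mask in *flags on success.  Returns false and
 * leaves *flags untouched if an item is unknown, empty, or too long.
 * An empty string yields a mask of 0 (nothing enabled).
 */
static bool
parse_parallel_dml_list(const char *value, int *flags)
{
    int         result = 0;
    const char *p = value;

    while (*p != '\0')
    {
        char    token[16];
        size_t  len = 0;

        /* Skip whitespace before the item; stop if only whitespace is left. */
        while (isspace((unsigned char) *p))
            p++;
        if (*p == '\0')
            break;

        /* Copy one item, lower-cased, up to a comma or whitespace. */
        while (*p != '\0' && *p != ',' && !isspace((unsigned char) *p))
        {
            if (len + 1 >= sizeof(token))
                return false;
            token[len++] = (char) tolower((unsigned char) *p);
            p++;
        }
        token[len] = '\0';

        /* Skip whitespace after the item. */
        while (isspace((unsigned char) *p))
            p++;

        if (strcmp(token, "insert") == 0)
            result |= GUC_PARALLEL_DML_INSERT;
        else if (strcmp(token, "delete") == 0)
            result |= GUC_PARALLEL_DML_DELETE;
        else if (strcmp(token, "update") == 0)
            result |= GUC_PARALLEL_DML_UPDATE;
        else
            return false;       /* unknown or empty item */

        if (*p == ',')
            p++;                /* move on to the next item */
        else if (*p != '\0')
            return false;       /* two items without a comma between them */
    }

    *flags = result;
    return true;
}
```

With only parallel insert supported today, the only useful inputs would be
'insert' and the empty list, which is the oddity houzj points out above, and a
reloption would still need an equivalent list-like type to stay consistent.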

--
With Regards,
Amit Kapila.
