Re: Parallel INSERT (INTO ... SELECT ...)

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Greg Nancarrow <gregn4422(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2021-01-19 05:10:58
Message-ID: CAA4eK1KgmcXLp-iKOdJUr5wM_cCU7_fbE7CRr1GkrKOyb8piVQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 18, 2021 at 3:50 PM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
>
> On Mon, Jan 18, 2021 at 8:10 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > > 1)
> > >
> > > >Here, it seems we are accessing the relation descriptor without any
> > > >lock on the table which is dangerous considering the table can be
> > > >modified in a parallel session. Is there a reason why you think this
> > > >is safe? Did you check anywhere else such a coding pattern?
> > >
> > > Yes, there's a very good reason and I certainly have checked for the
> > > same coding pattern elsewhere, and not just randomly decided that
> > > locking can be ignored.
> > > The table has ALREADY been locked (by the caller) during the
> > > parse/analyze phase.
> > >
> >
> > Fair enough. I suggest adding a comment saying the same so that the
> > reader doesn't get confused about the same.
> >
>
> OK, I'll add a comment.
>
> > > (This is not the case for a partition, in which case the patch code
> > > uses AccessShareLock, as you will see).
> >
> > Okay, but I see you release the lock on partition rel immediately
> > after checking parallel-safety. What if a user added some
> > parallel-unsafe constraint (via Alter Table) after that check?
> >
> > >
>
> I'm not sure. But there would be a similar concern for current
> Parallel SELECT functionality, right?
>

I don't think so because, for Selects, we take locks on the required
partitions and don't release them immediately. We do parallel safety
checks after acquiring those locks. From the code perspective, we lock
individual partitions via
expand_inherited_rtentry->expand_partitioned_rtentry and then check
parallel-safety at a later point via
set_append_rel_size->set_rel_consider_parallel. Also, I am not sure if
there is anything we check for Selects at each partition relation
level that can be changed by a concurrent session. Do you have a
different understanding?

Similarly, we do retain locks on indexes, see get_relation_info, which
we are not doing in the patch.

> My recollection is that ALTER TABLE obtains an exclusive lock on the
> table which it retains until the end of the transaction, so that will
> result in blocking at certain points, during parallel-checks and
> execution, but there may still be a window.
>

Once the Select has acquired locks in the above code path, I don't
think Alter for a particular partition would be able to proceed unless
those locks are non-conflicting.

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2021-01-19 05:15:26 Re: Parallel INSERT (INTO ... SELECT ...)
Previous Message tsunakawa.takay@fujitsu.com 2021-01-19 05:06:34 RE: POC: postgres_fdw insert batching