Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Amit Kapila <akapila(at)postgresql(dot)org>, pgsql-committers <pgsql-committers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode
Date: 2021-03-24 12:48:02
Message-ID: CA+TgmobaQFUBkVOsmdOt9ag=3g0WjKfefP8ys1TV5GhHELfLOA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers

On Wed, Mar 24, 2021 at 12:48 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> That seems like an argument for a pg_class attribute about parallel
> safety of the current relation *and* its children. It'd definitely mean
> recursing higher in the partition tree during DDL if the action on a
> child partition causes the safety to flip.

That could be an issue, both because locking parents is not very nice
from a concurrency standpoint, and also because it introduces deadlock
hazards.

> Although this specific hack doesn't seem too terrible to me. If you
> execute a parallel insert the likelihood to end up not needing an xid is
> pretty low. Implementing it concurrently does seem like it'd end up
> needing another lwlock nested around xid assignment, or some more
> complicated scheme with already holding XidGenLock or retries. But maybe
> I'm missing an easy solution here.

I don't think you need to do anything that is known outside the group
of processes involved in the parallel query. I think you just need to
make sure that only one of them is trying to acquire an XID at a time,
and that all the others find out about it. I haven't thought too hard
about the timing: if one process acquires an XID for the transaction,
is it OK if the others do an arbitrary amount of work before they
realize that this has happened? Also, there's the problem that the
leader has the whole transaction stack and the workers don't, so the
recursive nature of XID acquisition is a problem. I suspect these are
all pretty solvable problems; I just haven't put in the energy. But,
it could also be that I'm missing something.

I think we should be trying, though. Some of the limitations of
parallel query are unavoidable: there's always going to be a certain
amount of parallel-unsafe stuff out there, and we just have to not use
parallelism in those cases. But, some things - and I think this is
probably one of them - are just limitations of the current
implementation, and we should be looking to fix those. If we just
accept that the infrastructure limitations are what they are and skate
around them to make a few more things work, we're going to get less
and less real improvement from every new project, and any future
infrastructure improvements that somebody does want to make are going
to have to deal with all the odd special cases we've introduced to get
those improvements.

Now, just to be clear, I'm completely in favor of incremental
improvements in cases where that can be done without too much
ugliness. There have been some great examples of that with parallel
query already, like the work David Rowley did to allow parallel
sequential scans to allocate multiple blocks at a time, or, looking
back further, his work on parallel aggregate. I'm not saying those
projects were easy; I know they were hard. But, they could basically
use the infrastructure that had been created by previous commits to
new stuff in a way that was reasonably localized. But, when Thomas
Munro worked on parallel hash join, he had to first invent a whole new
way of allocating memory. Trying to do that project without inventing
a new way to allocate memory would have been a really bad decision. It
would have boxed us into all sorts of unpleasant corners where a lot
of stuff didn't really work and finding any further improvements was
hard. But, because he did invent that new way, it can now be used for
lots of other things, and already has been. That infrastructure work
made future projects *easier*. And one of the big problems that I have
with this feature is that, as it seems to me right now, there wasn't
any real new infrastructure built, even though doing this right really
requires it.

I think there is room for debate about whether the particular thing
that I'm complaining about here is mandatory for this feature,
although I am strongly of the position that should be tried. We might
consider forcing an XID assignment to be acceptable for a bulk-insert
case, but what happens when somebody wants to extend this to updates
and deletes? There's going to be a strong temptation to just double
down on the same design, and that's a very dangerous direction in my
opinion. But, even if you don't think that this *particular*
infrastructure improvement ought to be the job of this patch, I think
this patch clearly needed to do more infrastructure improvement than
it did. Your comments about the IndexOptInfo stuff are another
example, and the rewriter bug is a third.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Robert Haas 2021-03-24 14:50:50 Re: pgsql: Move tablespace path re-creation from the makefiles to pg_regres
Previous Message Robert Haas 2021-03-24 12:14:53 Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode