Re: ExecGather() + nworkers

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ExecGather() + nworkers
Date: 2016-01-11 06:01:58
Message-ID: CAFj8pRBWFRwHi=syrqBqSb0xOhhd+Tm5q9ge2Ebk++C8P=ygSg@mail.gmail.com
Lists: pgsql-hackers

>
> > More importantly, I have other, entirely general concerns. Other major
> > RDBMSs have settings that are very similar to max_parallel_degree,
> > with a setting of 1 effectively disabling all parallelism. Both Oracle
> > and SQL Server have settings that they both call the "maximum degree
> > of parallelism". I think it's a bit odd that with Postgres,
> > max_parallel_degree = 1 can still use parallelism at all. I have to
> > wonder: are we conflating controlling the resources used by parallel
> > operations with how shared memory is doled out?
>
> We could redefine things so that max_parallel_degree = N means use N
> - 1 workers, with a minimum value of 1 rather than 0, if there's a
> consensus that that's better. Personally, I prefer it the way we've
> got it: it's real darned clear in my mind that max_parallel_degree=0
> means "not parallel". But I won't cry into my beer if a consensus
> emerges that the other way would be better.
>
>
If max_parallel_degree is renamed to max_query_workers or something
similar, then the new metric makes sense, and it can be more intuitive.
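
To make the two conventions concrete, here is a rough sketch. The function
names are invented for illustration only and are not anything in the
PostgreSQL source; the numbers are upper bounds, since fewer workers may
actually be launched.

```python
def workers_current(max_parallel_degree: int) -> int:
    """Current convention: the setting counts workers, 0 means not parallel."""
    if max_parallel_degree < 0:
        raise ValueError("max_parallel_degree must be >= 0")
    return max_parallel_degree


def workers_alternative(max_parallel_degree: int) -> int:
    """Alternative convention: N means N - 1 workers, minimum value 1."""
    if max_parallel_degree < 1:
        raise ValueError("max_parallel_degree must be >= 1")
    return max_parallel_degree - 1


def total_processes(workers: int) -> int:
    """The leader always participates, so add one to the worker count."""
    return workers + 1


if __name__ == "__main__":
    for setting in (1, 2, 8):
        print(
            setting,
            total_processes(workers_current(setting)),
            total_processes(workers_alternative(setting)),
        )
```

With the current convention a setting of 8 gives at most a leader plus 8
workers; with the alternative it gives at most 8 processes in total. A name
like max_query_workers only describes the first of these.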

Regards

Pavel

> > I could actually "give back" my parallel worker slots early if I
> > really wanted to (that would be messy, but the then-acquiesced workers
> > do nothing for the duration of the merge beyond conceptually owning
> > the shared tape temp files). I don't think releasing the slots early
> > makes sense, because I tend to think that hanging on to the workers
> > helps the DBA in managing the server's resources. The still-serial
> > merge phase is likely to become a big bottleneck with parallel sort.
>
> Like I say, the sort code better not know anything about this
> directly, or it's going to break when embedded in a query.
>
> > With parallel sequential scan, a max_parallel_degree of 8 could result
> > in 16 processes scanning in parallel. That's a concern, and not least
> > because it happens only sometimes, when things are timed just right.
> > The fact that only half of those processes are "genuine" workers seems
> > to me like a distinction without a difference.
>
> This seems dead wrong. A max_parallel_degree of 8 means you have a
> leader and 8 workers. Where are the other 7 processes coming from?
> What you should have is 8 processes each of which is participating in
> both the parallel seq scan and the parallel sort, not 8 processes
> scanning and 8 separate processes sorting.
>
> >> I think that's probably over-engineered. I mean, it wouldn't be that
> >> hard to have the workers just exit if you decide you don't want them,
> >> and I don't really want to make the signaling here more complicated
> >> than it really needs to be.
> >
> > I worry about the additional overhead of constantly starting and
> > stopping a single worker in some cases (not so much with parallel
> > index build, but other uses of parallel sort beyond 9.6). Furthermore,
> > the coordination between worker and leader processes to make this
> > happen seems messy -- you actually have the postmaster launch
> > processes, but they must immediately get permission to do anything.
> >
> > It wouldn't be that hard to offer a general way of doing this, so why
> > not?
>
> Well, if these things become actual problems, fine, we can fix them.
> But let's not decide to add the API before we're agreed that we need
> it to solve an actual problem that we both agree we have. We are not
> there yet.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
