From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Append implementation
Date: 2017-02-15 13:03:25
Message-ID: CA+TgmoZO40rg-nuQV6XRjD6PdWVc_+zkvmMYw4+gC=tsSA3+jg@mail.gmail.com
Lists: pgsql-hackers
On Wed, Feb 15, 2017 at 2:33 AM, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
> The only reason I combined the soft limit and the hard limit is that
> it did not seem necessary to have two different fields. But of course
> this is again under the assumption that allocating more than
> parallel_workers workers would never improve the speed; in fact, it
> could even slow things down.
That could be true in extreme cases, but in general I think it's probably false.
> Do we have such a case currently where the actual number of workers
> launched turns out to be *more* than Path->parallel_workers ?
No.
>> For example, suppose that I have a scan of two children, one
>> of which has parallel_workers of 4, and the other of which has
>> parallel_workers of 3. If I pick parallel_workers of 7 for the
>> Parallel Append, that's probably too high. Had those two tables been
>> a single unpartitioned table, I would have picked 4 or 5 workers, not
>> 7. On the other hand, if I pick parallel_workers of 4 or 5 for the
>> Parallel Append, and I finish with the larger table first, I think I
>> might as well throw all 4 of those workers at the smaller table even
>> though it would normally have only used 3 workers.
>
>> Having the extra 1-2 workers exit does not seem better.
>
> This is where I didn't understand exactly why we would want to
> assign these extra workers to a subplan that tells us it is
> already being run by 'parallel_workers' number of workers.
The decision to use fewer workers for a smaller scan isn't really
because we think that using more workers will cause a regression.
It's because we think it may not help very much, and because it's not
worth firing up a ton of workers for a relatively small scan given
that workers are a limited resource. I think once we've got a bunch
of workers started, we might as well try to use them.
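The two ideas above can be sketched in a few lines. This is a hypothetical illustration, not PostgreSQL source: `append_parallel_workers` and `simulate` are made-up names, and `hard_limit` stands in for something like max_parallel_workers_per_gather. It shows (1) picking the Parallel Append's worker count as the maximum over its children rather than their sum, and (2) letting workers that finish one child scan move on to the remaining children instead of exiting.

```python
# Hypothetical sketch of the heuristic discussed above (not real
# PostgreSQL code; names and limits are illustrative assumptions).

def append_parallel_workers(child_workers, hard_limit=8):
    """Planned worker count for a Parallel Append.

    child_workers: per-child planned parallel_workers values.
    Using max() rather than sum() matches the example in the thread:
    children planned for 4 and 3 workers yield 4, not 7.
    """
    return min(max(child_workers), hard_limit)

def simulate(child_costs, num_workers):
    """Toy simulation: each tick, every worker does one unit of work
    on the child with the most remaining work, so workers freed by a
    finished scan are immediately reused on the other children."""
    remaining = list(child_costs)
    ticks = 0
    while any(r > 0 for r in remaining):
        for _ in range(num_workers):
            i = max(range(len(remaining)), key=lambda j: remaining[j])
            if remaining[i] > 0:
                remaining[i] -= 1
        ticks += 1
    return ticks
```

With children costing 8 and 6 units, 4 workers finish in ceil(14/4) = 4 ticks because no worker ever sits idle once its child's scan completes; the same total work with workers pinned to one child each would leave some of them idle at the end.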
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company