Re: Parallel Append implementation

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Append implementation
Date: 2017-02-19 09:29:23
Message-ID: CA+TgmoZ=MSFuc5qPi-g9pgk9+2bujfVwY9Jfg5F3=XkVwHFnFw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Feb 17, 2017 at 2:56 PM, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
> The log2(num_children)+1 formula which you proposed does not take into
> account the number of workers for each of the subplans, that's why I
> am a bit more inclined to look for some other logic. May be, treat the
> children as if they belong to partitions, and accordingly calculate
> the final number of workers. So for 2 children with 4 and 5 workers
> respectively, Append parallel_workers would be : log3(3^4 + 3^5) .

In general this will give an answer not different by more than 1 or 2
from my answer, and often exactly the same. In the case you mention,
whether we get the same answer depends on which way you round:
log3(3^4+3^5) is 5 if you round down, 6 if you round up.

My formula is more aggressive when there are many subplans that are
not parallel or take only 1 worker, because I'll always use at least 5
workers for an append that has 9-16 children, whereas you might use
only 2 if you do log3(3^0+3^0+3^0+3^0+3^0+3^0+3^0+3^0+3^0). In that
case I like my formula better. With lots of separate children, the
chances of being able to use as many as 5 workers seem good. (Note
that using 9 workers as Ashutosh seems to be proposing would be a
waste if the different children have very unequal execution times,
because the workers that run children with short execution times can
be reused to run additional subplans while the long ones are still
running. Running a separate worker for each child only works out if
the shortest runtime is more than 50% of the longest runtime, which
may sometimes be true but doesn't seem like a good bet in general.)

Your formula is more aggressive when you have 3 children that all use
the same number of workers; it'll always decide on <number of workers
per child>+1, whereas mine won't add the extra worker in that case.
Possibly your formula is better than mine in that case, but I'm not
sure. If you have as many as 9 children that all want N workers, your
formula will decide on N+2 workers, but since my formula guarantees a
minimum of 5 workers in such cases, I'll probably be within 1 of
whatever answer you were getting.

Basically, I don't believe that the log3(n) thing is anything very
special or magical. The fact that I settled on that formula for
parallel sequential scan doesn't mean that it's exactly right for
every other case. I do think it's likely that increasing workers
logarithmically is a fairly decent strategy here, but I wouldn't get
hung up on using log3(n) in every case or making all of the answers
100% consistent according to some grand principal. I'm not even sure
log3(n) is right for parallel sequential scan, so insisting that
Parallel Append has to work that way when I had no better reason than
gut instinct for picking that for Parallel Sequential Scan seems to me
to be a little unprincipled. We're still in the early stages of this
parallel query experiment, and a decent number of these algorithms are
likely to change as we get more sophisticated. For now at least, it's
more important to pick things that work well pragmatically than to be
theoretically optimal.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message husttripper 2017-02-19 09:35:42 Adding new output parameter of pg_stat_statements to identify operation of the query.
Previous Message jasonysli (李跃森) 2017-02-19 09:28:53 Adding new output parameter of pg_stat_statements to identify operation of the query.