Quick Links

Re: Parallel Append implementation

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc:	Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Parallel Append implementation
Date:	2017-10-02 12:51:35
Message-ID:	CA+TgmobG-Q=-TvdXxSxwXr6nOc-fdTDR57mYrsNKneq2H+HH0g@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Sun, Oct 1, 2017 at 9:55 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> Isn't it for both? I mean it is about comparing the non-partial paths
> for child relations of the same relation and also when there are
> different relations involved as in Union All kind of query. In any
> case, the point I was trying to say is that generally non-partial
> relations will have relatively smaller scan size, so probably should
> take lesser time to complete.

I don't think that's a valid inference. It's true that a relation
could fail to have a partial path because it's small, but that's only
one reason among very many. The optimal index type could be one that
doesn't support parallel index scans, for example.

> Sure, I think it is quite good if we can achieve that but it seems to
> me that we will not be able to achieve that in all scenario's with the
> patch and rather I think in some situations it can result in leader
> ended up picking the costly plan (in case when there are all partial
> plans or mix of partial and non-partial plans). Now, we are ignoring
> such cases based on the assumption that other workers might help to
> complete master backend. I think it is quite possible that the worker
> backends picks up some plans which emit rows greater than tuple queue
> size and they instead wait on the master backend which itself is busy
> in completing its plan. So master backend will end up taking too much
> time. If we want to go with a strategy of master (leader) backend and
> workers taking a different strategy to pick paths to work on, then it
> might be better if we should try to ensure that master backend always
> starts from the place which has cheapest plans irrespective of whether
> the path is partial or non-partial.

I agree that it's complicated to get this right in all cases; I'm
realizing that's probably an unattainable ideal.

However, I don't think ignoring the distinction between partial and
non-partial plans is an improvement, because the argument that other
workers may be able to help the leader is a correct one. You are
correct in saying that the workers might fill up their tuple queues
while the leader is working, but once the leader returns one tuple it
will switch to reading from the queues. Every other tuple can be
returned by some worker. On the other hand, if the leader picks a
non-partial plan, it must run that plan through to completion.

Imagine (a) a non-partial path with a cost of 1000 returning 100
tuples and (b) a partial path with a cost of 10,000 returning 1,000
tuples. No matter which path the leader picks, it has about 10 units
of work to do to return 1 tuple. However, if it picks the first path,
it is committed to doing 990 more units of work later, regardless of
whether the workers have filled the tuple queues or not. If it picks
the second path, it again has about 10 units of work to do to return 1
tuple, but it hasn't committed to doing all the rest of the work of
that path. So that's better.

Now it's hard to get all of the cases right. If the partial path in
the previous example had a cost of 1 crore then even returning 1 tuple
is more costly than running the whole non-partial path. When you
introduce partition-wise join and parallel hash, there are even more
problems. Now the partial path might have a large startup cost, so
returning the first tuple may be very expensive, and that work may
help other workers (if this is a parallel hash) or it may not (if this
is a non-parallel hash). I don't know that we can get all of these
cases right, or that we should try. However, I still think that
picking partial paths preferentially is sensible -- we don't know all
the details, but we do know that they're typically going to at least
try to divide up the work in a fine-grained fashion, while a
non-partial path, once started, the leader must run it to completion.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Re: Parallel Append implementation at 2017-10-01 13:55:13 from Amit Kapila

Responses

Re: Parallel Append implementation at 2017-10-05 10:29:00 from Amit Kapila

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Robert Haas	2017-10-02 13:04:30	Re: CLUSTER command progress monitor
Previous Message	Maksim Milyutin	2017-10-02 12:43:45	Re: [BUG] Cache invalidation for queries that contains const of temporary composite type