From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan
Date: 2015-02-07 22:16:12
Message-ID: CA+TgmoY_grYf9S3zf6bBsRK_8UudtKrhZdrkDzsEtAALZVHkbw@mail.gmail.com
Lists: pgsql-hackers
On Sat, Feb 7, 2015 at 4:30 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2015-02-06 22:57:43 -0500, Robert Haas wrote:
>> On Fri, Feb 6, 2015 at 2:13 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> > My first comment here is that I think we should actually teach
>> > heapam.c about parallelism.
>>
>> I coded this up; see attached. I'm also attaching an updated version
>> of the parallel count code revised to use this API. It's now called
>> "parallel_count" rather than "parallel_dummy" and I removed some
>> stupid stuff from it. I'm curious to see what other people think, but
>> this seems much cleaner to me. With the old approach, the
>> parallel-count code was duplicating some of the guts of heapam.c and
>> dropping the rest on the floor; now it just asks for a parallel scan
>> and away it goes. Similarly, if your parallel-seqscan patch wanted to
>> scan block-by-block rather than splitting the relation into equal
>> parts, or if it wanted to participate in the synchronized-seqscan
>> stuff, there was no clean way to do that. With this approach, those
>> decisions are - as they quite properly should be - isolated within
>> heapam.c, rather than creeping into the executor.
>
> I'm not convinced that that reasoning is generally valid. While it may
> work out nicely for seqscans - which might be useful enough on its own -
> the more stuff we parallelize the *more* the executor will have to know
> about it to make it sane. To actually scale nicely, e.g. a parallel
> sort will have to execute the nodes below it on each backend, instead
> of doing that in one backend as a separate step, ferrying over all
> tuples to individual backends through queues, and only then
> parallelizing the sort.
>
> Now. None of that is likely to matter immediately, but I think starting
> to build the infrastructure at the points where we'll later need it does
> make some sense.
Well, I agree with you, but I'm not really sure what that has to do
with the issue at hand. I mean, if we were to apply Amit's patch,
we'd be in a situation where, for a non-parallel heap scan, heapam.c
decides the order in which blocks get scanned, but for a parallel heap
scan, nodeParallelSeqscan.c makes that decision. Maybe I'm an old
fuddy-duddy[1] but that seems like an abstraction violation to me. I
think the executor should see a parallel scan as a stream of tuples
flowing into a bunch of backends in parallel, without really knowing
how heapam.c is dividing up the work. That's how it's modularized
today, and I don't see a reason to change it. Do you?
Regarding tuple flow between backends: I've thought about that before,
and I agree that we need it, but I don't think I know how to do it. I can
see how to have a group of processes executing a single node in
parallel, or a single process executing a group of nodes we break off
from the query tree and push down to it, but what you're talking about
here is a group of processes executing a group of nodes jointly. That
seems like an excellent idea, but I don't know how to design it.
Actually routing the tuples between whichever backends we want to
exchange them between is easy enough, but how do we decide whether to
generate such a plan? What does the actual plan tree look like?
Maybe we designate nodes as can-generate-multiple-tuple-streams (seq
scan, mostly, I would think) and can-absorb-parallel-tuple-streams
(sort, hash, materialize), or something like that, but I'm really
fuzzy on the details.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
[1] Actually, there's not really any "maybe" about this.