From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Subject: Re: [DESIGN] ParallelAppend
Date: 2015-07-29 02:02:23
Message-ID: 9A28C8860F777E439AA12E8AEA7694F80111F9F0@BPXM15GP.gisp.nec.co.jp
Lists: pgsql-hackers
> On 2015-07-28 PM 09:58, Kouhei Kaigai wrote:
> >>
> >> From my understanding of the parallel seqscan patch, each worker's
> >> PartialSeqScan asks for a block to scan using a shared parallel heap scan
> >> descriptor that effectively keeps track of the division of work among
> >> PartialSeqScans in terms of blocks. What if we invent a PartialAppend
> >> that each worker would run in the case of a parallelized Append? It
> >> would use some kind of shared descriptor to pick a relation (Append
> >> member) to scan. The shared structure could be the list of sub-plans,
> >> including a mutex for concurrency. It doesn't sound as effective as the
> >> proposed ParallelHeapScanDescData is for PartialSeqScan, but anything
> >> more granular might be complicated. For example, consider a
> >> (current_relation, current_block) pair. If there are more workers than
> >> sub-plans/partitions, then multiple workers might start working on the
> >> same relation after a round-robin assignment of relations (but of
> >> course, a later worker would start scanning from a later block in the
> >> same relation). I imagine that might help with parallelism across
> >> volumes, if that's the case.
> >>
> > I initially thought ParallelAppend would kick off a fixed number of
> > background workers towards the sub-plans, according to the cost estimated
> > at the planning stage. However, I'm now inclined to have each background
> > worker pick up an uncompleted PlannedStmt first. (For more details,
> > please see the reply to Amit Kapila.) That gives a less fine-grained
> > distribution of jobs across workers. Once the number of workers grows
> > larger than the number of volumes / partitions, two or more workers will
> > be assigned to the same PartialSeqScan, which then applies fine-grained
> > job distribution using the shared parallel heap scan.
> >
>
> I like your idea of using round-robin assignment of partial/non-partial
> sub-plans to workers. Do you think there are two considerations of cost
> here? Sub-plans themselves could have parallel paths to consider, and (I
> think) your proposal introduces a new consideration: a plain old
> synchronous Append path vs. a parallel asynchronous Append with a Funnel
> (below/above?) it. I guess the asynchronous version would always be
> cheaper. So, even if we end up with non-parallel sub-plans, do we still
> add a Funnel to make the Append asynchronous? Am I missing something?
>
I expect Funnel itself will gain Append capability, running its sub-plans
in background workers, which simplifies path construction. So, if a Funnel
with multiple sub-plans has a cheaper cost than the Append, it will replace
the AppendPath with a FunnelPath.
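For example, the planner-side choice might look like this (a rough sketch
only; consider_funnel_append() and create_funnel_append_path() are made-up
names, while add_path() is the existing entry point that keeps whichever
of the competing paths is cheaper):

  #include "postgres.h"
  #include "optimizer/pathnode.h"     /* add_path() */

  /*
   * Sketch: let a hypothetical Funnel-driven variant compete with
   * the plain AppendPath on cost.
   */
  static void
  consider_funnel_append(PlannerInfo *root, RelOptInfo *rel,
                         AppendPath *apath)
  {
      Path   *fpath;

      /* a Funnel that runs the same sub-plans in background workers */
      fpath = (Path *) create_funnel_append_path(root, rel,
                                                 apath->subpaths);

      /* add_path() discards whichever of the two paths loses on cost */
      add_path(rel, fpath);
  }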
Regarding the cost estimation, I don't think the parallel version is
always cheaper than the traditional Append, because of the cost of
launching background workers. It increases the startup cost to process
the relation, so if the upper node prefers a small startup cost (as
Limit does), the traditional Append still has an advantage.
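To illustrate the trade-off (a sketch only; the formula and the
parallel_setup_cost knob are illustrative, not what the patch
implements):

  #include "postgres.h"
  #include "nodes/relation.h"         /* Path, Cost, List */

  /* illustrative knob: the cost of launching background workers */
  static Cost parallel_setup_cost = 1000.0;

  /*
   * Sketch of a cost estimate for a Funnel-driven Append.  The
   * worker-launch overhead is pure startup cost, so this path can
   * lose to a plain Append whenever the upper node consumes only a
   * few rows (e.g. Limit), even though its total cost is lower.
   */
  static void
  cost_funnel_append(Path *fpath, List *subpaths, int nworkers)
  {
      Cost        startup = 0.0;
      Cost        run = 0.0;
      ListCell   *lc;

      foreach(lc, subpaths)
      {
          Path   *sp = (Path *) lfirst(lc);

          startup = Max(startup, sp->startup_cost);
          run += sp->total_cost - sp->startup_cost;
      }

      /* workers must be launched before the first row comes back */
      startup += parallel_setup_cost;

      /* sub-plans run concurrently, so the run cost is divided */
      fpath->startup_cost = startup;
      fpath->total_cost = startup +
          run / Min(nworkers, list_length(subpaths));
  }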
Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>