From: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | robertmhaas(at)gmail(dot)com |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [PoC] Asynchronous execution again (which is not parallel) |
Date: | 2015-12-14 08:34:18 |
Message-ID: | 20151214.173418.241781654.horiguchi.kyotaro@lab.ntt.co.jp |
Lists: | pgsql-hackers |
Hello, thank you for the comment.
At Tue, 8 Dec 2015 10:40:20 -0500, Robert Haas <robertmhaas(at)gmail(dot)com> wrote in <CA+TgmobLEaho40e9puy3pLbeUx_a6hKBoDUqDNQO4rwORUM-eA(at)mail(dot)gmail(dot)com>
> On Mon, Nov 30, 2015 at 7:47 AM, Kyotaro HORIGUCHI
> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> > "Asynchronous execution" is a feature to start substantial work
> > of nodes before doing Exec*. This can reduce total startup time
> > by folding startup time of multiple execution nodes. Especially
> > effective for the combination of joins or appends and their
> > multiple children that needs long time to startup.
> >
> > This patch does that by inserting another phase "Start*" between
> > ExecInit* and Exec* to launch parallel processing including
> > pgworker and FDWs before requesting the very first tuple of the
> > result.
>
> I have thought about this, too, but I'm not very convinced that this
> is the right model. In a typical case involving parallelism, you hope
> to have the Gather node as close to the top of the plan tree as
> possible. Therefore, the start phase will not happen much before the
> first execution of the node, and you don't get much benefit.
Under the Init-Exec semantics, a Gather node cannot start executing
an underlying node, say a Sort, before the upper node requests the
first tuple. Async execution potentially helps in that case as well.
On the other hand, the patch is currently designed on the
assumption that Gather is a node that is driven all the time.
Gather has the same characteristic as Append or MergeAppend, in
the sense that it potentially executes multiple (and various
kinds of) underlying nodes, so the patch should be redesigned
along those lines. As far as I can see for now, however, Gather
executes multiple identical (or divided) scan nodes, so I haven't
made Gather "async-aware". (If I haven't misunderstood.)
And if necessary, we can mark the query as 'async requested' in
the planning phase.
> Moreover, I think that prefetching can be useful not only at the start
> of the query - which is the only thing that your model supports - but
> also in mid-query. For example, consider an Append of two ForeignScan
> nodes. Ideally we'd like to return the results in the order that they
> become available, rather than serially. This model might help with
> that for the first batch of rows you fetch, but not after that.
Yeah, async-exec can have a mechanism similar to Gather's to
fetch tuples from underlying nodes.
> There are a couple of other problems here that are specific to this
> example. You get a benefit here because you've got two Gather nodes
> that both get kicked off before we try to read tuples from either, but
> that's generally something to avoid - you can only use 3 processes and
> typically at most 2 of those will actually be running (as opposed to
Yes, that is one of the reasons why I called the example artificial.
> sleeping) at the same time: the workers will run to completion, and
> then the leader will wake up and do its thing. I'm not saying our
> current implementation of parallel query scales well to a large number
> of workers (it doesn't) but I think that's more about improving the
> implementation than any theoretical problem, so this seems a little
> worse. Also, currently, both merge and hash joins have an
> optimization wherein if the outer side of the join turns out to be
> empty, we avoid paying the startup cost for the inner side of the
> join; kicking off the work on the inner side of the merge join
> asynchronously before we've gotten any tuples from the outer side
> loses the benefit of that optimization.
It is a matter of comparison: async wins if the startup time of
the outer side is longer (to some extent) than the time needed to
build the inner hash. But that requires work in the planner. I'll
take it into account if async exec itself is found to be useful.
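To make the comparison concrete, here is a toy model, not anything
taken from the patch. It assumes that synchronous execution pays the
inner startup only when the outer side is non-empty, and that async
execution overlaps the two startups completely. The function names
and the parameter p_outer_empty (the chance that the outer side
returns no tuples) are hypothetical.

```python
def sync_startup(t_outer, t_inner, p_outer_empty):
    """Expected startup time when the inner side is started only after
    the outer side has produced its first tuple (assumed model)."""
    return t_outer + (1.0 - p_outer_empty) * t_inner


def async_startup(t_outer, t_inner):
    """Startup time when both sides are kicked off together and overlap
    fully (assumed model; ignores the cost of wasted inner work)."""
    return max(t_outer, t_inner)


def async_wins(t_outer, t_inner, p_outer_empty):
    """True if overlapping the startups is faster on average."""
    return async_startup(t_outer, t_inner) < sync_startup(
        t_outer, t_inner, p_outer_empty
    )
```

In this model async loses only when the outer startup is short
relative to the inner build and the outer side is often empty, which
is the case where the existing empty-outer optimization pays off.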
> I suspect there is no single paradigm that will help with all of the
> cases where asynchronous execution is useful. We're going to need a
> series of changes that are targeted at specific problems. For
> example, here it would be useful to have one side of the join confirm
> at the earliest possible stage that it will definitely return at least
> one tuple eventually, but then return control to the caller so that we
> can kick off the other side of the join. The sort node never
> eliminates anything, so as soon as the sequential scan underneath it
> coughs up a tuple, we're definitely getting a return value eventually.
That's quite impressive. But it might be the planner's job.
> At that point it's safe to kick off the other Gather node. I don't
> quite know how to design a signalling system for that, but it could be
> done.
I agree. I'll consider that further.
> But is it important enough to be worthwhile? Maybe, maybe not. I
> think we should be working toward a world where the Gather is at the
> top of the plan tree as often as possible, in which case
> asynchronously kicking off a Gather node won't be that exciting any
> more - see notes on the "parallelism + sorting" thread where I talk
> about primitives that would allow massively parallel merge joins,
> rather than 2 or 3 way parallel.
Could you give me the subject of that thread, or point me to an
important message in it?
> From my point of view, the case
> where we really need some kind of asynchronous execution solution is a
> ForeignScan, and in particular a ForeignScan which is the child of an
> Append. In that case it's obviously really useful to be able to kick
> off all the foreign scans and then return a tuple from whichever one
> coughs it up first. Is that the ONLY case where asynchronous
> execution is useful? Probably not, but I bet it's the big one.
Yes, the most significant and obvious target of async execution
(although its benefit is hard to estimate) is
(Merge)Append-ForeignScan, which is a narrow but frequently used
case. This patch started from it.
That is because of the startup-heavy nature of FDWs. I later added
Sort as a target, and then redesigned the patch to give the ability
to all nodes. If that is obviously overdone for the (currently)
expected benefit, and it is preferable to shrink this patch so that
it touches only the portions where async-exec has a benefit, I'll do
so.
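As a rough illustration of the Append-over-ForeignScan case only,
the sketch below starts every child up front and returns tuples from
whichever child produces one first. It is Python asyncio rather than
executor code, and foreign_scan, async_append, collect and run_demo
are hypothetical names, not part of the patch or of PostgreSQL.

```python
import asyncio


async def foreign_scan(name, rows, delay):
    """Hypothetical stand-in for a ForeignScan: each row arrives after
    `delay` seconds, imitating a slow remote server."""
    for row in rows:
        await asyncio.sleep(delay)
        yield (name, row)


async def async_append(children):
    """Kick off all children first, then yield tuples in arrival order
    instead of draining the children one after another."""
    queue = asyncio.Queue()
    done = object()  # sentinel: one child has been exhausted

    async def drain(child):
        async for tup in child:
            await queue.put(tup)
        await queue.put(done)

    tasks = [asyncio.create_task(drain(c)) for c in children]
    remaining = len(tasks)
    while remaining:
        item = await queue.get()
        if item is done:
            remaining -= 1
        else:
            yield item


async def collect():
    children = [
        foreign_scan("slow", [1, 2], 0.2),
        foreign_scan("fast", [10, 20], 0.01),
    ]
    return [tup async for tup in async_append(children)]


def run_demo():
    return asyncio.run(collect())
```

Calling run_demo() returns all four tuples, with the ones from the
"fast" child arriving before the first one from the "slow" child,
even though "slow" is listed first.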
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center