From: | Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> |
---|---|
To: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> |
Cc: | "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "hlinnaka(at)iki(dot)fi" <hlinnaka(at)iki(dot)fi>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Asynchronous execution on FDW |
Date: | 2015-07-23 09:38:39 |
Message-ID: | 9A28C8860F777E439AA12E8AEA7694F80111BCEC@BPXM15GP.gisp.nec.co.jp |
Lists: | pgsql-hackers |
> > If we have ParallelAppend node that kicks a background worker process for
> > each underlying child node in parallel, does ForeignScan need to do something
> > special?
>
> I don't see the point of the background worker in your story,
> but at least for ParallelMergeAppend, an upper Limit would
> frequently cut the scan short, so one more state, say "setup"
> (meaning a worker is allocated but not yet started), would be
> useful, and the driver node might need to manage the number of
> async executions. Or, conversely, the driven nodes might do so.
>
I had in mind workloads such as a single-shot scan of a large
partitioned fact table on a DWH system. Yep, if the workload is
expected to rescan frequently, the estimated cost will be higher
than that of the existing Append (by the cost of launching
bgworkers), so the planner will reject this path.
Regarding the interaction between Limit and ParallelMergeAppend,
it is probably the best-case scenario, isn't it? If Limit picks
the smallest 1000 rows from a partitioned table consisting of 20
child tables, ParallelMergeAppend can launch 20 parallel jobs,
each of which picks the smallest 1000 rows from its child relation.
It is probably the same job that pass_down_bound() in nodeLimit.c does.
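To illustrate the Limit case above, here is a minimal sketch in Python rather than executor code. It assumes each child relation is just an in-memory list and uses threads as a stand-in for bgworkers; child_top_n() and parallel_merge_append() are hypothetical names, not anything in the patch or in PostgreSQL. Each job returns at most `limit` sorted rows, and the parent merges the partial results and stops at the bound.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def child_top_n(rows, limit):
    """Return the `limit` smallest rows of one child relation, sorted ascending."""
    return heapq.nsmallest(limit, rows)


def parallel_merge_append(children, limit):
    """Run one bounded top-N job per child, then merge and apply the limit.

    Threads stand in for background workers here (an assumption made only
    for illustration); `children` must be a non-empty list of row lists.
    """
    with ThreadPoolExecutor(max_workers=len(children)) as pool:
        partials = list(pool.map(lambda rows: child_top_n(rows, limit), children))
    # Every partial result is already sorted, so a k-way merge is enough,
    # and the parent never needs more than `limit` rows from any child.
    return list(islice(heapq.merge(*partials), limit))


if __name__ == "__main__":
    # 20 child tables of 5000 rows each, in a deterministic scrambled order.
    children = [[(7919 * i + k) % 100003 for i in range(5000)] for k in range(20)]
    top = parallel_merge_append(children, 1000)
    print(len(top), top == sorted(top))
```

The point of the sketch is only that the bound can be applied inside every child job, so the parent handles at most 20 x 1000 rows instead of the whole table.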
> As for ForeignScan, it is merely an API for FDWs and does nothing
> substantial, so it would have nothing special to do. As for
> postgres_fdw, the current patch itself restricts execution to one
> at a time per foreign server. We would have to provide another
> execution management mechanism if we want two or more
> simultaneous scans per foreign server.
>
Yep, your 4th patch defines a new callback in FdwRoutine and the
5th patch implements the postgres_fdw-specific portion.
It should work well for distributed / sharded database environments;
however, its benefit is limited to ForeignScan.
Once a management node kicks off the underlying SeqScan, ForeignScan,
or other nodes in parallel, local heap scans can also run
asynchronously.
> Sorry for the unfocused discussion, but does this answer some of
> your questions?
>
Hmm... Its advantage is still unclear to me. However, it is not
fair to hijack this thread with my idea.
I'll submit my design proposal for ParallelAppend for the
next commit-fest. Please comment on it.
> > The expected waste of CPU or I/O is a common problem to be solved; however,
> > I don't think it requires special-case handling in ForeignScan.
> > What is your opinion?
>
> I agree with you that ForeignScan, as the wrapper for FDWs, doesn't
> need anything special for this case. I suppose for now that
> avoiding the penalty from abandoning too many speculatively
> executed scans (or other work on bgworkers, such as sorts) would be
> the business of the node above the FDWs, or somewhere else.
>
> However, I haven't dismissed the possibility that some common
> work related to resource management could be integrated into the
> executor (or even into the planner), but I see none for now.
>
I also agree that it will be needed "eventually", but it may not be
supported in the first version.
Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>