From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan
Date: 2015-05-06 13:40:49
Message-ID: CA+TgmoZLaoKOWtYXxzbJt4900E=-bNy_P2=fco+1NgwpA+gRcw@mail.gmail.com
Lists: pgsql-hackers
On Wed, May 6, 2015 at 7:55 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>> - I believe the separation of concerns between ExecFunnel() and
>>> ExecEndFunnel() is not quite right. If the scan is shut down before
>>> it runs to completion (e.g. because of LIMIT), then I think we'll call
>>> ExecEndFunnel() before ExecFunnel() hits the TupIsNull(slot) path. I
>>> think you probably need to create a static subroutine that is called
>>> both as soon as TupIsNull(slot) becomes true and from ExecEndFunnel(),
>>> in each case cleaning up whatever resources remain.
>
>> Right, will fix as per suggestion.
>
> I observed one issue while working on this review comment. When we
> try to destroy the parallel setup via ExecEndNode() (because, with a
> Limit node, it could not be destroyed after consuming all tuples),
> the master waits for the parallel workers to finish
> (WaitForParallelWorkersToFinish()), while the parallel workers are
> waiting for the master backend to signal them because their queues
> are full. I think in such a case the master backend needs to inform
> the workers, either when the scan is discontinued due to the Limit
> node or while waiting for the parallel workers to finish.
Isn't this why TupleQueueFunnelShutdown() calls shm_mq_detach()?
That's supposed to unstick the workers; any impending or future writes
will just return SHM_MQ_DETACHED without waiting.
>> That's technically true, but the incremental work involved in
>> supporting a new worker is extremely small compared with worker
>> startup times. I'm guessing that the setup cost is going to be on
>> the order of hundreds of thousands or millions and the startup cost
>> is going to be on the order of tens or ones.
>
> Can we safely estimate the cost of restoring parallel state (GUCs,
> combo CID, transaction state, snapshot, etc.) in each worker as a
> setup cost? There could be substantial work like the restoration of
> locks (acquiring all or relevant locks at the start of each parallel
> worker, if we follow your proposed design; even if we don't, there
> could be similar work) that we need to do for each worker. If you
> think restoring parallel state in each worker is a pretty small
> amount of work, then what you say makes sense to me.
Well, all the workers restore that state in parallel, so adding it up
across all workers doesn't really make sense. But anyway, no, I don't
think that's a big cost. I think the big cost is going to be the
operating system overhead of process creation. The new process will
incur lots of page faults as it populates its address space and
dirties pages marked copy-on-write. That's where I expect most of the
expense to be.
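To make those magnitudes concrete with invented numbers (this is just
illustrative arithmetic, not the planner's costing code), consider a
toy model total(n) = setup + startup * n + run / n:

    #include <stdio.h>

    int
    main(void)
    {
        double      setup_cost = 100000.0;  /* one-time: DSM, state restore */
        double      startup_cost = 10.0;    /* increment per added worker */
        double      run_cost = 1000000.0;   /* scan work, split among workers */
        int         n;

        for (n = 1; n <= 8; n++)
            printf("workers=%d total=%.0f\n",
                   n, setup_cost + startup_cost * n + run_cost / n);
        return 0;
    }

The startup term never climbs past the tens while everything else
sits in the hundreds of thousands, which is the sense in which each
additional worker is nearly free once you've paid for the first one.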
>> And I actually hope you *can't* present some contrary evidence.
>> Because if you can, then that might mean that we need to cost every
>> possible path from 0 up to N workers and let the costing machinery
>> decide which one is better.
>
> Not necessarily; we can follow a rule that the number of workers used
> for any parallel statement is equal to the degree of parallelism
> (parallel_seqscan_degree) as set by the user. I think we need to do
> some splitting up of the number of workers when there are multiple
> parallel operations in a single statement (like a sort plus a
> parallel scan).
Yeah. I'm hoping we will be able to use the same pool of workers for
multiple operations, but I realize that's a feature we haven't
designed yet.
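For concreteness, the even-split rule described above might look
something like this (entirely hypothetical: parallel_seqscan_degree is
the patch's GUC, the rest is invented for illustration):

    static int
    workers_per_operation(int num_parallel_ops)
    {
        if (num_parallel_ops <= 0)
            return 0;

        /* Split the user's degree evenly; each operation gets at least one. */
        return Max(1, parallel_seqscan_degree / num_parallel_ops);
    }

A shared worker pool would presumably replace a static split like
this with something dynamic.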
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company