From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Parallel Seq Scan |
Date: | 2015-11-17 19:29:48 |
Message-ID: | CA+TgmoY=bGOFUHQQRxOwSsjfbbgmjZiCti_jA2QnZYWgt2Vqew@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Nov 12, 2015 at 10:23 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> Thanks for the report. The reason for this problem is that instrumentation
> information from workers is getting aggregated multiple times. In
> ExecShutdownGatherWorkers(), we call ExecParallelFinish where it
> will wait for workers to finish and then accumulate stats from workers.
> Now ExecShutdownGatherWorkers() could be called multiple times
> (once we read all tuples from workers, at end of node) and it should be
> ensured that repeated calls should not try to redo the work done by first
> call.
> The same is ensured for tuplequeues, but not for parallel executor info.
> I think we can safely assume that we need to call ExecParallelFinish() only
> when there are workers started by the Gathers node, so on those lines
> attached patch should fix the problem.
I suggest that we instead fix ExecParallelFinish() to be idempotent.
Add a "bool finished" flag to ParallelExecutorInfo and return at once
if it's already set. Get rid of the exposed
ExecParallelReinitializeTupleQueues() interface and have
ExecParallelReinitialize(pei) instead. Have that call
ReinitializeParallelDSM(), ExecParallelSetupTupleQueues(pei->pcxt,
true), and set pei->finished = false. I think that would give us a
slightly cleaner separation of concerns between nodeGather.c and
execParallel.c.
Your fix seems a little fragile. You're relying on node->reader !=
NULL to tell you whether the readers need to be cleaned up, but in
fact node->reader is set to a non-NULL value AFTER the pei has been
created. Granted, we currently always create a reader unless we don't
get any workers, and if we don't get any workers then failing to call
ExecParallelFinish is currently harmless, but nonetheless I think we
should be more explicit about this so it doesn't accidentally get
broken later.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Pavel Stehule | 2015-11-17 19:34:03 | Re: proposal: multiple psql option -c |
Previous Message | Tom Lane | 2015-11-17 19:25:25 | Re: proposal: multiple psql option -c |