Re: Query running for very long time (server hanged) with parallel append

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: amitdkhan(dot)pg(at)gmail(dot)com
Cc: robertmhaas(at)gmail(dot)com, thomas(dot)munro(at)enterprisedb(dot)com, rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Query running for very long time (server hanged) with parallel append
Date: 2018-02-06 04:41:47
Message-ID: 20180206.134147.120552029.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

At Tue, 06 Feb 2018 13:34:19 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote in <20180206(dot)133419(dot)02213593(dot)horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
> At Mon, 5 Feb 2018 15:29:27 +0530, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote in <CAJ3gD9eFR8=kxjPYBEHe34uT9+RYET0VbhGEfSt79eZx3L9E1Q(at)mail(dot)gmail(dot)com>
> > On 2 February 2018 at 20:46, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > > On Fri, Feb 2, 2018 at 1:43 AM, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
> > >> The query is actually hanging because one of the workers is in a small
> > >> loop where it iterates over the subplans searching for unfinished
> > >> plans, and it never comes out of the loop (it's a bug which I am yet
> > >> to fix). And it does not make sense to keep CHECK_FOR_INTERRUPTS in
> > >> each iteration; it's a small loop that does not pass control to any
> > >> other functions.
> > >
> > > Uh, sounds like we'd better fix that bug.
> >
> >
> > The scenario is this: one of the workers, w1, hasn't yet chosen its
> > first plan, and all the plans are already finished. So w1 has its
> > node->as_whichplan equal to -1, which means the below condition in
> > choose_next_subplan_for_worker() never becomes true for this worker:
> >
> > if (pstate->pa_next_plan == node->as_whichplan)
> > {
> >     /* We've tried everything! */
> >     pstate->pa_next_plan = INVALID_SUBPLAN_INDEX;
> >     LWLockRelease(&pstate->pa_lock);
> >     return false;
> > }
> >
> > What I think is: we should remember which plan we started the search
> > with. So, before the loop starts, we save the starting plan value like
> > this:
> > initial_plan = pstate->pa_next_plan;
> >
> > And then break out of the loop when we come back to this initial plan:
> > if (initial_plan == pstate->pa_next_plan)
> >     break;
> >
> > Now, suppose it so happens that initial_plan is a non-partial plan.
> > And then when we wrap around to the first partial plan, we check
> > (initial_plan == pstate->pa_next_plan) which will never become true
> > because initial_plan is less than first_partial_plan.
> >
> > So what I have done in the patch is: have a flag 'should_wrap_around'
> > indicating whether we need to wrap around. This flag is false when
> > initial_plan is a non-partial plan, in which case we know that we will
> > have covered all plans by the time we reach the last plan. So we break
> > from the loop if this flag is false, or if we have reached the initial
> > plan.
> >
> > Attached is a patch that fixes this issue on the above lines.
>
> The patch adds two new variables and always sets them, but wrap
> around can happen at most once per call, so I think it is enough
> to arrange to stop at wrap-around time. In addition, the patch
> forgets the case of no partial plan; the loop can overrun
> pa_finished in that case.

Sorry, the patch works fine. But the new variables still don't
seem to be needed.
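To see concretely why remembering the starting point terminates the search, the worker's loop can be modeled outside the executor. This is only a toy sketch, not PostgreSQL code: the array of finished flags, the plan indexes, and choose_next_subplan() are simplified stand-ins for the AppendState/ParallelAppendState fields, and the shared state and locking are not modeled:

```c
#include <stdbool.h>

#define INVALID_SUBPLAN_INDEX (-1)

/*
 * Toy model of a worker's subplan search.  'finished' marks plans whose
 * work is done, plans before 'first_partial_plan' are non-partial, and
 * *next_plan is the shared "next plan to hand out" cursor.  Instead of
 * comparing against the caller's current plan (which may still be -1),
 * we remember where the search started and stop when we come back to it.
 * Returns true with the chosen plan in *chosen, or false if every plan
 * is finished.
 */
static bool
choose_next_subplan(const bool *finished, int nplans,
                    int first_partial_plan, int *next_plan, int *chosen)
{
    int  initial_plan = *next_plan;  /* remember the starting point */
    /* starting below first_partial_plan means one pass covers everything */
    bool should_wrap_around = (initial_plan >= first_partial_plan);

    for (;;)
    {
        if (!finished[*next_plan])
        {
            *chosen = *next_plan;
            return true;
        }

        if (*next_plan + 1 < nplans)
            (*next_plan)++;                   /* advance to the next plan */
        else if (should_wrap_around)
            *next_plan = first_partial_plan;  /* loop back to partial plans */
        else
            break;                            /* already covered every plan */

        if (*next_plan == initial_plan)
            break;                            /* we've tried everything */
    }

    *next_plan = INVALID_SUBPLAN_INDEX;
    return false;
}
```

With the original comparison against a current plan of -1, the all-plans-finished case spins forever; comparing against the saved initial_plan makes the loop stop after at most one full pass.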

> I think something like the following would work.
>
> @@ -485,6 +485,13 @@ choose_next_subplan_for_worker(AppendState *node)
>      {
>          /* Loop back to first partial plan. */
>          pstate->pa_next_plan = append->first_partial_plan;
> +
> +        /*
> +         * Arrange to bail out if we are trying to fetch the first partial
> +         * plan.
> +         */
> +        if (node->as_whichplan < append->first_partial_plan)
> +            node->as_whichplan = append->first_partial_plan;
>      }
>      else
>
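The effect of that one-line clamp can also be checked with a toy model (again a simplified sketch, not the executor code: 'whichplan' stands in for node->as_whichplan, and the shared state and locking are omitted). Raising whichplan to first_partial_plan at wrap-around time lets the existing "pa_next_plan == as_whichplan" test fire even for a worker that started at -1:

```c
#include <stdbool.h>

#define INVALID_SUBPLAN_INDEX (-1)

/*
 * Toy model of the existing loop with the clamp applied: the caller's
 * current plan ('whichplan', -1 if none chosen yet) is raised to
 * first_partial_plan when we wrap around, so the "tried everything"
 * test eventually matches even when the caller started from -1 or
 * from a non-partial plan.
 */
static bool
choose_next_clamped(const bool *finished, int nplans,
                    int first_partial_plan, int whichplan,
                    int *next_plan, int *chosen)
{
    for (;;)
    {
        if (*next_plan == whichplan)
        {
            /* We've tried everything! */
            *next_plan = INVALID_SUBPLAN_INDEX;
            return false;
        }
        if (!finished[*next_plan])
        {
            *chosen = *next_plan;
            return true;
        }
        if (*next_plan + 1 < nplans)
            (*next_plan)++;               /* advance to the next plan */
        else
        {
            /* loop back to first partial plan, clamping whichplan */
            *next_plan = first_partial_plan;
            if (whichplan < first_partial_plan)
                whichplan = first_partial_plan;
        }
    }
}
```

Note that because the equality test runs before the finished[] lookup, the model also stops cleanly when there are no partial plans at all (first_partial_plan == nplans), the case the unclamped patch could overrun.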
> > >> But I am not sure about this: while the workers are at it, why
> > >> does the backend that is waiting for the workers not come out of
> > >> the wait state on a SIGINT? I guess the same issue has been
> > >> discussed in the mail thread that you pointed to.
> > >
> > > Is it getting stuck here?
> > >
> > > /*
> > >  * We can't finish transaction commit or abort until all of the workers
> > >  * have exited. This means, in particular, that we can't respond to
> > >  * interrupts at this stage.
> > >  */
> > > HOLD_INTERRUPTS();
> > > WaitForParallelWorkersToExit(pcxt);
> > > RESUME_INTERRUPTS();
> >
> > Actually the backend is getting stuck in
> > choose_next_subplan_for_leader(), in LWLockAcquire(), waiting for the
> > hanging worker to release pstate->pa_lock. I think there is nothing
> > wrong with this, because an LWLock wait is assumed to be very short;
> > it is only because of this bug that the lwlock waits forever.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
