From: | James Coleman <jtc331(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, luis(dot)roberto(at)siscobra(dot)com(dot)br, pgsql-bugs <pgsql-bugs(at)lists(dot)postgresql(dot)org>, "alan(dot)formagi" <alan(dot)formagi(at)siscobra(dot)com(dot)br>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
Subject: | Re: segfault with incremental sort |
Date: | 2020-11-25 14:57:37 |
Message-ID: | CAAaqYe9=YecnSFRx6LUysG1Hw1Z5QxZLbrC-E_48L_-bDCey-w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
On Wed, Nov 25, 2020 at 9:05 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Nov 25, 2020 at 7:57 AM James Coleman <jtc331(at)gmail(dot)com> wrote:
> >
> > On Mon, Nov 23, 2020 at 9:44 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Mon, Nov 23, 2020 at 7:23 PM James Coleman <jtc331(at)gmail(dot)com> wrote:
> > > >
> > >
> > > Yeah, this doesn't require any communication between leader and worker
> > > but to enable it for such cases, we need to identify when we have
> > > correlated subquery where we can make it parallel-safe and then
> > > probably allow it. IIRC, we only allow cases of un-correlated subplans
> > > and initplans when those are at the same or a level above the Gather
> > > (Merge) node.
> >
> > In principle do you see any reason why given:
> >
> > select distinct
> > unique1,
> > (select t.unique1 from tenk1 where tenk1.unique1 = t.unique1)
> > from tenk1 t, generate_series(1, 1000);
> >
> > it wouldn't be valid to go from the current (with patch from [1]) plan of:
> >
> > Unique
> > -> Sort
> > Sort Key: t.unique1, ((SubPlan 1))
> > -> Gather
> > Workers Planned: 2
> > -> Nested Loop
> > -> Parallel Index Only Scan using tenk1_unique1 on tenk1 t
> > -> Function Scan on generate_series
> > SubPlan 1
> > -> Index Only Scan using tenk1_unique1 on tenk1
> > Index Cond: (unique1 = t.unique1)
> >
> > to this plan?
> >
> > Unique
> > -> Gather Merge
> > Workers Planned: 2
> > -> Sort
> > Sort Key: t.unique1, ((SubPlan 1))
> > -> Nested Loop
> > -> Parallel Index Only Scan using tenk1_unique1 on tenk1 t
> > -> Function Scan on generate_series
> > SubPlan 1
> > -> Index Only Scan using tenk1_unique1 on tenk1
> > Index Cond: (unique1 = t.unique1)
> >
> > My intuition is that it would be safe (not just in the parallel sense
> > but also in the sense of safety for pushing down expression evaluation
> > into the target list to push the sort down), but I want to make sure
> > that's correct before proceeding.
> >
>
> I don't see any problem with respect to parallel-safety but why do we
> want to generate parallel-plans for correlated sub-queries without
> doing more analysis on which kind of cases it would be safe apart from
> this particular case?
I should clarify: the questions in my last email aren't about what we
should do in PG13, but I wanted to make sure I understand the
theoretical/project policy bounds before pursuing additional work
here.
> > But I also have a bigger question:
> >
> > I've been stepping through this in the debugger, and have noticed that
> > the only reason we aren't selecting the second plan (again, with the
> > fix from [1]) is that the plan for "Index Only Scan using
> > tenk1_unique1 on tenk1" when passed into build_subplan has
> > "plan->parallel_safe = false". Earlier "consider_parallel = false" has
> > been set on the path by set_rel_consider_parallel because
> > is_parallel_safe doesn't find any safe_param_ids, and thus the
> > correlated subquery having a PARAM_EXEC param fails the safe_param_ids
> > member check in max_parallel_hazard_walker since the param isn't from
> > this level.
> >
> > So far that's correct, but then when we come back around to checking
> > parallel safety of the expression for a parallel sort we call
> > is_parallel_safe, and the max_parallel_hazard_walker SubPlan case
> > checks subplan->parallel_safe, finds it false
> > ("max_parallel_hazard_test(PROPARALLEL_RESTRICTED, context)" returns
> > true), and thus doesn't proceed to actually checking the subplan's
> > testexpr or args.
> >
> > That seems a bit of miss to me, because while such a subplan is
> > parallel unsafe in isolation, it is not in this context.
> >
>
> It is possible but we don't that the context unless subplan is marked
> parallel-safe. I think if we need to generate parallel-safe plans for
> correlated queries, we might want to see how the subplan will be
> marked parallel-safe.
The one possibility I can imagine right now is maintaining some kind
of flag that marks a subplan as "parallel safe except for params" and
then recheck the params to see if the specific usage is safe. Does
something like that seem reasonable? Other than the param, the subplan
would be marked safe by the existing code.
> > That is, if
> > we re-run the checks on testexpr and args in this context, then we'll
> > not find anything parallel unsafe about them (the param can safely be
> > found in the current context), so we'll be free to execute the subplan
> > in workers.
> >
> > > Now, I think it is quite possible that in PG-13 we have
> > > unintentionally allowed plans containing correlated sub-queries but
> > > missed to take care of it at all required places.
> >
> > Yes, we're discussing a fix in [1] for PG13's missing checks for
> > parallel safety here.
> >
>
> So, are you trying to make such plans (which have correlated
> sub-queries) parallel-safe or parallel-unsafe?
In PG13 we accidentally made this case generate a parallel plan
because we were missing a parallel safety check when choosing a sort
expression. So for PG13's next patch release we'll be looking to get
in that parallel safety check, which will result in this particular
plan being excluded.
Separately I'm looking into why this case wasn't considered parallel
safe anyways (since it obviously is in reality) with the possibility
of working on a patch to improve that situation for PG14.
James
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2020-11-25 15:03:53 | Re: BUG #16743: psql doesn't show whole expression in stored column |
Previous Message | Amit Kapila | 2020-11-25 14:06:25 | Re: segfault with incremental sort |