Re: fixing subplan/subquery confusion

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fixing subplan/subquery confusion
Date: 2016-07-02 04:12:42
Message-ID: CAA4eK1+zec=H2SKt4iGp2QtZOhh_Y1fj7hTjCaSOchWqjCyvGg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Jul 2, 2016 at 12:29 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Jun 30, 2016 at 6:49 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> I haven't had a chance to do this yet, so I'm going to do it tomorrow instead.
>
> I dug into this a bit and found more problems. I wondered why Tom's
> patch did this:
>
> ! if (has_parallel_hazard((Node *) rte->subquery, false))
> ! return;
> ! break;
>
> Instead of just doing this:
>
> - return;
> + break;
>
> After all, the code that built subquery paths ought to be sufficient
> to find any parallel hazards during subquery planning; there seems to
> be no especially-good reason why we should walk the whole query tree
> searching for the again. So I changed that and ran the regression
> tests. With force_parallel_mode=on, things got unhappy on exactly one
> query:
>
> SELECT * FROM
> (SELECT a || b AS ab FROM t1
> UNION ALL
> SELECT ab FROM t2) t
> ORDER BY 1 LIMIT 8;
>
> This failed with a complaint about parallel workers trying to touch
> temporary relations, which turns out to be pretty valid since t1 and
> t2 are BOTH temporary relations. This turns out to be another facet
> of my ignorance about how subqueries can be pulled up to create
> appendrels with crazy things in them. set_rel_consider_parallel()
> looks at the appendrel and thinks everything is fine, because the
> reference to temporary tables are buried inside the appendrel members,
> which are note examined.
>
> I think it's hard to avoid the conclusion that
> set_rel_consider_parallel() needs to run on other member rels as well
> as baserels. We might be able to do that only for cases where the
> parent is a subquery RTE, since if the parent is a baserel then I
> think we must have just a standard inheritance hierarchy and things
> might be OK, but even then, I fear there might be problems with RLS.
> Anyway, the attached patch solves the problem in a fairly "brute
> force" fashion. We loop over all basrels and other member rels and
> set consider_parallel according to their properties. Then, when
> setting base rel sizes, we clear consider_parallel for any parents if
> it's clear for any of the children. Finally, before setting base rel
> pathlists for appendrel children, we clear the flag for the child if
> it's meanwhile been cleared for the parent. Thus, the parents and
> children always agree and only consider parallel paths for any of the
> rels if they're all OK. This seems a bit grotty to me so suggestions
> are welcome.
>

@@ -966,8 +985,9 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
continue;
}
- /* Copy consider_parallel flag from parent. */
- childrel->consider_parallel = rel->consider_parallel;
+ /* If any childrel is not parallel-safe, neither is parent. */
+ if (!childrel->consider_parallel)
+ rel->consider_parallel = false;

Doing this way misses the point that adjust_appendrel_attrs() can
change the targetlist. We should consider it for parallel-safety after
it gets modified. I think that point can be addressed if try to set
consider_parallel for child rels as I have done in patch [1].

/*
+ * If the parent isn't eligible for parallelism, there's no point
+ * in considering it for the children. (This might change someday
+ * if we have a way to build an Append plan where some of the child
+ * plans are forced to run in the parent and others can run in any
+ * process, but for now there's no point in expending cycles building
+ * childrel paths we can't use.)
+ */

+ if (!rel->consider_parallel)
+ childrel->consider_parallel = false;
+

What exactly makes Append plan to not able to run some of the child
nodes is other process?

[1] - https://www.postgresql.org/message-id/CAA4eK1Jg_GvaTEjJSaV5vZY6acDmi-B3iXWvdiXa%2BpUFbnkyTg%40mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2016-07-02 04:40:31 Re: Forthcoming SQL standards about JSON and Multi-Dimensional Arrays (FYI)
Previous Message Noah Misch 2016-07-02 03:37:06 Re: Bug in batch tuplesort memory CLUSTER case (9.6 only)