From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Noah Misch <noah(at)leadboat(dot)com>, Amit Kapila <amit(dot)kapila(at)enterprisedb(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: fixing consider_parallel for upper planner rels |
Date: | 2016-06-30 00:33:46 |
Message-ID: | CA+TgmoaSjfi4W4nQWr2MtAnVBEDKEv2YPj-SD7n6yndq+9Na+w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Jun 27, 2016 at 4:49 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> A few specific comments:
>
> * Can't we remove wholePlanParallelSafe altogether, in favor of just
> examining best_path->parallel_safe in standard_planner?
>
> * In grouping_planner, I'm not exactly convinced that
> final_rel->consider_parallel can just be copied up without any further
> restrictions. Easiest counterexample is a parallel restricted function in
> LIMIT.
>
> * Why have the try_parallel_path flag at all in create_grouping_paths,
> rather than just setting or not setting grouped_rel->consider_parallel?
> If your thinking is that this is dealing with implementation restrictions
> that might not apply to paths added by an extension, then OK, but the
> variable needs to have a less generic name. Maybe try_parallel_aggregation.
>
> * Copy/paste error in comment in create_distinct_paths: s/window/distinct/
>
> * Not following what you did to apply_projection_to_path, and the comment
> therein isn't helping.
Here is a new patch addressing (I hope) the above comments and a few
other things. Unfortunately, it causes the by-now-familiar regression
test failure in the aggregates test:
*** /Users/rhaas/pgsql/src/test/regress/expected/aggregates.out
2016-06-29 20:04:03.000000000 -0400
--- /Users/rhaas/pgsql/src/test/regress/results/aggregates.out
2016-06-29 20:06:28.000000000 -0400
***************
*** 577,590 ****
explain (costs off)
select max(unique1) from tenk1 where unique1 > 42000;
! QUERY PLAN
! ---------------------------------------------------------------------------
! Result
! InitPlan 1 (returns $0)
! -> Limit
! -> Index Only Scan Backward using tenk1_unique1 on tenk1
! Index Cond: ((unique1 IS NOT NULL) AND (unique1 > 42000))
! (5 rows)
select max(unique1) from tenk1 where unique1 > 42000;
max
--- 577,588 ----
explain (costs off)
select max(unique1) from tenk1 where unique1 > 42000;
! QUERY PLAN
! ----------------------------------------------------
! Aggregate
! -> Index Only Scan using tenk1_unique1 on tenk1
! Index Cond: (unique1 > 42000)
! (3 rows)
select max(unique1) from tenk1 where unique1 > 42000;
max
======================================================================
The problem, of course, is that scanning tenk1_unique1 forward to find
0 rows at the end of the index and scanning it backward to find 0 rows
at the end of the index cost almost exactly the same amount, which is
probably a good thing when you stop to think about it. On my system,
the expected plan costs 4.31 and the plan that actually shows up costs
4.32, so we end up picking the latter because, since it does not
involve subqueries, it's parallel-safe.
It's tempting to remove the test on the theory that there's actually
little point in verifying which of two essentially-equivalent plans
get chosen, but I think that's probably missing the point: the next
test runs the same query without EXPLAIN, presumably to verify that
the MinMaxAggPath approach gets the right *answer* when there are no
matching rows, and in order for that test to mean anything, we need to
first verify that we're actually getting the plan whose execution-time
behavior we want to check.
Unfortunately, I can't think of a clean way to make the "right" thing
happen here. In general, it's right to prefer a parallel-safe plan
over an ever-so-slightly more expensive plan that isn't, because that
might let us join it to a partial path later on. However, if we're at
the very top of the plan tree, that argument holds no force, so we
could try to install some kind of exception for that case. That seems
like it would be fairly invasive and fairly ugly, though. It seems
like a better option would be to try to kludge the test case somehow
so that the backward index scan looks at least 1% cheaper than the
forward index scan, but I can't see how to do that. I don't see, for
example, that changing any of the usual costing factors would help
much. We could do something very specialized here, like adding a GUC
that makes MinMaxAggPath always win, but that seems ugly, too.
Suggestions?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment | Content-Type | Size |
---|---|---|
upperrel-consider-parallel-v2.patch | invalid/octet-stream | 14.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2016-06-30 01:50:56 | Re: A couple of cosmetic changes around shared memory code |
Previous Message | Tom Lane | 2016-06-29 23:21:58 | Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready) |