Re: [HACKERS] Partition-wise aggregation/grouping

From: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Partition-wise aggregation/grouping
Date: 2018-03-06 14:22:10
Message-ID: CAM2+6=W+16AGny3i8g+vTNH1jeGyczpHaykKuw==Xo0Cqo7hjw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 6, 2018 at 4:59 PM, Ashutosh Bapat <
ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:

> Hi Jeevan,
> I am back reviewing this. Here are some comments.
>
> @@ -1415,7 +1413,8 @@ add_paths_to_append_rel(PlannerInfo *root,
> RelOptInfo *rel,
> * the unparameterized Append path we are constructing for the
> parent.
> * If not, there's no workable unparameterized path.
> */
> - if (childrel->cheapest_total_path->param_info == NULL)
> + if (childrel->pathlist != NIL &&
> + childrel->cheapest_total_path->param_info == NULL)
> accumulate_append_subpath(childrel->cheapest_total_path,
> &subpaths, NULL);
> else
> @@ -1683,6 +1682,13 @@ add_paths_to_append_rel(PlannerInfo *root,
> RelOptInfo *rel,
> RelOptInfo *childrel = (RelOptInfo *) lfirst(lcr);
> Path *subpath;
>
> + if (childrel->pathlist == NIL)
> + {
> + /* failed to make a suitable path for this child */
> + subpaths_valid = false;
> + break;
> + }
> +
> When can childrel->pathlist be NIL?
>
> diff --git a/src/backend/optimizer/plan/createplan.c
> b/src/backend/optimizer/plan/createplan.c
> index 9ae1bf3..f90626c 100644
> --- a/src/backend/optimizer/plan/createplan.c
> +++ b/src/backend/optimizer/plan/createplan.c
> @@ -1670,7 +1670,15 @@ create_sort_plan(PlannerInfo *root, SortPath
> *best_path, int flags)
> subplan = create_plan_recurse(root, best_path->subpath,
> flags | CP_SMALL_TLIST);
>
> - plan = make_sort_from_pathkeys(subplan, best_path->path.pathkeys,
> NULL);
> + /*
> + * In find_ec_member_for_tle(), child EC members are ignored if they
> don't
> + * belong to the given relids. Thus, if this sort path is based on a
> child
> + * relation, we must pass the relids of it. Otherwise, we will end-up
> into
> + * an error requiring pathkey item.
> + */
> + plan = make_sort_from_pathkeys(subplan, best_path->path.pathkeys,
> + IS_OTHER_REL(best_path->subpath->parent)
> ?
> + best_path->path.parent->relids : NULL);
>
> copy_generic_path_info(&plan->plan, (Path *) best_path);
>
> Please separate this small adjustment in a patch of its own, with some
> explanation of why we need it i.e. now this function can see SortPaths from
> child (other) relations.
>
> + if (child_data)
> + {
> + /* Must be other rel as all child relations are marked OTHER_RELs
> */
> + Assert(IS_OTHER_REL(input_rel));
>
> I think we should check IS_OTHER_REL() and Assert(child_data). That way we
> know
> that the code in the if block is executed for OTHER relation.
>
> - if ((root->hasHavingQual || parse->groupingSets) &&
> + if (!child_data && (root->hasHavingQual || parse->groupingSets) &&
>
> Degenerate grouping will never see child relations, so instead of checking
> for
> child_data, Assert (!IS_OTHER_REL()) inside this block. Add a comment there
> explaining the assertion.
>
> + *
> + * If we are performing grouping for a child relation, fetch can_sort
> from
> + * the child_data to avoid re-calculating same.
> */
> - can_sort = (gd && gd->rollups != NIL)
> - || grouping_is_sortable(parse->groupClause);
> + can_sort = child_data ? child_data->can_sort : ((gd &&
> gd->rollups != NIL) ||
> +
> grouping_is_sortable(parse->groupClause));
>
> Instead of adding a conditional here, we can compute these values before
> create_grouping_paths() is called from grouping_planner() and then pass
> them
> down to try_partitionwise_grouping(). I have attached a patch which
> refactors
> this code. Please see if this refactoring is useful. In the attached
> patch, I
> have handled can_sort, can_hash and partial aggregation costs. More on the
> last
> component below.
>

Changes look good to me and refactoring will be useful for partitionwise
patches.

However, will it be good if we add agg_costs into the GroupPathExtraData
too?
Also can we pass this to the add_partial_paths_to_grouping_rel() and
add_paths_to_grouping_rel() to avoid passing can_sort, can_hash and costs
related details individually to them?

>
>
> /*
> * Figure out whether a PartialAggregate/Finalize Aggregate execution
> @@ -3740,10 +3789,8 @@ create_grouping_paths(PlannerInfo *root,
> * partial paths for partially_grouped_rel; that way, later code can
> * easily consider both parallel and non-parallel approaches to
> grouping.
> */
> - if (try_parallel_aggregation)
> + if (!child_data && !(agg_costs->hasNonPartial ||
> agg_costs->hasNonSerial))
> {
> - PathTarget *partial_grouping_target;
> -
> [... clipped ...]
> + get_agg_clause_costs(root, havingQual,
> AGGSPLIT_FINAL_DESERIAL,
> - &agg_final_costs);
> + agg_final_costs);
> }
> + }
>
> With this change, we are computing partial aggregation costs even in
> the cases when
> those will not be used e.g. when there are no children and parallel paths
> can
> not be created. In the attached patch, I have refactored the code such that
> they are computed when they are needed the first time and re-used later.
>
> + if (child_data)
> + {
> + partial_grouping_target = child_data->partialRelTarget;
> + partially_grouped_rel->reltarget = partial_grouping_target;
> + agg_partial_costs = child_data->agg_partial_costs;
> + agg_final_costs = child_data->agg_final_costs;
> + }
>
> I think, with the refactoring, we can get rid of the last two lines here. I
> think we can get rid of this block entirely, but I have not reviewed the
> entire
> code to confirm that.
>
> static PathTarget *
> -make_partial_grouping_target(PlannerInfo *root, PathTarget
> *grouping_target)
> +make_partial_grouping_target(PlannerInfo *root,
> + PathTarget *grouping_target,
> + Node *havingQual)
> This looks like a refactoring change. Should go to one of the refactoring
> patches or in a patch of its own.
>
> This isn't full review. I will continue reviewing this further.
>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
>

--
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Steele 2018-03-06 14:25:48 Re: Re: [HACKERS] Support to COMMENT ON DATABASE CURRENT_DATABASE
Previous Message Aleksandr Parfenov 2018-03-06 14:18:29 Re: Flexible configuration for full-text search