Re: Partition-wise aggregation/grouping

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partition-wise aggregation/grouping
Date: 2017-10-28 09:37:36
Message-ID: CA+TgmoYx_RqnX6bX+TnPTNC7CEfVwi-vO3cmWme1WuZFro=Axg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Oct 27, 2017 at 1:01 PM, Jeevan Chalke
<jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:
> 1. Added separate patch for costing Append node as discussed up-front in the
> patch-set.
> 2. Since we now cost Append node, we don't need
> partition_wise_agg_cost_factor
> GUC. So removed that. The remaining patch hence merged into main
> implementation
> patch.
> 3. Updated rows in test-cases so that we will get partition-wise plans.

With 0006 applied, cost_merge_append() is now a little bit confused:

/*
* Also charge a small amount (arbitrarily set equal to operator cost) per
* extracted tuple. We don't charge cpu_tuple_cost because a MergeAppend
* node doesn't do qual-checking or projection, so it has less overhead
* than most plan nodes.
*/
run_cost += cpu_operator_cost * tuples;

/* Add MergeAppend node overhead like we do it for the Append node */
run_cost += cpu_tuple_cost * DEFAULT_APPEND_COST_FACTOR * tuples;

The first comment says that we don't add cpu_tuple_cost, and the
second one then adds half of it anyway.

I think it's fine to have a #define for DEFAULT_APPEND_COST_FACTOR,
because as you say it's used twice, but I don't think that should be
exposed in cost.h; I'd make it private to costsize.c and rename it to
something like APPEND_CPU_COST_MULTIPLIER. The word DEFAULT, in
particular, seems useless to me, since there's no provision for it to
be overridden by a different value.

What testing, if any, can we think about doing with this plan to make
sure it doesn't regress things? For example, if we do a TPC-H run
with partitioned tables and partition-wise join enabled, will any
plans change with this patch? Do they get faster or not? Anyone have
other ideas for what to test?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2017-10-28 09:50:46 Re: Parallel Append implementation
Previous Message Andrey Borodin 2017-10-28 09:25:16 Re: Reading timeline from pg_control on replication slave