Re: Performance Issue on Query 18 of TPC-H Benchmark

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Ba Jinsheng <bajinsheng(at)u(dot)nus(dot)edu>
Cc: "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: Performance Issue on Query 18 of TPC-H Benchmark
Date: 2024-10-15 21:56:08
Message-ID: CAApHDvpbwaYyOwvUmNfaxpejDE2MO6C7=RsScrHv0m-=sm5d7A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Wed, 16 Oct 2024 at 07:29, Ba Jinsheng <bajinsheng(at)u(dot)nus(dot)edu> wrote:
> While I found that if we disable the path sorting here:
> diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
> index 0c7273b9cc..91320f473a 100644
> --- a/src/backend/optimizer/plan/planner.c
> +++ b/src/backend/optimizer/plan/planner.c
> @@ -6928,7 +6928,7 @@ make_ordered_path(PlannerInfo *root, RelOptInfo *rel, Path *path,
> path->pathkeys,
> &presorted_keys);
>
> - if (!is_sorted)
> + if (false)
> {
> /*
> * Try at least sorting the cheapest path and also try incrementally
>
> Both the performance and estimated cost are reduced around 40%:

> -> Partial GroupAggregate (cost=214938.45..317625.48 rows=211468 width=71) (actual time=1616.827..1648.580 rows=3 loops=3)
> Group Key: customer.c_custkey, orders.o_orderkey
> -> Nested Loop (cost=214938.45..313396.12 rows=211468 width=44) (actual time=1609.796..1648.571 rows=21 loops=3)
> -> Parallel Hash Join (cost=214938.02..249103.36 rows=52844 width=43) (actual time=1609.777..1648.532 rows=3 loops=3)
> Hash Cond: (orders.o_custkey = customer.c_custkey)
> -> Hash Join (cost=209931.77..243958.39 rows=52844 width=24) (actual time=1573.950..1612.634 rows=3 loops=3)
> Hash Cond: (orders.o_orderkey = lineitem_1.l_orderkey)
> -> Parallel Seq Scan on orders (cost=0.00..32386.00 rows=625000 width=20) (actual time=0.016..32.211 rows=500000 loops=3)

GroupAggreate needs sorted input, so what you've done isn't valid.

David

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Fabio R. Sluzala 2024-10-15 22:24:21 Re: BUG #18657: Using JSON_OBJECTAGG with volatile function leads to segfault
Previous Message Ba Jinsheng 2024-10-15 18:28:47 Performance Issue on Query 18 of TPC-H Benchmark