Quick Links

Re: Performance Issue on Query 18 of TPC-H Benchmark

From:	Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To:	Ba Jinsheng <bajinsheng(at)u(dot)nus(dot)edu>, Andrei Lepikhov <lepihov(at)gmail(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject:	Re: Performance Issue on Query 18 of TPC-H Benchmark
Date:	2024-10-16 16:42:52
Message-ID:	955eefa0-2158-4604-b79b-68a087646e91@postgrespro.ru
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-bugs

On 16.10.2024 17:56, Ba Jinsheng wrote:
> >I would like to know if you can improve that case by switching from
> the sorted group to a hashed one.
>
> I used this patch to enable the first HashAggregate only in the query
> plan:
> diff --git a/src/backend/optimizer/plan/planner.c
> b/src/backend/optimizer/plan/planner.c
> index 0c7273b9cc..b410452df1 100644
> --- a/src/backend/optimizer/plan/planner.c
> +++ b/src/backend/optimizer/plan/planner.c
> @@ -6983,8 +6983,9 @@ add_paths_to_grouping_rel(PlannerInfo *root,
> RelOptInfo *input_rel,
> bool can_sort = (extra->flags &
> GROUPING_CAN_USE_SORT) != 0;
> List *havingQual = (List *) extra->havingQual;
> AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
> -
> - if (can_sort)
> + static int call_count = 0;
> + call_count++;
> + if (can_sort && call_count != 2)
> {
> /*
> * Use any available suitably-sorted path as input,
> and also consider
>
> And got this query plan:
> QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=693083.48..*693083.73 *rows=100 width=71) (actual
> time=2624.282..2629.392 rows=9 loops=1)
> -> Sort (cost=693083.48..694352.29 rows=507522 width=71) (actual
> time=2624.281..2629.390 rows=9 loops=1)
> Sort Key: orders.o_totalprice DESC, orders.o_orderdate
> Sort Method: quicksort Memory: 25kB
> -> HashAggregate (cost=658421.05..673686.36 rows=507522
> width=71) (actual time=2624.162..2629.346 rows=9 loops=1)
> Group Key: customer.c_custkey, orders.o_orderkey
> Planned Partitions: 32 Batches: 1 Memory Usage: 793kB
> -> Gather (cost=459569.18..608779.05 rows=507522
> width=44) (actual time=2623.805..2629.229 rows=63 loops=1)
> Workers Planned: 2
> Workers Launched: 2
> -> Nested Loop (cost=458569.18..557026.85
> rows=211468 width=44) (actual time=2581.717..2620.494 rows=21 loops=3)
> -> Parallel Hash Join
> (cost=458568.75..492734.09 rows=52844 width=43) (actual
> time=2581.704..2620.448 rows=3 loops=3)
> Hash Cond: (orders.o_custkey =
> customer.c_custkey)
> -> Hash Join
> (cost=453562.50..487589.13 rows=52844 width=24) (actual
> time=2541.024..2579.759 rows=3 loops=3)
> Hash Cond: (orders.o_orderkey =
> lineitem_1.l_orderkey)
> -> Parallel Seq Scan on orders
> (cost=0.00..32386.00 rows=625000 width=20) (actual time=0.028..32.135
> rows=500000 loops=3)
> -> Hash
> (cost=451977.19..451977.19 rows=126825 width=4) (actual
> time=2515.787..2515.788 rows=9 loops=3)
> Buckets: 131072 Batches:
> 1 Memory Usage: 1025kB
> -> GroupAggregate
> (cost=0.43..451977.19 rows=126825 width=4) (actual
> time=608.052..2515.758 rows=9 loops=3)
> Group Key:
> lineitem_1.l_orderkey
> Filter:
> (sum(lineitem_1.l_quantity) > '314'::numeric)
> Rows Removed by
> Filter: 1499991
> -> Index Scan
> using lineitem_pkey on lineitem lineitem_1 (cost=0.43..416256.96
> rows=6002623 width=9) (actual time=0.043..1399.708 rows=6001215 loops=3)
> -> Parallel Hash
> (cost=4225.00..4225.00 rows=62500 width=23) (actual
> time=39.601..39.602 rows=50000 loops=3)
> Buckets: 262144 Batches: 1
> Memory Usage: 10304kB
> -> Parallel Seq Scan on
> customer (cost=0.00..4225.00 rows=62500 width=23) (actual
> time=0.032..15.561 rows=50000 loops=3)
> -> Index Scan using lineitem_pkey on
> lineitem (cost=0.43..1.06 rows=16 width=9) (actual time=0.012..0.014
> rows=7 loops=9)
> Index Cond: (l_orderkey =
> orders.o_orderkey)
> Planning Time: 1.850 ms
> Execution Time: *2630.023* ms
> (30 rows)
>
> Compared to the query plan with GroupAggregate, both estimated cost
> and execution time are similar and have no significant difference.
>
>
To be honest, I don't quite understand what you are showing. I believe
you are not allowing the optimizer to generate a different aggregation
path (Group Aggregate) because it requires a sort operation. So I think
this is not correct.
>
> >I think it was more interesting when I turned off parallelism and
> tried to build a query plan without AQO, and the execution time there
> was significantly reduced:
> Turning off parallelism only brings this significant performance
> improvement?
>
>
You may notice that disabling parallelism results in improved
cardinality estimation and therefore a better query plan, since the
optimizer selects paths based on their cost. If parallelism is disabled,
query plan have become more correct.

--
Regards,
Alena Rybakina
Postgres Professional

In response to

Re: Performance Issue on Query 18 of TPC-H Benchmark at 2024-10-16 14:56:54 from Ba Jinsheng

Responses

Re: Performance Issue on Query 18 of TPC-H Benchmark at 2024-10-16 18:26:12 from Ba Jinsheng

Browse pgsql-bugs by date

	From	Date	Subject
Next Message	Fujii Masao	2024-10-16 16:59:10	Re: [BUGS] BUG #10123: Weird entries in pg_stat_activity
Previous Message	PG Bug reporting form	2024-10-16 15:32:03	BUG #18658: Assert in SerialAdd() due to race condition