Re: Question of Parallel Hash Join on TPC-H Benchmark

From: Ba Jinsheng <bajinsheng(at)u(dot)nus(dot)edu>
To: Andrei Lepikhov <lepihov(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(at)vondra(dot)me>, Zhang Mingli <zmlpostgres(at)gmail(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: Question of Parallel Hash Join on TPC-H Benchmark
Date: 2024-10-12 18:11:04
Message-ID: SEZPR06MB649483EF863B323B090986AD8A7A2@SEZPR06MB6494.apcprd06.prod.outlook.com
Lists: pgsql-bugs

> Could you provide SQL dump and settings to play with this case locally?
The dump file is too big, so I put it on Google Drive: https://drive.google.com/file/d/1e0s6ZLKLEPbZzS6BzftwpmVspWi7Okd1/view?usp=sharing

I have also shared my data directory here: https://drive.google.com/file/d/1ZBLHanIRwxbaMQIhRUSPv4I7y8g_0AWi/view?usp=sharing

> Also, I usually force parallel workers with settings like below:
>
> max_parallel_workers_per_gather = 32
> min_parallel_table_scan_size = 0
> min_parallel_index_scan_size = 0
> max_worker_processes = 64
> parallel_setup_cost = 0.001
> parallel_tuple_cost = 0.0001
I tried these configuration parameters and still got the same query plan: the Hash Join on the fifth line of the plan is still not parallel, while the Hash Joins below it are parallel.
This is an inefficient query plan.
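
For anyone reproducing this, here is a minimal sketch of how the quoted settings could be applied before inspecting the plan. It assumes a superuser session on a test instance; the query itself is not included in this message, so it appears only as a placeholder comment.

```sql
-- max_worker_processes can only be set at server start, so it has to go
-- through ALTER SYSTEM (or postgresql.conf) followed by a server restart.
ALTER SYSTEM SET max_worker_processes = 64;

-- After the restart, open a new session and apply the remaining settings,
-- which can all be changed per session with SET.
SET max_parallel_workers_per_gather = 32;
SET min_parallel_table_scan_size = 0;
SET min_parallel_index_scan_size = 0;
SET parallel_setup_cost = 0.001;
SET parallel_tuple_cost = 0.0001;

-- Confirm the values that are actually in effect.
SHOW max_worker_processes;
SHOW max_parallel_workers_per_gather;

-- Then check which Hash Join nodes are parallel in the resulting plan.
-- Placeholder only: substitute the TPC-H query under discussion.
-- EXPLAIN (ANALYZE) <query>;
```

Note that the number of workers actually launched is also capped by max_parallel_workers, which is not in the quoted list, so a high max_parallel_workers_per_gather alone may not raise it.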

I changed the code to generate an efficient query plan (only the Hash Join on the fifth line is parallel), so I am wondering whether the code could be optimized to produce this efficient query plan by default. I believe that, at the very least, it would improve the performance of PostgreSQL on the standard TPC-H benchmark.
If needed, I can provide my Docker environment for your analysis.

Best regards,

Jinsheng Ba

Notice: This email is generated from the account of an NUS alumnus. Contents, views, and opinions therein are solely those of the sender.
