From: | Andrei Lepikhov <lepihov(at)gmail(dot)com> |
---|---|
To: | Ba Jinsheng <bajinsheng(at)u(dot)nus(dot)edu> |
Cc: | Tomas Vondra <tomas(at)vondra(dot)me>, Zhang Mingli <zmlpostgres(at)gmail(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Question of Parallel Hash Join on TPC-H Benchmark |
Date: | 2024-10-13 03:09:29 |
Message-ID: | 59f605ce-00f2-401b-be1e-5b684326ab6e@gmail.com |
Lists: | pgsql-bugs |
On 13/10/2024 01:11, Ba Jinsheng wrote:
> I changed the code to generate an efficient query plan (only the
> HashJoin in fifth line is in parallel), so I am wondering whether it is
> possible to optimize the code to enable this efficient query plan in
> default? I believe at least, it would improve the performance of
> PostgreSQL on the standard benchmark TPC-H.
> If you need, I can provide my environment in docker for your analysis.
Thanks for the case! It was a pretty interesting one.
The answer to your question is simple: we have a correlated subquery
here with the clause 'p_partkey = ps_partkey'. Because of that, the top
JOIN operator has a parameterised inner side, and parallel workers can't
be used with the code as it currently stands. See the comment in the code:
/*
 * If the inner path is parameterised, we can't use a partial hashjoin.
 * Parameterised partial paths are not supported. The caller should
 * already have verified that no lateral rels are required here.
 */
Even if you transform the subquery into a SEMI JOIN, you will still have a
parameterised join, and the parallel plan will be rejected.
So, the best you can do here is replace the clause
'ps_supplycost = (SELECT ...)' with 'ps_supplycost IN (SELECT ...)'.
I got quite a beneficial speedup there (see attached).
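To illustrate why that rewrite is safe, here is a minimal sketch, not the benchmark query itself. It assumes the subquery is a single min() aggregate correlated on 'p_partkey = ps_partkey', so it returns exactly one row per outer row and '=' and 'IN' select the same tuples. The table contents and the alias ps2 are invented, and SQLite is used only so the sketch runs standalone; it says nothing about which plan PostgreSQL chooses.

```python
import sqlite3

# Toy stand-ins for the TPC-H part and partsupp tables (rows are invented).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part (p_partkey INTEGER PRIMARY KEY);
    CREATE TABLE partsupp (
        ps_partkey    INTEGER,
        ps_suppkey    INTEGER,
        ps_supplycost REAL
    );
    INSERT INTO part VALUES (1), (2), (3);
    INSERT INTO partsupp VALUES
        (1, 10, 5.0), (1, 11, 3.0), (1, 12, 3.0),
        (2, 10, 7.5), (2, 13, 9.0),
        (3, 11, 1.0);
""")

# The same correlated subquery, compared with either '=' or 'IN'.
QUERY = """
    SELECT part.p_partkey, ps_suppkey, ps_supplycost
    FROM part
    JOIN partsupp ON part.p_partkey = partsupp.ps_partkey
    WHERE ps_supplycost {op} (
        SELECT min(ps2.ps_supplycost)
        FROM partsupp AS ps2
        WHERE ps2.ps_partkey = part.p_partkey
    )
    ORDER BY part.p_partkey, ps_suppkey
"""

rows_eq = conn.execute(QUERY.format(op="=")).fetchall()
rows_in = conn.execute(QUERY.format(op="IN")).fetchall()

# Both forms keep only the cheapest supplier(s) for each part.
print(rows_eq == rows_in)
```

The two forms agree only because the aggregate subquery yields one row; for a subquery that may return several rows, 'IN' changes the meaning of the query.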
So, what can we improve here?
- I am sceptical that parallel plans for parameterised paths will be
supported, at least any time soon.
- Improve pull-ups for subqueries, especially where we can prove that the
subquery has a single aggregate returning only one tuple. It looks doable.
--
regards, Andrei Lepikhov
Attachment | Content-Type | Size |
---|---|---|
final_explain.txt | text/plain | 6.6 KB |