From: | Andrei Lepikhov <lepihov(at)gmail(dot)com> |
---|---|
To: | Ba Jinsheng <bajinsheng(at)u(dot)nus(dot)edu>, "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Performance of Query 2 in TPC-H |
Date: | 2024-11-05 02:22:35 |
Message-ID: | 024a38ec-18fa-4a2c-b1e2-26c092811fa6@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On 11/4/24 15:42, Ba Jinsheng wrote:
> The estimated cost is reduced by 90%, and the execution time is reduced
> by 68%. The second query plan includes the operation Memoize, while the
> first query plan does not. I am wondering if we can optimize the logic
> anywhere to enable the second query plan.
Thanks for the curious case.
I discovered this case superficially yet, but have some results to discuss.
Postgres doesn't consider the Memoize case here because, inside the
JOIN's inner input, we reference the p_partkey column twice, which is
the primary key of the 'part' table. So, the cardinality of such a join
will always be 1. That's what we exactly watch there:
-> Nested Loop (cost=1049.15..16228.66 rows=1 width=34) (actual
time=0.617..39.544 rows=485 loops=1)
And all the joins upstairs are also NestLoop, because of this original
error. So, if you want to find a solution - discuss how to estimate
correctly clauses like:
(x=PRIMARY KEY) AND (y=PRIMARY KEY).
The second thing here is: if you replace '=' with 'IN':
and ps_supplycost = (select min(ps_supplycost) ...
and ps_supplycost IN (select min(ps_supplycost) ...
You will have pulled up a subquery. On my laptop, this trick accelerated
the query from 333ms to 44ms. That's exactly one of the tricks (IMO)
that swarm64 used years ago to show that it is faster than upstream
Postgres.
I want to work around this optimisation a bit later because to do a
pull-up, we need to prove the single result of the subquery beforehand.
Also, playing with AQO, as usual, I found two alternative query plans
that the optimiser can find in the case of more or less correct
cardinality prediction. See these plans in the attachment. I hope they
can be useful for your further analysis.
--
regards, Andrei Lepikhov
Attachment | Content-Type | Size |
---|---|---|
aqo_regular.txt | text/plain | 5.0 KB |
aqo_slow.txt | text/plain | 6.9 KB |
explain_fast.txt | text/plain | 6.6 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Rafia Sabih | 2024-11-05 13:22:29 | Re: Bloom filters and the planner / parallel execution |
Previous Message | Andrei Lepikhov | 2024-11-04 12:05:44 | Re: Postgresql 14/15/16/17 partition pruning on dependent table during join |