From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: postgresql(at)sjorsgielen(dot)nl, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18365: Inconsistent cost function between materialized and non-materialized CTE
Date: 2024-06-03 23:07:20
Message-ID: CAApHDvqBTtdtE_w__HeOvzen6UqKUVzcNpTm74_ioJOfdOJ50A@mail.gmail.com
Lists: pgsql-bugs
On Tue, 27 Feb 2024 at 02:44, PG Bug reporting form
<noreply(at)postgresql(dot)org> wrote:
> I have run into an issue where a query with a CTE performs a sequential scan
> on a large table (42M rows, ~1min on our prod), while explicitly
> materializing the CTE simply performs an index only scan (~2s). When I set
> `enable_seqscan=off` and compare the costs, it turns out the query planner
> grossly overestimates the cost of a Nested Loop compared to the (as far as I
> can tell) exact same Nested Loop when the CTE is materialized. I know that
> the query planner acts on heuristics, so this might not be considered a bug,
> but the cost values are so wildly different for what should be essentially
> the same operation, that it might warrant further investigation.
> Is this a bug?
I don't see any bugs. What seems to be going on is that, in the
materialized case, the planner underestimates the number of rows left
after it makes the CTE output distinct (which it does so that a plain
join can be used rather than a semi-join). That's seen in:
-> HashAggregate (cost=2576.28..2679.33 rows=10305 width=16) (actual time=765.793..893.761 rows=166060 loops=1)
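For reference, a minimal sketch of the kind of query shape involved
(only stock_history, product_id and date are inferred from the index
name in the plan; the CTE name, the filter and everything else here
are guesses, not the reporter's actual query):

WITH interesting_products AS MATERIALIZED (
    SELECT product_id
    FROM stock_history
    WHERE date >= '2024-01-01'
)
SELECT count(*)
FROM stock_history sh
WHERE sh.product_id IN (SELECT product_id FROM interesting_products);

-- With MATERIALIZED, the CTE is evaluated once and the planner
-- unique-ifies its output (the HashAggregate above) so it can use a
-- plain join against stock_history. Without MATERIALIZED, a
-- single-reference CTE is inlined and the IN subquery is planned as a
-- semi-join instead.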
Due to that row underestimation, the planner thinks a parameterized
nested loop is the best way to join the two relations, as it thinks the
stock_history_date_product_id_idx index only needs to be looked up
10305 times rather than 166060 times. With the non-materialized
version, the planner estimates 103051 lookups. That causes it to think
the Nested Loop -> index scan on stock_history_date_product_id_idx is
too expensive and that a hash join with a seq scan is cheaper. As it
turns out, looking up the index *is* faster, even when that's done
166060 times.
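If you want to repeat that comparison yourself, something along these
lines works (a sketch only; it reuses the guessed names from above,
and enable_seqscan should only be switched off for testing):

BEGIN;
SET LOCAL enable_seqscan = off;   -- testing only: forces the index-based plan
EXPLAIN (ANALYZE, BUFFERS)
WITH interesting_products AS (    -- non-materialized form of the sketch above
    SELECT product_id
    FROM stock_history
    WHERE date >= '2024-01-01'
)
SELECT count(*)
FROM stock_history sh
WHERE sh.product_id IN (SELECT product_id FROM interesting_products);
ROLLBACK;   -- SET LOCAL reverts when the transaction ends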
The two parameters that drive the planner's decision on this are
random_page_cost (which you might want to consider lowering) and
effective_cache_size. A rough guideline for effective_cache_size is
75% of the machine's RAM, though it depends a lot on your
shared_buffers setting and what else runs concurrently on the machine.
Some people have found that lowering random_page_cost to as low as 1.1
helps. The default of 4.0 has remained unchanged since the HDD days
and is often too large for SSDs.
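If you do decide to change them, both settings take effect on a config
reload. For illustration only (the values assume a machine with 32GB
of RAM and SSD storage; adjust for your hardware):

ALTER SYSTEM SET random_page_cost = 1.1;
ALTER SYSTEM SET effective_cache_size = '24GB';  -- roughly 75% of RAM
SELECT pg_reload_conf();

You can also try the effect on a single session first with
SET random_page_cost = 1.1; before changing it cluster-wide.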
I've attached the EXPLAINs I trimmed down and compared to reach this conclusion.
David
Attachment: bug_18365_explain.txt (text/plain, 3.9 KB)