Re: Unexpected Performance for the Function simplify_function

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Ba Jinsheng <bajinsheng(at)u(dot)nus(dot)edu>
Cc: "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: Unexpected Performance for the Function simplify_function
Date: 2024-10-24 20:49:18
Message-ID: 2595992.1729802958@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Ba Jinsheng <bajinsheng(at)u(dot)nus(dot)edu> writes:
> I am not proposing a fixing patch, as the patch is incorrect. Instead, I just want to show disabling the simplify_function() function brings performance benefit, and it seems unexpected. I am wondering whether we can optimize simplify_function() to make the performance better for this workload?

The improvement here seems purely accidental. Basically, you have
broken the rowcount estimation for the "orders" scan: instead of

> -> Parallel Seq Scan on orders (cost=0.00..35511.00 rows=23858 width=8) (actual time=0.033..62.907 rows=19046 loops=3)
> Filter: ((o_orderdate >= '1993-08-01'::date) AND (o_orderdate < '1993-11-01 00:00:00'::timestamp without time zone))
> Rows Removed by Filter: 480954

we get

> -> Parallel Seq Scan on orders (cost=0.00..37062.50 rows=3125 width=8) (actual time=0.029..63.908 rows=19046 loops=3)
> Filter: ((o_orderdate >= '1993-08-01'::date) AND (o_orderdate < ('1993-08-01'::date + '3 mons'::interval month)))
> Rows Removed by Filter: 480954

The rows estimate is a good deal further from reality, as a
consequence of the fact that scalarltsel is now delivering a default
guess instead of a real estimate based on a constant comparison value.
Somehow, that's nonetheless leading to a better plan choice. It could
be that the underlying problem is a poor cost model for parallel hash
join. Or it could be as simple as random_page_cost not being tuned
for your environment. It looks like the better plan involves a
nestloop with inner indexscan on lineitem, which is something whose
estimated cost depends enormously on random_page_cost. You've given
us exactly zero detail about your test conditions, so it's hard to say
more than that.

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2024-10-24 20:51:15 Re: Unexpected Performance for the Function simplify_function
Previous Message Shiv Iyer 2024-10-24 19:48:57 Re: Unexpected Performance for the Function simplify_function