From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Kieran McCusker <kieran(dot)mccusker(at)gmail(dot)com> |
Cc: | Christian Schwaderer <schwaderer(at)ivocotec(dot)de>, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #16223: Performance regression between 11.6 and 12.1 in an SQL query with a recursive CTE based on function |
Date: | 2020-02-03 09:15:46 |
Message-ID: | 20200203091546.auana7ua7mdqwceq@alap3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
Hi,
On 2020-01-29 15:31:05 +0000, Kieran McCusker wrote:
> I've just checked you query. The slowdown is due to JIT.
>
> *With JIT enabled:*
>
> CTE Scan on "somePeriods" od (cost=10.25..17494052.75 rows=1000 width=16)
> (actual time=724.235..724.235 rows=1 loops=1)
That's clearly a fairly absurd cost for this query. I think it largely
comes from:
> CTE somePeriods
> -> Function Scan on my_test_function f (cost=0.25..10.25 rows=1000
> width=12) (actual time=0.293..0.294 rows=1 loops=1)
> SubPlan 3
> -> Aggregate (cost=17494.01..17494.02 rows=1 width=4) (actual
> time=0.135..0.136 rows=1 loops=1)
This subplan being executed once for each row returned assumed to be
returned by my_test_function() - the default assumption is 1000. It's
only because it ends up only returning 1 row that it's not easily
visible in the plan.
which is compounded by:
> -> CTE Scan on "somePeriods" "somePeriods_1"
> (cost=0.00..20.00 rows=1000 width=8) (actual time=0.000..0.000 rows=1
> loops=1)
i.e. that somePeriods again is assumed to return 1000 rows. So you
basically end up with a quadratic increase in cost.
And the reason that the function doesn't just get inlined, allowing the
planner to have much better insights, is that it's defined as a security
definer function for some reason:
CREATE OR REPLACE FUNCTION public.my_test_function()
RETURNS SETOF record
LANGUAGE sql
IMMUTABLE SECURITY DEFINER
AS $function$
SELECT
1::integer AS id,
'2019-11-20'::date AS "startDate",
'2020-01-01'::date AS "endDate"
$function$;
And we never inline security definer functions.
It's more obvious if one makes the function return two rows:
CTE Scan on "somePeriods" od (cost=10.25..17494052.75 rows=1000 width=16) (actual time=159.875..159.907 rows=2 loops=1)
Output: od."startDate", od."endDate", od.id, (SubPlan 3)
CTE somePeriods
-> Function Scan on public.my_test_function f (cost=0.25..10.25 rows=1000 width=12) (actual time=0.046..0.047 rows=2 loops=1)
Output: f.id, f."startDate", f."endDate"
Function Call: my_test_function()
SubPlan 3
-> Aggregate (cost=17494.01..17494.02 rows=1 width=4) (actual time=0.044..0.044 rows=1 loops=2)
Output: max(prep_1."endDateParam")
CTE prep
-> Recursive Union (cost=0.00..16243.80 rows=55565 width=8) (actual time=0.002..0.004 rows=1 loops=2)
If one makes the function not be a security definer one, the whole thing
looks a lot more reasonable:
CTE Scan on "somePeriods" od (cost=0.03..16.77 rows=2 width=16) (actual time=0.073..0.091 rows=2 loops=1)
Output: od."startDate", od."endDate", od.id, (SubPlan 3)
CTE somePeriods
-> Append (cost=0.00..0.03 rows=2 width=12) (actual time=0.004..0.028 rows=2 loops=1)
-> Result (cost=0.00..0.01 rows=1 width=12) (actual time=0.003..0.024 rows=1 loops=1)
Output: 1, '2019-11-20'::date, '2020-01-01'::date
-> Result (cost=0.00..0.01 rows=1 width=12) (actual time=0.002..0.002 rows=1 loops=1)
Output: 2, '2019-11-20'::date, '2020-01-01'::date
SubPlan 3
-> Aggregate (cost=8.34..8.35 rows=1 width=4) (actual time=0.036..0.036 rows=1 loops=2)
Output: max(prep_1."endDateParam")
CTE prep
-> Recursive Union (cost=0.00..7.87 rows=21 width=8) (actual time=0.005..0.028 rows=1 loops=2)
and it obviously doesn't get JIT compiled anymore.
I assume this is just a reproducer? If so, I'm not really sure it's a
good one to look at, because when the costs are 6 orders of magnitude
off due to an artificial plan barrier, and artifical row estimates, you'll
get all sorts of odd behaviour.
> JIT:
> Functions: 18
> Options: Inlining true, Optimization true, Expressions true, Deforming
> true
> Timing: Generation 8.586 ms, Inlining 17.901 ms, Optimization 514.256 ms,
> Emission 191.174 ms, Total 731.917 ms
> Execution Time: 733.276 ms
While not enough to alleviate the problem, I do get significantly
lower costs:
JIT:
Functions: 18
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 7.035 ms, Inlining 43.069 ms, Optimization 175.358 ms, Emission 43.862 ms, Total 269.323 ms
Is this a debug build of LLVM?
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Arseny Sher | 2020-02-03 09:20:30 | Re: ERROR: subtransaction logged without previous top-level txn record |
Previous Message | Amit Kapila | 2020-02-03 06:45:21 | Re: ERROR: subtransaction logged without previous top-level txn record |