From: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
---|---|
To: | Prajna Shetty <Prajna(dot)Shetty(at)mindtree(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Cc: | Beenu Sharma <Beenu(dot)Sharma(at)mindtree(dot)com>, Pranabesh Saha <Pranabesh(dot)Saha(at)mindtree(dot)com>, Surya Susai <Surya(dot)Susai(at)mindtree(dot)com> |
Subject: | Re: Performance issue post upgrade on Version 13 - Incorrect Estimation Cost choosing Hash Aggregate-Nested Left Loop Join |
Date: | 2022-03-24 11:55:41 |
Message-ID: | 6e31a8c8-f74b-e4b3-f5d1-dce68e68abbf@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs pgsql-performance |
On 3/22/22 13:57, Prajna Shetty wrote:
> ++ _pgsql-performance(at)postgresql(dot)org_
> <https://www.postgresql.org/list/pgsql-performance/>
>
> Hello Team,
>
> There is change in query plan in 12.4 version and Version 13 resulting
> in performance slowness post upgrade.
>
>
> * In 12.4 version, Sort Operation Group Aggregate is selected which
> results to Merge Join. Query takes ~5 seconds.
> * In 13.5 version, optimizer wrongly estimates and due to new Disk
> Based Hash Aggregate feature, it prefers Hash Aggregate instead of
> Sort Operation which finally blocks merge-join and chooses Nested
> Loop Left Join. Query takes ~5 minutes.
>
>
> *_NOTE: _*Disabling Hash Aggregate on instance level forces optimizer to
> choose merge operation but such instance level modification is not
> possible in terms of Application Functionality.
>
> This performance issue is on all over most of queries. Attached one of
> the query and its plan in both version for reference in case that helps
> for recreating the issue.
>
It's impossible to comment those other queries, but chances are the root
cause is the same.
> Version 13 query plan has lower estimated cost than that of 12.4 which
> implies 13.5 planner thought it found a better plan, but it is running
> slower and actual cost show more.
>
> 12.4 Version:
> "Merge Right Join (cost=*202198.78..295729.10* rows=1 width=8) (actual
> time=1399.727..*5224.574* rows=296 loops=1)"
>
> 13.5 version:-
> "Nested Loop Left Join (cost=*196360.90..287890.45* rows=1 width=8)
> (actual time=3209.577..*371300.693* rows=296 loops=1)"
>
This is not a costing issue, the problem is that we expect 1 row and
calculate the cost for that, but then get 296. And unfortunately a
nested loop degrades much faster than a merge join.
I'm not sure why exactly 12.4 picked a merge join, chances are the
costing formular changed a bit somewhere. But as I said, the problem is
in bogus row cardinality estimates - 12.4 is simply lucky.
The problem most likely stems from this part:
-> GroupAggregate (cost=0.43..85743.24 rows=1830 width=72) (actual
time=1.621..3452.034 rows=282179 loops=3)
Group Key: student_class_detail.aamc_id
Filter: (max((student_class_detail.class_level_cd)::text) = '4'::text)
Rows Removed by Filter: 76060
-> Index Scan using uk_student_class_detail_aamcid_classlevelcd on
student_class_detail (cost=0.43..74747.61 rows=1284079 width=6) (actual
time=1.570..2723.014 rows=1272390 loops=3)
Filter: (class_level_start_dt IS NOT NULL)
Rows Removed by Filter: 160402
The filter is bound to be misestimated, and the error then snowballs.
Try replacing this part with a temporary table (with pre-aggregated
results) - you can run analyze on it, etc. I'd bet that'll make the
issue go away.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Justin Pryzby | 2022-03-24 12:24:14 | Re: Performance issue post upgrade on Version 13 - Incorrect Estimation Cost choosing Hash Aggregate-Nested Left Loop Join |
Previous Message | Andrey Borodin | 2022-03-24 09:05:24 | Re: BUG #17444: ERROR: found xmin 215633 from before relfrozenxid 1280585 |
From | Date | Subject | |
---|---|---|---|
Next Message | Justin Pryzby | 2022-03-24 12:24:14 | Re: Performance issue post upgrade on Version 13 - Incorrect Estimation Cost choosing Hash Aggregate-Nested Left Loop Join |
Previous Message | Lars Aksel Opsahl | 2022-03-24 09:39:59 | Re: Using system tables directly takes many hours, using temp tables with no indexes takes a few seconds for geometry_columns view. |