From: | Vitaliy Litovskiy <vitaliy(dot)litovskiy(at)gmail(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Distinct performance dropped by multiple times in v16 |
Date: | 2024-06-10 06:59:04 |
Message-ID: | CAD3pRn_BbRs4OkWXP7T=beJ1Pv1euWJ6jBQ3Sd62ZF34E2hrrw@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Hello,
In our application after upgrading from postgresql from v15 to v16
performance for some queries dropped from less than 1-2 seconds to 2+
minutes.
After checking out some blogposts regarding DISTINCT improvements in v16 i
was able to "fix" it by adding order by clauses in subqueries, still it
looks more like a workaround for me and i would like to ask if it is
working as expected now.
I was able to reproduce the issue by the following steps
1. start postgresql v16 in docker
docker run --name some-postgres16 -e
POSTGRES_PASSWORD=mysecretpassword -d postgres:16
2. create tables with data.
- data is originated by northwind db, but tables were created with
different schema by custom software
- each table has 8k rows
- _BDN_Terretories__202406100822.sql
and _BDN_EmployeeTerritories__202406100822.sql files are attached
3. Execute the following query
select distinct tbl1."BDN_EmployeeTerritories_ID", tbl2."BDN_Terretories_ID",
tbl1."Reference_Date" from
(
select "BDN_EmployeeTerritories_ID", "Reference_Date", token
from public."BDN_EmployeeTerritories",
unnest(string_to_array("EMP_TerretoryID", ';')) s(token)
--order by token
) tbl1
join (
select "BDN_Terretories_ID", token from public."BDN_Terretories", unnest(
string_to_array("EMP_TerretoryID", ';')) s(token)
) tbl2 on tbl1.token = tbl2.token Observations:
1. query runs for 4-5 seconds on v16 and less than a second on v15 2. in
v16 it also goes downs to less than a second if 2.1 distinct is removed
2.2 unnest is removed. it is not really needed for this particular data but
this query is autogenerated and unnest makes sense for other data
2.3 "order by token" is uncommented, this is my current way of fixing the
problem I would really appreciate some feedback if that is expected
behaviour and if there are better solutions
--
Best Regards,
Vitalii Lytovskyi
Attachment | Content-Type | Size |
---|---|---|
_BDN_EmployeeTerritories__202406100822.sql | application/octet-stream | 596.6 KB |
_BDN_Terretories__202406100822.sql | application/octet-stream | 720.5 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Greg Sabino Mullane | 2024-06-10 16:24:29 | Re: Distinct performance dropped by multiple times in v16 |
Previous Message | Satalabaha Postgres | 2024-06-04 06:46:09 | query column in pg_stat_statements and pg_stat_activity |