From: | Achilleas Mantzios - cloud <a(dot)mantzios(at)cloud(dot)gatewaynet(dot)com> |
---|---|
To: | Tomas Vondra <tomas(at)vondra(dot)me>, pgsql-performance(at)lists(dot)postgresql(dot)org |
Subject: | Re: tds_fdw : Severe performance degradation from postgresql 10.23 to 16.4 |
Date: | 2024-11-12 10:37:27 |
Message-ID: | b31dbff5-4aae-4860-b17d-29ec0e9e14be@cloud.gatewaynet.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On 11/10/24 11:45, Achilleas Mantzios wrote:
>
> Στις 9/11/24 17:41, ο/η Tomas Vondra έγραψε:
>>
>> On 11/9/24 15:05, Achilleas Mantzios wrote:
>>> Στις 9/11/24 12:49, ο/η Tomas Vondra έγραψε:
>>>> On 11/8/24 20:32, Achilleas Mantzios wrote:
>>>>> Dear All,
>>>>>
>>>>> we have hit a serious performance regression going from 10.23 →
>>>>> 16.4 as
>>>>> far as tds_fdw (MS SQL) FDW is concerned. To cut the long story
>>>>> short, I
>>>>> recreated the good fast “old” (pgsql 10) setup on the same vm as the
>>>>> slow “new” (pgsql 16). Here is the bug report on github :
>>>>>
>>>>> https://github.com/tds-fdw/tds_fdw/issues/371
>>>>> Yes, I mean linux-perf. There's a wikipage with some basic
>>>>> instructions:
>>
>> https://wiki.postgresql.org/wiki/Profiling_with_perf
>>
>> But in short, I'd do this:
>>
>> 1) use pg_backend_pid() to get PID of the backend
>>
>> SELECT pg_backend_pid();
>>
>> 2) start recording profile for that PID, say for 1 minute
>>
>> perf record -g -p $PID -- sleep 60
>>
>> 3) run the query in a loop
>>
>> SELECT count(*) as foo FROM mssql_bdynacom."ACDOC" \watch 1
>>
>> 4) once the recording stops, generate report
>>
>> perf report > report.txt
>>
>>
>> Do this for both versions, share the output txt files. You may need to
>> install additional packages with debug symbols to get better profiles,
>> and stuff like that.
>>
>> Also, don't forget to share the explain plans.
Hi Tomas,
By trying to reproduce the situation on the clone vm , we failed
creating a fast postgresql 10 setup, so we just created a slow
postgresql 10. So we thought comparing the slow 10 (on the clone) with
the fast 10 (on the prod system, on port 5410), which maybe more
interesting as now we are comparing :
- same vm specs (Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz , 64GB , SSD)
- same os (Linux smadb 6.7.12+bpo-amd64 #1 SMP PREEMPT_DYNAMIC Debian
6.7.12-1~bpo12+1 (2024-05-06) x86_64 GNU/Linux)
- same pgsql versions (10.23 )
- same setup on the tds_fdw (2.0.4, loaded from identical dump)
Here are the explain outputs for both which look identical :
*slow pgsql10 : *
postgres(at)smadb-clone:~$ /usr/local/pgsql10/bin/psql -p 5410 tds
psql (10.23)
Type "help" for help.
postgres(at)[local]/tds=5410# \dx tds_fdw
List of installed extensions
Name | Version | Schema |
Description
---------+---------+--------+-----------------------------------------------------------------------------------
tds_fdw | 2.0.4 | public | Foreign data wrapper for querying a TDS
database (Sybase or Microsoft SQL Server)
(1 row)
postgres(at)[local]/tds=5410# \timing
Timing is on.
postgres(at)[local]/tds=5410# EXPLAIN (ANALYZE, BUFFERS) select count(*)
from mssql_bdynacom."ACDOC" ;
NOTICE: tds_fdw: Query executed correctly
NOTICE: tds_fdw: Getting results
WARNING: Table definition mismatch: Foreign source has column named ,
but target table does not. Column will be ignored.
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=168334139.12..168334139.13 rows=1 width=8) (actual
time=14580.974..14580.975 rows=1 loops=1)
-> Foreign Scan on "ACDOC" (cost=200.00..168329931.30 rows=1683130
width=0) (actual time=13.989..14445.091 rows=1683130 loops=1)
Planning time: 14144.782 ms
Execution time: 14633.792 ms
(4 rows)
Time: 28779.099 ms (00:28.779)
-- single run
postgres(at)[local]/tds=5410# select count(*) from mssql_bdynacom."ACDOC" ;
NOTICE: tds_fdw: Query executed correctly
NOTICE: tds_fdw: Getting results
WARNING: Table definition mismatch: Foreign source has column named ,
but target table does not. Column will be ignored.
count
---------
1683135
(1 row)
Time: 28740.634 ms (00:28.741)
*fast pgsql10 :*
pgsql10(at)smadb:~$ psql -p 5410 tds
psql (10.23)
Type "help" for help.
pgsql10(at)[local]/tds=5410# \dx tds_fdw
List of installed extensions
Name | Version | Schema |
Description
---------+---------+--------+-----------------------------------------------------------------------------------
tds_fdw | 2.0.4 | public | Foreign data wrapper for querying a TDS
database (Sybase or Microsoft SQL Server)
(1 row)
pgsql10(at)[local]/tds=5410# \timing
Timing is on.
pgsql10(at)[local]/tds=5410# EXPLAIN (ANALYZE, BUFFERS) select count(*)
from mssql_bdynacom."ACDOC" ;
NOTICE: tds_fdw: Query executed correctly
NOTICE: tds_fdw: Getting results
WARNING: Table definition mismatch: Foreign source has column named ,
but target table does not. Column will be ignored.
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=168334339.15..168334339.16 rows=1 width=8) (actual
time=915.693..915.694 rows=1 loops=1)
-> Foreign Scan on "ACDOC" (cost=200.00..168330131.32 rows=1683132
width=0) (actual time=8.520..824.909 rows=1683132 loops=1)
Planning time: 201.217 ms
Execution time: 917.305 ms
(4 rows)
Time: 1119.045 ms (00:01.119)
-- single run
pgsql10(at)[local]/tds=5410# select count(*) from mssql_bdynacom."ACDOC" ;
NOTICE: tds_fdw: Query executed correctly
NOTICE: tds_fdw: Getting results
WARNING: Table definition mismatch: Foreign source has column named ,
but target table does not. Column will be ignored.
count
---------
1683135
(1 row)
Time: 1743.575 ms (00:01.744)
We run perf on both systems for 90 seconds to make sure that it included
at least two runs of the slow system, and we attach both outputs. We run
perf as :
sudo perf record -g -p <PID> -- sleep 90
it strikes me that we dont find similarities between the two, but we are
new to perf.
> Thank you Tomas, as soon
> as I get my hands on the new VM I will do as you suggested and report
> back.
>>
>> regard
>>
>
>
Attachment | Content-Type | Size |
---|---|---|
report10_fast.txt | text/plain | 135.8 KB |
report10_slow.txt | text/plain | 315.6 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Andrei Lepikhov | 2024-11-12 10:40:59 | Re: Performance of Query 4 on TPC-DS Benchmark |
Previous Message | Ba Jinsheng | 2024-11-11 10:51:16 | Re: Performance of Query 4 on TPC-DS Benchmark |