Re: track_planning causing performance regression

From: bttanakahbk <bttanakahbk(at)oss(dot)nttdata(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Hamid Akhtar <hamid(dot)akhtar(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, "Tharakan, Robins" <tharar(at)amazon(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: track_planning causing performance regression
Date: 2020-09-11 07:23:28
Message-ID: a84734eb0dd137b11d7b84fc14ed530c@oss.nttdata.com
Lists: pgsql-hackers

Hi,

On 2020-08-19 00:43, Fujii Masao wrote:
>> Yes, I pushed the document_overhead_by_track_planning.patch, but this
>> CF entry is for pgss_lwlock_v1.patch which replaces spinlocks with
>> lwlocks in pg_stat_statements. The latter patch has not been committed yet.
>> Probably attaching the different patches in the same thread would
>> cause this confusing thing... Anyway, thanks for your comment!
>
> To avoid further confusion, I attached the rebased version of
> the patch that was registered at CF. I'd appreciate it if
> you review this version.

I tested pgss_lwlock_v2.patch with three workloads. I could not observe any
performance improvement in our environment, and I'm afraid to say that
performance was even worse in some cases.
- Workload1: pgbench select-only mode
- Workload2: pgbench custom scripts which run "SELECT 1;"
- Workload3: pgbench custom scripts which run 1000 different simple queries

- Workload1
First, we set pg_stat_statements.track_planning to on/off and ran pgbench in
fully-cached select-only mode on pg14head, installed on an on-premises
server (32 CPUs, 256GB memory).
However, in this environment we could not reproduce the 45% performance drop
due to s_lock contention
(which Tharakan-san mentioned in his post,
2895b53b033c47ccb22972b589050dd9(at)EX13D05UWC001(dot)ant(dot)amazon(dot)com).
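
For reference, a minimal sketch of how track_planning could be switched
between runs. It assumes superuser access through psql and that
pg_stat_statements is loaded via shared_preload_libraries; the helper name
track_planning_sql is hypothetical and is not part of the harness used for
these tests.

```shell
#!/bin/sh
# Hypothetical helper: print the SQL that sets
# pg_stat_statements.track_planning to the given value (on/off)
# and reloads the configuration so that the change takes effect.
track_planning_sql() {
    case "$1" in
        on|off) ;;
        *) echo "usage: track_planning_sql on|off" >&2; return 1 ;;
    esac
    printf 'ALTER SYSTEM SET pg_stat_statements.track_planning = %s;\n' "$1"
    printf 'SELECT pg_reload_conf();\n'
}

# Example (assumes a running server and superuser access):
#   track_planning_sql on | psql -X <db>
#   pgbench -S -n -j16 -c128 -T180 -h <address> -U <user> -p <port> <db>
track_planning_sql on
```

Here pgbench -S selects the built-in select-only script; the other options
simply mirror the ones used for the custom-script runs.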

- Workload2
Next, we used the pgbench custom script "SELECT 1;", which is supposed to
increase s_lock contention and make the issue easier to reproduce. In this
case, performance decreased by around 10% with track_planning on, together
with a slight increase in s_lock (~10%). In this scenario, even though s_lock
no longer appears, the patch shows more than 50% performance degradation
regardless of track_planning.
We also could not see any performance improvement in this workload.
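
Because every client executes the same trivial statement, all backends
update the same pg_stat_statements entry at a very high rate. A minimal
sketch of how such a custom script could be prepared; the file name
select1.sql is an arbitrary choice for illustration.

```shell
#!/bin/sh
# Write a one-statement pgbench custom script.
# The file name select1.sql is hypothetical.
script=select1.sql
printf 'SELECT 1;\n' > "$script"

# Show what pgbench would execute for every transaction.
cat "$script"

# The file would then be passed to pgbench with -f, for example:
#   pgbench -n -r -j16 -c128 -T180 -f select1.sql -h <address> -U <user> -p <port> <db>
```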

pgbench:
initialization: pgbench -i -s 100
benchmarking : pgbench -j16 -c128 -T180 -r -n -f <script> -h <address> -U <user> -p <port> -d <db>
# VACUUMed and pg_prewarmed manually before running the benchmark
query: SELECT 1;
  pgss_lwlock_v2.patch  track_planning  TPS       decline rate  s_lock  CPU usage
  -                     OFF             810509.4  standard      0.17%   98.8%(sys24.9%,user73.9%)
  -                     ON              732823.1  -9.6%         1.94%   95.1%(sys22.8%,user72.3%)
  +                     OFF             371035.0  -49.4%        -       65.2%(sys20.6%,user44.6%)
  +                     ON              193965.2  -47.7%        -       41.8%(sys12.1%,user29.7%)
# "-" means that s_lock was not reported by perf.
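
The decline rates in this table appear to compare each row with the row
directly above it, not with the first row. This is inferred from the numbers
rather than stated anywhere, so treat it as an assumption. A small sketch
that reproduces the column under that assumption (the function name
tps_change is hypothetical):

```shell
#!/bin/sh
# Percentage change in TPS from a reference run to another run,
# rounded to one decimal place.
tps_change() {
    awk -v ref="$1" -v cur="$2" \
        'BEGIN { printf "%.1f\n", (cur - ref) / ref * 100 }'
}

tps_change 810509.4 732823.1   # unpatched, track_planning OFF -> ON : -9.6
tps_change 732823.1 371035.0   # unpatched ON -> patched OFF         : -49.4
tps_change 371035.0 193965.2   # patched, track_planning OFF -> ON   : -47.7
```

Measured against the first row instead, tps_change 810509.4 371035.0 gives
-54.2, which is in line with the "more than 50%" degradation mentioned for
the patch.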

- Workload3
Next, there is a concern that replacing the spinlock with an LWLock may
reduce performance in some other workloads
(Fujii-san mentioned this in his post,
42a13b4c-e60c-c6e7-3475-8eff8050bed4(at)oss(dot)nttdata(dot)com).
To check this, we prepared 1000 simple queries, which are supposed to
avoid s_lock contention, so that we can see the behavior without it.
In this case, no performance decline was observed, and we also could not
see any additional memory consumption or CPU usage.
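
One way such a workload could be scripted is sketched below. It is an
assumption, not the script actually used: it supposes that the "xxxx" table
suffix is filled in from :random_num through pgbench's textual variable
substitution, and the file name partitions.sql is hypothetical. With
-s 100 and 1000 range partitions, each partition should hold 10000 accounts,
so the aid expression stays inside partition :random_num.

```shell
#!/bin/sh
# Sketch of a single pgbench script that spreads queries over 1000
# partitions. Assumption: the partition suffix comes from :random_num,
# relying on pgbench substituting variables textually in SQL commands.
script=partitions.sql   # hypothetical file name
cat > "$script" <<'EOF'
\set random_num random(1, 1000)
\set aid random(1, 10000)
SELECT abalance FROM pgbench_accounts_:random_num WHERE aid = :aid + (10000 * :random_num - 10000);
EOF

# Show the generated script.
cat "$script"
```

Since each statement references a different table, pg_stat_statements would
track them as separate entries, which is what spreads the updates out.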

pgbench:
initialization: pgbench -n -i -s 100 --partitions=1000 --partition-method=range
benchmarking : command is same as (Workload1)
query: SELECT abalance FROM pgbench_accounts_xxxx WHERE aid = :aid + (10000 * :random_num - 10000);
  pgss_lwlock_v2.patch  track_planning  TPS      decline rate  CPU usage
  -                     OFF             88329.1  standard      82.1%(sys6.5%,user75.6%)
  -                     ON              88015.3  -0.36%        82.6%(sys6.5%,user76.1%)
  +                     OFF             88177.5  0.18%         82.2%(sys6.5%,user75.7%)
  +                     ON              88079.1  -0.11%        82.5%(sys6.5%,user76.0%)

(Environment)
machine:
  server/client - 32 CPUs / 256GB # the same machine was used as both server and client
postgres:
  version: v14 (6eee73e)
  configure: '--prefix=/usr/pgsql-14a' 'CFLAGS=-O2'
  GUC param (changed from defaults):
    shared_preload_libraries = 'pg_stat_statements, pg_prewarm'
    autovacuum = off
    checkpoint = 120min
    max_connections=300
    listen_addresses='*'
    shared_buffers=64GB

Regards,

--
Hibiki Tanaka
