From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | pinker <pinker(at)onet(dot)eu> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: core system is getting unresponsive because over 300 cpu load |
Date: | 2017-10-11 00:18:06 |
Message-ID: | 20171011001806.n7biw2lps2iq3yt7@alap3.anarazel.de |
Lists: | pgsql-general |
Hi,
On 2017-10-10 13:40:07 -0700, pinker wrote:
> and the total number of connections are increasing very fast (but I suppose
> it's the symptom not the root cause of cpu load) and exceed max_connections
> (1000).
Others mentioned already that that's worth improving.
> System:
> * CentOS Linux release 7.2.1511 (Core)
> * Linux 3.10.0-327.36.3.el7.x86_64 #1 SMP Mon Oct 24 16:09:20 UTC 2016
> x86_64 x86_64 x86_64 GNU/Linux
Some versions of this kernel have had serious problems with transparent
hugepages. I'd try turning that off. I think it defaults to off even in
that version, but also make sure zone_reclaim_mode is disabled.
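For reference, a rough sketch of how one could check both settings from a script. The two paths are the usual mainline locations and are an assumption here (some distribution kernels expose THP under a different sysfs path); the helper names are made up for illustration.

```python
from pathlib import Path

# Usual sysfs/procfs locations on mainline kernels; some distribution
# kernels use a different THP path, so treat these as assumptions.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")
ZONE_RECLAIM = Path("/proc/sys/vm/zone_reclaim_mode")


def parse_thp_mode(text):
    """Return the bracketed (active) mode from e.g. 'always madvise [never]'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_or_none(path):
    """Return the stripped file contents, or None if the file is unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def report():
    """Print the current THP mode and zone_reclaim_mode value."""
    thp = read_or_none(THP_ENABLED)
    zone = read_or_none(ZONE_RECLAIM)
    print("transparent_hugepage:", parse_thp_mode(thp) if thp else "unknown")
    print("zone_reclaim_mode:", zone if zone is not None else "unknown")
```

Calling report() prints both values; a THP mode of "never" and a zone_reclaim_mode of 0 would correspond to the state suggested above.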
> * postgresql95-9.5.5-1PGDG.rhel7.x86_64
> * postgresql95-contrib-9.5.5-1PGDG.rhel7.x86_64
> * postgresql95-docs-9.5.5-1PGDG.rhel7.x86_64
> * postgresql95-libs-9.5.5-1PGDG.rhel7.x86_64
> * postgresql95-server-9.5.5-1PGDG.rhel7.x86_64
>
> * 4 sockets/80 cores
9.6 has quite a few scalability improvements over 9.5. I don't know
whether it's feasible for you to update, but if so, it's worth trying.
How about taking a perf profile to investigate?
> * vm.dirty_background_bytes = 0
> * vm.dirty_background_ratio = 2
> * vm.dirty_bytes = 0
> * vm.dirty_expire_centisecs = 3000
> * vm.dirty_ratio = 20
> * vm.dirty_writeback_centisecs = 500
I'd suggest monitoring /proc/meminfo for the amount of Dirty and
Writeback memory, and seeing whether rapid changes therein coincide with
periods of slowdown.
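Something along these lines could do the sampling. This is only a sketch: the function names, the one-second interval and the output format are arbitrary choices of mine, not anything standard.

```python
import time


def parse_meminfo(text, fields=("Dirty", "Writeback")):
    """Return {field: value_in_kB} for the requested /proc/meminfo fields."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        # Exact name match, so "WritebackTmp" is not mistaken for "Writeback".
        if name in fields and rest.split():
            values[name] = int(rest.split()[0])
    return values


def sample(path="/proc/meminfo"):
    """Read the file once and return the parsed Dirty/Writeback values."""
    with open(path) as f:
        return parse_meminfo(f.read())


def monitor(interval=1.0, count=60):
    """Print a timestamped Dirty/Writeback reading every `interval` seconds."""
    for _ in range(count):
        v = sample()
        print(time.strftime("%H:%M:%S"),
              "Dirty=%s kB" % v.get("Dirty"),
              "Writeback=%s kB" % v.get("Writeback"))
        time.sleep(interval)
```

Running monitor() during a slowdown and comparing the timestamps with the load spikes should show whether the two line up.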
Greetings,
Andres Freund