From: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
---|---|
To: | Krzysztof Olszewski <kolszew73(at)gmail(dot)com> |
Cc: | Pgsql Performance <pgsql-performance(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Postgresql server gets stuck at low load |
Date: | 2020-06-05 11:37:48 |
Message-ID: | CAFj8pRDERh_L7qHzRyG4QG4dRTfYZBcYOOTO0-0nTRBHTK+PVA@mail.gmail.com |
Lists: | pgsql-performance |
On Fri, Jun 5, 2020 at 12:07, Krzysztof Olszewski <kolszew73(at)gmail(dot)com>
wrote:
> I have a problem with one of my Postgres production servers. The server
> works fine almost always, but sometimes, without any increase in the number
> of transactions or statements, the machine gets stuck. Cores go up to 100%,
> load up to 160%. When this happens, there are problems connecting to the
> database, and even when a connection succeeds, simple queries take several
> seconds instead of milliseconds. The problem sometimes stops after a period
> of time (e.g. 35 min); sometimes we must restart Postgres, Linux, or even
> KVM (which serves as the virtualization host).
>
> My hardware
> 56 cores (Intel Core Processor (Skylake, IBRS))
> 400 GB RAM
> RAID10 with about 40k IOPS
>
> OS
> CentOS Linux release 7.7.1908
> kernel 3.10.0-1062.18.1.el7.x86_64
>
> Database size 100 GB (fits entirely in memory :) )
> server_version 10.12
> effective_cache_size 192000 MB
> maintenance_work_mem 2048 MB
> max_connections 150
> shared_buffers 64000 MB
> work_mem 96 MB
>
> In the normal state, I have about 500 tps, 5% core usage, about 3% load;
> the whole database fits in memory, with no reads from disk, only writes at
> about the 500 IOPS level, sometimes spiking to 1500 IOPS, but on this
> hardware these values are no problem (no iowait on the cores). In the
> normal state this machine does "nothing". Connections to the database are
> created by two Java-based app servers through connection pools, so the
> connection count is limited by the pool configuration; the maximum is 120,
> which is lower than the limit in the Postgres configuration (150). In the
> normal state there are about 20 connections; when the server is stuck, the
> count goes to the maximum (120).
>
> Correlated with the stalls, I see messages in the kernel log such as
> NMI watchdog: BUG: soft lockup - CPU#25 stuck for 23s! [postmaster:33935]
> but I don't know whether this is a cause or an effect of the problem.
> I investigated with pgBadger and ... nothing strange happens, just
> normal statements.
>
> Any ideas?
>
You can try installing perf plus the debug symbols for Postgres. When the
problem occurs again, run "perf top". It will show which routines are eating
your CPU.
Maybe it is a spinlock problem.
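Because "perf top" is interactive and the machine may be hard to reach during a stall, it can help to record a profile to a file and read it afterwards. The following is only a sketch, assuming perf is installed and the script runs as root; the helper name perf_record_cmd, the "run" argument, the 30-second default and the output path are all hypothetical choices, not something from this thread.

```shell
#!/bin/sh
# Sketch: record a system-wide CPU profile with call graphs during a stall.
# Assumptions (not from the thread): perf is installed, we run as root,
# and /tmp has room for the profile data.

# Print the perf command line for a given duration (seconds) and output file.
perf_record_cmd() {
    printf 'perf record -a -g -o %s -- sleep %s\n' "$2" "$1"
}

# Only record when the script is called with "run" as its first argument.
if [ "${1:-}" = "run" ]; then
    out="${3:-/tmp/pg-stall.perf.data}"
    cmd=$(perf_record_cmd "${2:-30}" "$out")
    echo "Running: $cmd"
    # Word splitting of $cmd is intentional here.
    $cmd && perf report -i "$out" --stdio --sort comm,dso,symbol | head -n 60
fi
```

If the top symbols in the report belong to postgres and look like lock or spinlock routines, that would support the spinlock guess; symbols in the kernel would point elsewhere.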
An answer to Merlin's question from mail/. would also be interesting:
cat /sys/kernel/mm/redhat_transparent_hugepage/enabled
cat /sys/kernel/mm/redhat_transparent_hugepage/defrag
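These files list all possible settings on one line and mark the active one with square brackets, for example "[always] madvise never". A small sketch that prints only the active value is shown here; the helper name thp_active is hypothetical, and the second directory in the loop (without the redhat_ prefix) is an assumption about where a CentOS 7 kernel may expose the same files.

```shell
#!/bin/sh
# Sketch: report the active transparent hugepage settings.

# Extract the bracketed (active) value from a line such as
# "[always] madvise never".
thp_active() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# The first directory is the one from this mail; the second is an assumption.
for dir in /sys/kernel/mm/redhat_transparent_hugepage \
           /sys/kernel/mm/transparent_hugepage; do
    for name in enabled defrag; do
        f="$dir/$name"
        if [ -r "$f" ]; then
            echo "$f: $(thp_active "$(cat "$f")")"
        fi
    done
done
```

Paths that do not exist on the machine are skipped silently, so no output at all means neither directory was found.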
Regards
Pavel
>
> Thanks,
> Kris