Re: Sudden connection and load average spikes with postgresql 9.3

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: eudald_v <reaven(dot)galaeindael(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Sudden connection and load average spikes with postgresql 9.3
Date: 2015-06-30 15:37:20
Message-ID: 28249.1435678640@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

eudald_v <reaven(dot)galaeindael(at)gmail(dot)com> writes:
> This is my very first message to the Postgresql community, and I really hope
> you can help me solve the trouble I'm facing.

> I've an 80 core server (multithread) with close to 500GB RAM.

> My configuration is:
> MaxConn: 1500 (was 850)
> Shared buffers: 188Gb
> work_mem: 110Mb (was 220Mb)
> maintenance_work_mem: 256Mb
> effective_cache_size: 340Gb

> The database is running under postgresql 9.3.9 on an Ubuntu Server 14.04 LTS
> (build 3.13.0-55-generic)

> Two days from now, I've been experiencing that, randomly, the connections
> rise up till they reach max connections, and the load average of the server
> goes arround 300~400, making every command issued on the server take
> forever. When this happens, ram is relatively low (70Gb used), cores
> activity is lower than usual and sometimes swap happens (I've swappiness
> configured to 10%)

You haven't described why you would suddenly be getting more connections,
but if that's just an expected byproduct of transactions taking too long
to finish, then a plausible theory is that something is acquiring a strong
lock on some heavily-used table, causing other transactions to back up
behind it. The pg_stat_activity and pg_locks views may help identify
the culprit.

In any case, a good mitigation plan would be to back off your
max_connections setting, and instead use a connection pooler to manage
large numbers of clients. The usual rule of thumb is that you don't want
number of backends much more than the number of cores. 1500 active
sessions is WAY more than your hardware can support, and even approaching
that will exacerbate whatever the root performance problem is.

I'm also a bit worried by your focus on CPU capacity without any mention
of the disk subsystem. It may well be that the actual bottleneck is disk
throughput, and that you can't even sustain 80 sessions unless they're
read-mostly. It would be a good idea to watch the i/o stats the next
time you see one of these high-load episodes.

It might also be a good idea to decrease the shared_buffers setting and
rely more on OS-level disk caching. Some have found that very large
shared buffer pools tend to increase contention without much payback.

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message eudald_v 2015-06-30 16:08:56 Re: Sudden connection and load average spikes with postgresql 9.3
Previous Message eudald_v 2015-06-30 14:52:16 Sudden connection and load average spikes with postgresql 9.3