Re: Sudden connection and load average spikes with postgresql 9.3

From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: eudald_v <reaven(dot)galaeindael(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Sudden connection and load average spikes with postgresql 9.3
Date: 2015-07-02 23:04:51
Message-ID: CAOR=d=3pVkF7oxcOCk=fQeFkiRNKpCwbNtQ4sM7F5jndmdgyyw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Tue, Jun 30, 2015 at 8:52 AM, eudald_v <reaven(dot)galaeindael(at)gmail(dot)com> wrote:
> Hello all,
> This is my very first message to the Postgresql community, and I really hope
> you can help me solve the trouble I'm facing.
>
> I've an 80 core server (multithread) with close to 500GB RAM.
>
> My configuration is:
> MaxConn: 1500 (was 850)

Drop this to 80 to 160, and use pgbouncer for pooling. pg_pooler is
nice, but it's a lot harder to configure. pgbouncer takes literally a
couple of minutes and you're up and running.

> Shared buffers: 188Gb

I have yet to see shared_buffers this big be a big help. I'd drop it
down to 1 to 16G or so but that's just me.

> work_mem: 110Mb (was 220Mb)

110MB*1500connections*1, 2, or 3 sorts per query == disaster. Drop
this down as well. If you have ONE query that needs a lot, create a
user for that query, alter that user for bigger work_mem and then use
it only for that big query.

> maintenance_work_mem: 256Mb
> effective_cache_size: 340Gb
>
> The database is running under postgresql 9.3.9 on an Ubuntu Server 14.04 LTS
> (build 3.13.0-55-generic)

Good kernel.

OK so you've got 80 cores. Have you checked zone_reclaim_mode? Under
no circumstances should that ever be turned on on a database server.

if "sysctl -a|grep zone" returns:
vm.zone_reclaim_mode = 1

then use /etc/sysctl.conf to set it to 0. It's a silent killer and it
will make your machine run REALLY slow for periods of 10 to 30 minutes
for no apparent reason.

But first and foremost GET A POOLER in the loop NOW. Not having one is
making your machine extremely vulnerable to overload, and making it
impossible for you to manage it.

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Graeme B. Bell 2015-07-03 14:48:43 Hmmm... why does pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?
Previous Message Kevin Grittner 2015-07-02 21:13:47 Re: Sudden connection and load average spikes with postgresql 9.3