Re: MusicBrainz postgres performance issues

From: "michael(at)sqlexec(dot)com" <michael(at)sqlexec(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: MusicBrainz postgres performance issues
Date: 2015-03-15 23:55:23
Message-ID: 55061BEB.5070209@sqlexec.com
Lists: pgsql-performance

Why is 500 connections "insane"? We have 32 CPUs with 96GB RAM and 3000 max
connections, and we are doing fine, even when our concurrent connection
peaks hit around 4500. At a previous site, we were using 2000 max
connections on 24 CPUs and 64GB RAM, with about 1500 max concurrent
connections. So I wouldn't be too hasty in saying that more than 500 is
asking for trouble. Just make sure your kernel resources are set high
enough to sustain it (SHMMAX, SHMALL, SEMMNI, and ulimits), and that there
is enough RAM for work_mem.
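If it helps, here is a rough sanity check for those limits. The commands are
stock Linux; the numbers in the comments are illustrative assumptions, not
recommendations:

sysctl kernel.shmmax kernel.shmall   # SysV shared memory limits; pre-9.3 these must cover shared_buffers
sysctl kernel.sem                    # SEMMSL SEMMNS SEMOPM SEMMNI; postgres needs roughly one semaphore set per 16 backends
ulimit -n                            # open file descriptors per process
ulimit -u                            # max user processes, must comfortably exceed max_connections

# Worst case, every sort/hash node in every backend can use up to work_mem,
# so e.g. 500 connections x 64MB is roughly 32GB before shared_buffers and
# the OS page cache get anything.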
> Tomas Vondra <mailto:tomas(dot)vondra(at)2ndquadrant(dot)com>
> Sunday, March 15, 2015 7:41 PM
> On 15.3.2015 23:47, Andres Freund wrote:
>> On 2015-03-15 12:25:07 -0600, Scott Marlowe wrote:
>>> Here's the problem with a large shared_buffers on a machine that's
>>> getting pushed into swap. It starts to swap BUFFERs. Once buffers
>>> start getting swapped you're not just losing performance, that huge
>>> shared_buffers is now working against you because what you THINK are
>>> buffers in RAM to make things faster are in fact blocks on a hard
>>> drive being swapped in and out during reads. It's the exact opposite
>>> of fast. :)
>> IMNSHO that's tackling things from the wrong end. If 12GB of shared
>> buffers drive your 48GB dedicated OLTP postgres server into swapping
>> out actively used pages, the problem isn't the 12GB of shared
>> buffers, but that you require so much memory for other things. That
>> needs to be fixed.
>
> I second this opinion.
>
> As was already pointed out, the 500 connections is rather insane
> (assuming the machine does not have hundreds of cores).
>
> If there are memory pressure issues, it's likely because many queries
> are performing memory-expensive operations at the same time (might even
> be a bad estimate causing hashagg to use much more than work_mem).
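(A quick way to catch that, sketched here with a made-up query: run the
suspect statement under EXPLAIN ANALYZE and compare estimated vs. actual rows
on the HashAggregate node.)

# sketch only -- the database and query below are invented for illustration
psql -d musicbrainz -c "EXPLAIN ANALYZE SELECT artist, count(*) FROM recording GROUP BY artist;"
# If the HashAggregate's estimated row count is far below the actual rows, its
# hash table can grow well past work_mem, because hash aggregation builds the
# whole table in memory instead of spilling to disk.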
>
>
>> But! We haven't even established that swapping is an actual problem
>> here. The ~2GB of swapped out memory could just as well be the java raid
>> controller management monstrosity or something similar. Those pages
>> won't ever be used and thus can better be used to buffer IO.
>>
>> You can check what's actually swapped out using:
>> grep ^VmSwap /proc/[0-9]*/status|grep -v '0 kB'
>>
>> For swapping to be actually harmful you need to have pages that are
>> regularly swapped in. vmstat will tell.
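Something as simple as this shows it live:

vmstat 5
# "si"/"so" are the KB/s swapped in/out; sustained nonzero "si" while the
# database is under load is the harmful pattern, an occasional "so" is not.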
>
> I've already asked for vmstat logs, so let's wait.
>
>> In a concurrent OLTP workload (~450 established connections do
>> suggest that) with a fair amount of data keeping the hot data set in
>> shared_buffers can significantly reduce problems. Constantly
>> searching for victim buffers isn't a nice thing, and that will happen
>> if your most frequently used data doesn't fit into s_b. On the other
>> hand, if your data set is so large that even the hottest part doesn't
>> fit into memory (perhaps because there's no hottest part as there's
>> no locality at all), a smaller shared buffers can make things more
>> efficient, because the search for replacement buffers is cheaper with
>> a smaller shared buffers setting.
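If the pg_buffercache contrib extension is available, a rough view of what
actually sits in shared_buffers looks something like this (sketch; assumes the
default 8kB block size and superuser access):

psql -c "CREATE EXTENSION IF NOT EXISTS pg_buffercache;"
psql -c "SELECT c.relname, count(*) * 8 / 1024 AS mb_cached
           FROM pg_buffercache b
           JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
          GROUP BY c.relname
          ORDER BY mb_cached DESC
          LIMIT 10;"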
>
> I've met many systems with max_connections values this high, and it was
> mostly idle connections because of separate connection pools on each
> application server. So mostly idle (90% of the time), but at peak time
> all the application servers want to do stuff at the same time. And it
> all goes KABOOOM! just like here.
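That pattern is easy to confirm from pg_stat_activity (9.2 and later):

psql -c "SELECT state, count(*) FROM pg_stat_activity GROUP BY state ORDER BY count(*) DESC;"
# Hundreds of rows in state 'idle' next to a handful of 'active' ones is the
# classic signature of oversized per-app-server connection pools.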
>
>
> Andres Freund <mailto:andres(at)2ndquadrant(dot)com>
> Sunday, March 15, 2015 1:46 PM
>
> That imo doesn't really have anything to do with it. The primary benefit
> of a BBU with writeback caching is accelerating (near-)synchronous
> writes. Like the WAL. But, besides influencing the default for
> wal_buffers, a larger shared_buffers doesn't change the amount of
> synchronous writes.
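One way to check that the BBU really is absorbing those synchronous writes is
pg_test_fsync from contrib; the path below is only an example:

# run it against the filesystem that holds pg_xlog (path is an assumption)
pg_test_fsync -f /var/lib/postgresql/9.4/main/pg_xlog/fsync_test
# With a working writeback cache behind a BBU, fsync ops/sec come out in the
# thousands; a bare spinning disk manages only one or two hundred.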
>
> Greetings,
>
> Andres Freund
>
> Ilya Kosmodemiansky <mailto:ilya(dot)kosmodemiansky(at)postgresql-consulting(dot)com>
> Sunday, March 15, 2015 1:42 PM
> On Sun, Mar 15, 2015 at 8:20 PM, Andres Freund<andres(at)2ndquadrant(dot)com> wrote:
>> On 2015-03-15 11:09:34 -0600, Scott Marlowe wrote:
>>> shared_mem of 12G is almost always too large. I'd drop it down to ~1G or so.
>> I think that's outdated wisdom, i.e. not generally true.
>
> Quite agreed. With the note that a properly configured controller with a BBU is needed.
>
>
>> A new enough kernel, a sane filesystem
>> (i.e. not ext3) and sane checkpoint configuration takes care of most of
>> the other disadvantages.
>
> Most likely. And better to make sure the filesystem is mounted without barriers.
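(Quick way to check, with an assumed mount point:)

grep ' /var/lib/postgresql ' /proc/mounts   # shows the mount options in effect
# "nobarrier" / "barrier=0" is only safe when a BBU or other non-volatile
# cache protects the controller's write cache across a power loss.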
>
> And I agree with Scott - 64MB work_mem AND max_connections = 500 is a
> recipe for disaster. The problem could be pgbouncer's session mode.
> If you can work with transaction mode, do it.
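For what it's worth, the relevant pgbouncer.ini lines would look roughly like
this (pool sizes are illustrative, not tuned values):

pool_mode = transaction     ; server connections are reused between transactions
default_pool_size = 40      ; actual postgres backends per database/user pair
max_client_conn = 2000      ; clients can stay numerous, postgres only sees the pool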
>
>
> Best regards,
> Ilya Kosmodemiansky,
>
> PostgreSQL-Consulting.com
> tel. +14084142500
> cell. +4915144336040
> ik(at)postgresql-consulting(dot)com
>
>
