Re: MusicBrainz postgres performance issues

From: "michael(at)sqlexec(dot)com" <michael(at)sqlexec(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: MusicBrainz postgres performance issues
Date: 2015-03-16 00:17:44
Message-ID: 55062128.7060804@sqlexec.com
Lists: pgsql-performance

I agree with your counter-argument about how a high max_connections "can"
cause problems, but max_connections may not be part of the problem here.
There's a lot of "it depends" in that, based on workload details, number
of CPUs, RAM, etc.

I'm still waiting to find out how many CPUs are on this DB server. Did I
miss it somewhere in the email thread below?
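
In the meantime, two quick checks would settle it (assuming a Linux box
and PostgreSQL 9.2 or later, where pg_stat_activity grew the "state"
column):

  # number of CPU cores on the server
  nproc

  # how many connections are actually doing work vs. sitting idle
  psql -c "SELECT state, count(*) FROM pg_stat_activity GROUP BY state ORDER BY 2 DESC;"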

> Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
> Sunday, March 15, 2015 8:07 PM
>
> If all the connections are active at the same time (i.e. running
> queries), they have to share the 32 cores somehow. Or I/O, if that's the
> bottleneck.
>
> In other words, you're not improving the throughput of the system,
> you're merely increasing latencies. And it may easily happen that the
> latency increase is not linear, but grows faster - because of locking,
> context switches and other process-related management.
>
> Imagine you have a query taking 1 second of CPU time. If you have 64
> such queries running concurrently on 32 cores, each gets only 1/2 a CPU
> and so takes >=2 seconds. With 500 queries, it's >=15 seconds per query, etc.
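
(Spelling out the arithmetic there: 500 queries each needing 1 CPU-second,
spread across 32 cores, works out to 500/32 ≈ 15.6 seconds of wall-clock
time per query, and that is before locking and context switching make it
worse.)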
>
> If those queries are acquiring the same locks (e.g. updating the same
> rows, or so), you can imagine what happens ...
>
> Also, if part of the query required a certain amount of memory for part
> of the plan, it now holds that memory for much longer too. That only
> increases the chance of OOM issues.
>
> It may work fine when most of the connections are idle, but it makes
> storms like this possible.
>
>
> michael(at)sqlexec(dot)com
> Sunday, March 15, 2015 7:55 PM
> Why are 500 connections "insane"? We've got 32 CPUs with 96GB and 3000 max
> connections, and we are doing fine, even when hitting our max
> concurrent connection peaks around 4500. At a previous site, we were
> using 2000 max connections on 24 CPUs and 64GB RAM, with about 1500 max
> concurrent connections. So I wouldn't be too hasty in saying more
> than 500 is asking for trouble. Just make sure your kernel resources
> are set high enough to sustain it (SHMMAX, SHMALL, SEMMNI, and
> ulimits), and that you have enough RAM for work_mem.
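
For what it's worth, a quick way to eyeball those kernel limits on Linux
before raising max_connections (these are stock sysctls and shell
builtins, nothing PostgreSQL-specific):

  # System V shared memory limits (SHMMAX, SHMALL)
  sysctl kernel.shmmax kernel.shmall

  # semaphore limits, printed as SEMMSL SEMMNS SEMOPM SEMMNI
  sysctl kernel.sem

  # per-user limits for the postgres account
  ulimit -n   # open files
  ulimit -u   # max user processes
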
> Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
> Sunday, March 15, 2015 7:41 PM
> On 15.3.2015 23:47, Andres Freund wrote:
>> On 2015-03-15 12:25:07 -0600, Scott Marlowe wrote:
>>> Here's the problem with a large shared_buffers on a machine that's
>>> getting pushed into swap. It starts to swap BUFFERs. Once buffers
>>> start getting swapped you're not just losing performance, that huge
>>> shared_buffers is now working against you because what you THINK are
>>> buffers in RAM to make things faster are in fact blocks on a hard
>>> drive being swapped in and out during reads. It's the exact opposite
>>> of fast. :)
>> IMNSHO that's tackling things from the wrong end. If 12GB of shared
>> buffers drive your 48GB dedicated OLTP postgres server into swapping
>> out actively used pages, the problem isn't the 12GB of shared
>> buffers, but that you require so much memory for other things. That
>> needs to be fixed.
>
> I second this opinion.
>
> As was already pointed out, 500 connections is rather insane
> (assuming the machine does not have hundreds of cores).
>
> If there are memory pressure issues, it's likely because many queries
> are performing memory-expensive operations at the same time (might even
> be a bad estimate causing hashagg to use much more than work_mem).
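
(A hedged way to catch that, when a suspect query can be re-run by hand:
compare estimated against actual row counts in EXPLAIN ANALYZE output,
since a hash aggregate sized from a bad estimate is not limited by
work_mem. Here some_table/some_col are placeholders for the real query:

  psql -c "EXPLAIN ANALYZE SELECT some_col, count(*) FROM some_table GROUP BY some_col;"

A big gap between estimated and actual rows on the HashAggregate node is
the tell.)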
>
>
>> But! We haven't even established that swapping is an actual problem
>> here. The ~2GB of swapped out memory could just as well be the Java RAID
>> controller management monstrosity or something similar. Those pages
>> won't ever be used and thus can better be used to buffer IO.
>>
>> You can check what's actually swapped out using:
>> grep ^VmSwap /proc/[0-9]*/status | grep -v ' 0 kB'
>>
>> For swapping to be actually harmful you need to have pages that are
>> regularly swapped in. vmstat will tell.
>
> I've already asked for vmstat logs, so let's wait.
>
>> In a concurrent OLTP workload (~450 established connections do
>> suggest that) with a fair amount of data keeping the hot data set in
>> shared_buffers can significantly reduce problems. Constantly
>> searching for victim buffers isn't a nice thing, and that will happen
>> if your most frequently used data doesn't fit into s_b. On the other
>> hand, if your data set is so large that even the hottest part doesn't
>> fit into memory (perhaps because there's no hottest part as there's
>> no locality at all), a smaller shared buffers can make things more
>> efficient, because the search for replacement buffers is cheaper with
>> a smaller shared buffers setting.
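
A rough sanity check on whether the hot set fits in shared_buffers, using
the stock pg_stat_database view (this only counts shared_buffers hits,
not the OS page cache, so treat it as an approximation):

  psql -c "SELECT blks_hit, blks_read,
                  round(blks_hit * 100.0 / nullif(blks_hit + blks_read, 0), 1) AS hit_pct
           FROM pg_stat_database WHERE datname = current_database();"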
>
> I've met many systems with max_connections values this high, and it was
> mostly idle connections because of separate connection pools on each
> application server. So mostly idle (90% of the time), but at peak time
> all the application servers want to do stuff at the same time. And it
> all goes KABOOOM! just like here.
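
(The usual fix for that shape of failure is a transaction-mode pooler
between the application servers and PostgreSQL, e.g. PgBouncer. A minimal
sketch of the idea, with placeholder names and untuned sizes:

  # append to pgbouncer.ini (placeholders, not recommendations)
  cat >> pgbouncer.ini <<'EOF'
  [databases]
  appdb = host=127.0.0.1 dbname=appdb

  [pgbouncer]
  pool_mode = transaction
  max_client_conn = 4000
  default_pool_size = 64
  EOF

That way thousands of client connections share a few dozen real
backends.)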
>
>
> Andres Freund <andres(at)2ndquadrant(dot)com>
> Sunday, March 15, 2015 6:47 PM
>
> IMNSHO that's tackling things from the wrong end. If 12GB of shared
> buffers drive your 48GB dedicated OLTP postgres server into swapping out
> actively used pages, the problem isn't the 12GB of shared buffers, but
> that you require so much memory for other things. That needs to be
> fixed.
>
> But! We haven't even established that swapping is an actual problem
> here. The ~2GB of swapped out memory could just as well be the Java RAID
> controller management monstrosity or something similar. Those pages
> won't ever be used and thus can better be used to buffer IO.
>
> You can check what's actually swapped out using:
> grep ^VmSwap /proc/[0-9]*/status | grep -v ' 0 kB'
>
> For swapping to be actually harmful you need to have pages that are
> regularly swapped in. vmstat will tell.
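
Usage note on that: run

  vmstat 1

and watch the si/so columns. Sustained nonzero si (pages being swapped
back in) under load is the harmful case; occasional so on cold pages is
not.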
>
> In a concurrent OLTP workload (~450 established connections do suggest
> that) with a fair amount of data keeping the hot data set in
> shared_buffers can significantly reduce problems. Constantly searching
> for victim buffers isn't a nice thing, and that will happen if your most
> frequently used data doesn't fit into s_b. On the other hand, if your
> data set is so large that even the hottest part doesn't fit into memory
> (perhaps because there's no hottest part as there's no locality at all),
> a smaller shared buffers can make things more efficient, because the
> search for replacement buffers is cheaper with a smaller shared buffers
> setting.
>
> Greetings,
>
> Andres Freund
>
> Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
> Sunday, March 15, 2015 2:25 PM
>
> Here's the problem with a large shared_buffers on a machine that's
> getting pushed into swap. It starts to swap BUFFERs. Once buffers
> start getting swapped you're not just losing performance, that huge
> shared_buffers is now working against you because what you THINK are
> buffers in RAM to make things faster are in fact blocks on a hard
> drive being swapped in and out during reads. It's the exact opposite
> of fast. :)
>
>
