From: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
---|---|
To: | Christopher Nielsen <cnielsen(at)atlassian(dot)com> |
Cc: | Pg Bugs <pgsql-bugs(at)postgresql(dot)org> |
Subject: | Re: BUG #9342: CPU / Memory Run-away |
Date: | 2014-02-27 00:33:22 |
Message-ID: | CAMkU=1yNVunnJrPHE-BwtT3TH3UjGGRpoDteeQqDg=TSLpA08Q@mail.gmail.com |
Lists: | pgsql-bugs |
On Wed, Feb 26, 2014 at 4:10 PM, Christopher Nielsen <cnielsen(at)atlassian(dot)com> wrote:
> Thank you, Heikki,
>
>>> We have core and stack traces available, and would love to collaborate
>>> with anyone who has time to look into the issue with us, anytime.
>>
>> Feel free to post the stack traces on this list, so that everyone can
>> have a look.
>
> Linked is a strace / status / syscall list of what our postgres processes
> were doing, at the time of the run-away.
>
> Archive on Google Drive<https://drive.google.com/a/atlassian.com/file/d/0B9wdVVUs5uWVTVVSSXpZN0JxX3M/edit?usp=sharing> (Will
> bonk on "approve" on request)
>
> Also included, is a copy of the postgresql.conf, for reference.
>
> It seemed funny that a db-wide slow-down should happen, independent of
> increased query / app load.
>
> Thanks again for any insight, or opinions, into what could be going on,
>
Hi Chris,
It is the very high max_connections setting that allows the problem to
happen. If you only have 80 cores, nothing good can come of allowing 3000
connections. Once the connections spend more of their time fighting each
other over internal locks (spinlocks and lwlocks) than doing what they
came for, you will never recover, because queries continue to come in
faster than they can be handled. So by allowing too many connections, you
allow a (probably) transient load spike to turn into a permanent overload
condition.
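To make that feedback loop concrete, here is a toy discrete-time model in Python. It is only a sketch: the 80 cores and the 3000-connection limit come from this thread, but the arrival rates, the spike length, and the `contention` coefficient are invented for illustration, and the throughput formula is an assumed stand-in for lock contention, not a measurement of PostgreSQL.

```python
def simulate(max_active, ticks=600, cores=80, base_arrivals=60,
             spike_arrivals=400, spike_start=20, spike_end=40,
             contention=0.001):
    """Return the backlog of unfinished queries after each tick.

    All rates are hypothetical. `max_active` plays the role of the number
    of sessions allowed to run inside the server at once.
    """
    backlog = 0
    history = []
    for t in range(ticks):
        # A transient spike between spike_start and spike_end, else normal load.
        in_spike = spike_start <= t < spike_end
        backlog += spike_arrivals if in_spike else base_arrivals

        # Sessions beyond max_active wait outside the server (e.g. in a pooler).
        active = min(backlog, max_active)

        # Assumption: at most `cores` queries make progress per tick, and
        # every active session beyond `cores` wastes time on lock contention.
        excess = max(0, active - cores)
        throughput = min(active, cores) / (1 + contention * excess)

        backlog -= min(backlog, int(throughput))
        history.append(backlog)
    return history
```

With these made-up numbers, `simulate(max_active=80)` builds a backlog during the spike and then drains it back to zero, while `simulate(max_active=3000)` keeps falling further behind after the spike has ended, because the completion rate under contention drops below the normal arrival rate.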
It looks like you need a connection pooler (or more than one) in front of
your database.
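The core idea of a pooler can be sketched in a few lines of Python. This is not how any particular pooler is implemented; `BoundedPool`, `connect`, and `demo` are hypothetical names, and the "connections" are placeholder objects rather than real database sessions. The point is only that clients beyond the pool size wait in line instead of piling into the server.

```python
import queue
import threading
import time
from contextlib import contextmanager


class BoundedPool:
    """Hands out at most `size` connections; extra callers block in line."""

    def __init__(self, connect, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        # Blocks while every connection is checked out; raises queue.Empty
        # if a timeout is given and expires.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)


def demo(clients=50, size=4):
    """Run `clients` threads through a pool and return peak concurrent use."""
    pool = BoundedPool(connect=object, size=size)  # object() is a stand-in
    lock = threading.Lock()
    in_use = 0
    peak = 0

    def client():
        nonlocal in_use, peak
        with pool.connection():
            with lock:
                in_use += 1
                peak = max(peak, in_use)
            time.sleep(0.001)  # pretend to run a query
            with lock:
                in_use -= 1

    threads = [threading.Thread(target=client) for _ in range(clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak
```

Calling `demo()` runs 50 client threads against a pool of 4 and returns the largest number of connections in use at once, which can never exceed the pool size. In practice you would use an existing pooler rather than writing one.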
Cheers,
Jeff