From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Preston Hagar <prestonh(at)gmail(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org General" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Question about memory usage
Date: 2014-01-10 18:19:47
Message-ID: 14275.1389377987@sss.pgh.pa.us

Preston Hagar <prestonh(at)gmail(dot)com> writes:
>>> tl;dr: Moved from 8.3 to 9.3 and are now getting out of memory errors
>>> despite the server now having 32 GB instead of 4 GB of RAM and the workload
>>> and number of clients remaining the same.

> Here are a couple of examples from the incident we had this morning:
> 2014-01-10 06:14:40 CST 30176 LOG: could not fork new process for
> connection: Cannot allocate memory
> 2014-01-10 06:14:40 CST 30176 LOG: could not fork new process for
> connection: Cannot allocate memory

That's odd ... ENOMEM from fork() suggests that you're under system-wide
memory pressure.
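
If this is a Linux box, one thing worth checking is strict overcommit
accounting: with vm.overcommit_memory = 2, fork() can fail with ENOMEM
even when plenty of memory looks free, because the kernel caps the
total committed address space. Something like

    sysctl vm.overcommit_memory vm.overcommit_ratio
    grep -i commit /proc/meminfo     # CommitLimit vs. Committed_AS

would show where you stand; if Committed_AS is bumping up against
CommitLimit when the errors appear, that would explain fork-time
failures despite apparently-free memory.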

> [ memory map dump showing no remarkable use of memory at all ]
> 2014-01-10 06:18:46 CST 10.1.1.6 16669 [unknown] production
> 10.1.1.6(36680)ERROR: out of memory
> 2014-01-10 06:18:46 CST 10.1.1.6 16669 [unknown] production
> 10.1.1.6(36680)DETAIL: Failed on request of size 500.

I think that what you've got here isn't really a Postgres issue, but
a system-level configuration issue: the kernel is being unreasonably
stingy about giving out memory, and it's not clear why.

It might be worth double-checking that the postmaster is not being
started under restrictive ulimit settings; though offhand I don't
see how that theory could account for fork-time failures, since
the ulimit memory limits are per-process.
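
On Linux you can see the limits actually in effect for the running
postmaster (as opposed to whatever your login shell reports) with

    cat /proc/PID/limits

where PID is the postmaster's process ID (the first line of
postmaster.pid in the data directory).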

Other than that, you need to burrow around in the kernel settings
and see if you can find something there that's limiting how much
memory it will give to Postgres. It might also be worth watching
the kernel log when one of these problems starts. Plain old "top"
might also be informative as to how much memory is being used.
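
During one of these episodes, something along these lines would be a
reasonable start on a Linux machine:

    dmesg | tail -n 50     # recent kernel messages, e.g. OOM-killer activity
    free -m                # overall memory and swap picture
    top                    # press M to sort processes by memory use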

>> We had originally copied our shared_buffers, work_mem, wal_buffers and
>> other similar settings from our old config, but after getting the memory
>> errors have tweaked them to the following:
>
> shared_buffers = 7680MB
> temp_buffers = 12MB
> max_prepared_transactions = 0
> work_mem = 80MB
> maintenance_work_mem = 1GB
> wal_buffers = 8MB
> max_connections = 350

That seems like a dangerously large work_mem for so many connections;
but unless all the connections were executing complex queries, which
doesn't sound to be the case, that isn't the immediate problem.
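
To put some numbers on that: work_mem is a per-sort/per-hash limit,
not a per-connection cap, so the worst case here is at least

    350 connections * 80MB work_mem = 28000MB, or about 27GB

and a single complex query can consume several multiples of work_mem.
Stack that on top of 7.5GB of shared_buffers and you could run a 32GB
machine out of memory, but only if most of the connections were doing
large sorts or hashes at the same time.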

>> The weird thing is that our old server had 1/8th the RAM, was set to
>> max_connections = 600 and had the same clients connecting in the same way
>> to the same databases and we never saw any errors like this in the several
>> years we have been using it.

This reinforces the impression that something's misconfigured at the
kernel level on the new server.
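
Since the old server ran for years with more connections allowed, it
might be quickest to just diff the memory-related kernel settings
between the two boxes, along the lines of

    # run on both servers and compare the output
    sysctl -a 2>/dev/null | grep -E 'overcommit|shm|swappiness'

(the exact setting names vary across kernel versions, so treat that
pattern as a starting point, not an exhaustive list).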

regards, tom lane
