Re: need help on memory allocation

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Rambabu V <ram(dot)wissen(at)gmail(dot)com>
Cc: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, "pgsql-performa(dot)" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: need help on memory allocation
Date: 2018-01-23 22:36:53
Message-ID: CAMkU=1zG0EP36=rGWCwrCO6OLCGJdcCuUbv7yDj5AnqvpRZ=Ug@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Tue, Jan 23, 2018 at 5:59 AM, Rambabu V <ram(dot)wissen(at)gmail(dot)com> wrote:

> > cat PostgreSQL-2018-01-23_060000.csv|grep FATAL

What about ERROR, not just FATAL? Or grep for "out of memory"

>> *$ free -mh*
>> total used free shared buffers cached
>> Mem: 58G 58G 358M 16G 3.6M 41G
>> -/+ buffers/cache: 16G 42G
>> Swap: 9.5G 687M 8.9G
>>
>
This does not seem like it should be a problem. Is this data collected
near the time of the failure?

> work_mem = 256MB # min 64kB
>> max_connections = 600
>>
>
These look pretty high, especially in combination. Why do you need that
number of connections? Could you use a connection pooler instead? Or do
just have an application bug (leaked connection handles) that needs to be
fixed? Why do you need that amount of work_mem?

> *ps -ef|grep postgres|grep idle|wc -l*
>> 171
>>
>> *ps -ef|grep postgres|wc -l*
>> 206
>>
>
How close to the time of the problem was this recorded? How many of the
idle are 'idle in transaction'?

>> PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
>> 109063 postgres 20 0 16.7G 16.4G 16.3G S 0.0 27.8 39:55.61
>> postgres: test sss 10.20.2.228(55174) idle
>> 24910 postgres 20 0 16.7G 16.4G 16.3G S 0.0 27.8 27:45.35
>> postgres: testl sss 10.20.2.228(55236) idle
>> 115539 postgres 20 0 16.7G 16.4G 16.3G S 0.0 27.8 28:22.89
>> postgres: test sss 10.20.2.228(55184) idle
>> 9816 postgres 20 0 16.7G 16.4G 16.3G S 0.0 27.8 40:19.57
>> postgres: test sss 10.20.2.228(55216) idle
>>
>
How close to the time of the problem was this recorded? Nothing here seems
to be a problem, because almost all the memory they have resident is shared
memory.

It looks like all your clients decide to run a memory hungry query
simultaneously, consume a lot of work_mem, and cause a problem. Then by
the time you notice the problem and start collecting information, they are
done and things are back to normal.

Cheers,

Jeff

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Claudio Freire 2018-01-23 22:59:23 Re: 8.2 Autovacuum BUG ?
Previous Message Laurenz Albe 2018-01-23 18:36:32 Re: need help on memory allocation