Re: Cannot allocate memory

From: Hannah Huang <hannah(dot)huang(dot)y(at)gmail(dot)com>
To: mjoigny(at)neteven(dot)com
Cc: pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Re: Cannot allocate memory
Date: 2020-09-23 08:00:16
Message-ID: 3924CB72-367B-45A2-BBBC-419C46C8C0B9@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

> On 23 Sep 2020, at 2:14 pm, Hannah Huang <hannah(dot)huang(dot)y(at)gmail(dot)com> wrote:
>
>
>
>> On 23 Sep 2020, at 12:59 am, JOIGNY Michael @Neteven <mjoigny(at)neteven(dot)com <mailto:mjoigny(at)neteven(dot)com>> wrote:
>>
>> Hi Hannah,
>>
>> Thanks for your feedback.
>>
>> Here are the CPU / IO graphs over the same period.
>>
>> You will find on the graphs the annotation which specifies the moment of the migration.
>>
>>
>> <cpu.png>
>>
>> <io_requests.png>
>>
>> <io_bytes.png>
>>
>> <io_time.png>
>>
>>
>>
>> Le 22/09/2020 à 15:10, Hannah Huang a écrit :
>>>
>>>
>>>> On 22 Sep 2020, at 6:20 pm, JOIGNY Michael @Neteven <mjoigny(at)neteven(dot)com <mailto:mjoigny(at)neteven(dot)com>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I would like to come back to you about my memory problem on postgres 12.
>>>>
>>>> We had the same configuration under postgres 11.8, we disabled JIT (enabled by default under postgres 12) for segfault problems.
>>>>
>>>> To illustrate the change in memory behavior, here is a screenshot before / after migration :
>>>>
>>>> <memory.png>
>>>>
>>>>
>>>> Do you have any idea what could change the behavior in this way? another parameter enabled by default under postgres 12 like JIT?
>>>>
>>>> Regards.
>>>>
>>>> Le 18/09/2020 à 12:03, JOIGNY Michael @Neteven a écrit :
>>>>> Hi Community,
>>>>>
>>>>> I'm asking for your lights because i'm having memory problems with postgres.
>>>>>
>>>>> Examples of logs :
>>>>>
>>>>> FATAL: could not fork new process for connection: Cannot allocate memory could not fork new process for connection: Cannot allocate memory
>>>>> out of memory DETAIL: Failed on request of size 32800 in memory context "HashBatchContext".
>>>>> out of memory DETAIL Failed on request of size 288 in memory context "CacheMemoryContext".
>>>>>
>>>>>
>>>>> We use postgresql (primary/standby) with pgbouncer as a pooler, and repmgr as replication manager.
>>>>>
>>>>> We have ~ 2000 connections at the same time with ~ 20/30 are active. (we need to set a high number of connexion on postgres, because our app uses a lot of different users, and each user on each app server needs multiple and constant connexions).
>>>>>
>>>>> Here is my configuration :
>>>>>
>>>>> system :
>>>>>
>>>>> Debian : 9.13
>>>>> Memory : 380 Go
>>>>> Postgres : 12.4-1.pgdg90+1
>>>>> Pgbouncer : 1.14
>>>>> kernel.shmmax = 202591600640
>>>>> kernel.shmall = 49460840
>>>>> postgres :
>>>>>
>>>>> dynamic_shared_memory_type = posix # the default is the first option
>>>>> max_connections = 2600 # (change requires restart)
>>>>> work_mem = 96MB # min 64kB
>>>>> maintenance_work_mem = 8GB # min 1MB
>>>>> shared_buffers = 64GB # min 128kB
>>>>> temp_buffers = 32MB # min 800kB
>>>>> wal_buffers = 16MB # min 32kB, -1 sets based on shared_buffers
>>>>> effective_cache_size = 270GB
>>>>>
>>>>> pgbouncer :
>>>>>
>>>>> max_client_conn = 6000
>>>>> default_pool_size = 2590
>>>>> reserve_pool_size = 5
>>>>> pool_mode = session
>>>>>
>>>>> Do you think that our parameters are not correct compared to our configuration? Do you have an idea ?
>>>>>
>>>>> Best regards.
>>>>>
>>>>> Michael.
>>>>>
>>> System settings:
>>> After 9.2, PostgreSQL switched to POSIX shared memory. So now it requires fewer bytes of System V shared memory.
>>>
>>> You don’t need to configure SHMMAX to that high value which is ~188GB, same as SHMALL, I would set them back to default value.
>>>
>>> But, from the second email you sent it seems like an issue related to the version upgrade (from 11 to 12). So I don’t think you should change OS parameters at this moment. I would suggest you provide more stats on CPU (system, user, wa,ni) and disk IO - before and after.
>>>
>>> Thanks, Hannah
>
> I can see:
>
> CPU stats: CPU it spends more time on user job and the wait time on IO decreases, time spent on kernal space is about same as before.
> IO stats: read drops dramatically, write slightly increases. This explains why CPU wait on IO decreases.
>
> I would like to know how is the performance of application queries. For example, the before and after execution time of the most time consuming queries. Are they still about the same? Do you have any database performance analytics tools installed? I used to use pgbadger to analyze database logs which would show me the past performance statistics of the database. We now have an understanding about how the OS performance looks like. It would be good to have a general picture of the database performance from the db layer.
>
> Thanks,
> Hannah
>
>
>
>

Continue with my question regards application performance - Does database experience performance degradation? Or just simply more failed SQL with the particular memory error thrown out?

Further question:

How often does the memory errors occur? Are they all caused by one particular SQL or a certain type of SQL?

Would you mind sending out the log within a particular time range?

Thanks,
Suya

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message jian xu 2020-09-23 13:31:27 creating gist index on ltree column failed
Previous Message Hannah Huang 2020-09-23 04:14:10 Re: Cannot allocate memory