Re: ERROR: too many dynamic shared memory segments

From: Nicola Contu <nicola(dot)contu(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>, Alessandro Aste <Alessandro(dot)aste(at)gtt(dot)net>
Subject: Re: ERROR: too many dynamic shared memory segments
Date: 2020-01-29 09:37:09
Message-ID: CAMTZZh2d=e247vAiHTDtSYHzA-hZFXjgxSEiQaLGB9fQ6-HhGw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

This is the error on postgres log of the segmentation fault :

2020-01-21 14:20:29 GMT [] [42222]: [108-1] db=,user= LOG: server process
(PID 2042) was terminated by signal 11: Segmentation fault
2020-01-21 14:20:29 GMT [] [42222]: [109-1] db=,user= DETAIL: Failed
process was running: select pid from pg_stat_activity where query ilike
'REFRESH MATERIALIZED VIEW CONCURRENTLY matview_vrs_request_stats'
2020-01-21 14:20:29 GMT [] [42222]: [110-1] db=,user= LOG: terminating any
other active server processes
2020-01-21 14:20:29 GMT [127.0.0.1(34256)] [2724]: [4-1]
db=cmdv3,user=admin WARNING: terminating connection because of crash of
another server process
2020-01-21 14:20:29 GMT [127.0.0.1(34256)] [2724]: [5-1]
db=cmdv3,user=admin DETAIL: The postmaster has commanded this server
process to roll back the current transaction and exit, because another
server process exited abnormally and possibly corrupted shared memory.
2020-01-21 14:20:29 GMT [127.0.0.1(34256)] [2724]: [6-1]
db=cmdv3,user=admin HINT: In a moment you should be able to reconnect to
the database and repeat your command.
2020-01-21 14:20:29 GMT [127.0.0.1(34026)] [2055]: [5-1]
db=cmdv3,user=admin WARNING: terminating connection because of crash of
another server process

At CentOS level :
Jan 21 14:20:29 usnyh-cmd1.gt-t.net kernel: postmaster[2042]: segfault at 0
ip 000000000048bef4 sp 00007ffdf4955bb0 error 4 in postgres[400000+6c5000]
Jan 21 14:20:29 usnyh-cmd1.gt-t.net pgbouncer: 2020-01-21 14:20:29.905 GMT
[33727] WARNING C-0x21526e8: cmdv3/admin(at)10(dot)151(dot)2(dot)154:39688 pooler error:
server conn crashed?
Jan 21 14:20:29 usnyh-cmd1.gt-t.net pgbouncer: 2020-01-21 14:20:29.905 GMT
[33727] WARNING S-0x1f5ff58: cmdv3/admin(at)127(dot)0(dot)0(dot)1:5432 got packet 'N' from
server when not linked
Jan 21 14:20:29 usnyh-cmd1.gt-t.net pgbouncer: 2020-01-21 14:20:29.905 GMT
[33727] WARNING S-0x25097f0: cmdv3/admin(at)127(dot)0(dot)0(dot)1:5432 got packet 'N' from
server when not linked
Jan 21 14:20:29 usnyh-cmd1.gt-t.net pgbouncer: 2020-01-21 14:20:29.906 GMT
[33727] WARNING S-0x2508b60: cmdv3/admin(at)127(dot)0(dot)0(dot)1:5432 got packet 'N' from
server when not linked
Jan 21 14:20:29 usnyh-cmd1.gt-t.net pgbouncer: 2020-01-21 14:20:29.981 GMT
[33727] ERROR S: login failed: FATAL: the database system is in recovery
mode

> If you're on Linux, you can probably see them with "ls /dev/shm".

I see a lot of files there, and doing a cat they are empty. What can I do
with them?

Those are two different problems I guess, but they are related because
right before the Segmentation Fault I see a lot of shared segment errors in
the postgres log.

Il giorno mer 29 gen 2020 alle ore 10:09 Thomas Munro <
thomas(dot)munro(at)gmail(dot)com> ha scritto:

> On Wed, Jan 22, 2020 at 4:06 AM Nicola Contu <nicola(dot)contu(at)gmail(dot)com>
> wrote:
> > after a few months, we started having this issue again.
> > So we revert the work_mem parameter to 600MB instead of 2GB.
> > But the issue is still there. A query went to segmentation fault, the DB
> went to recovery mode and our app went to read only for a few minutes.
>
> Hi Nicola,
> Hmm, a segmentation fault sounds like a different problem. Can you
> please share the exact error messages from PostgreSQL and OS logs?
>
> > I understand we can increase max_connections so we can have many more
> segments.
> >
> > My question is : is there a way to understand the number of segments we
> reached?
>
> If you're on Linux, you can probably see them with "ls /dev/shm".
>
> > Currently we have 220 max_conn so as your formula is 64 + 2* 220 we have
> about 500 shared segments.
> > We would like to increase that number to 300 or 400 but would be great
> to understand if there is a way to make sure we will solve the issue as it
> requires a restart of the service.
> >
> > I know you were also talking about a redesign this part in PostgreSQL.
> Do you know if anything has changed in any of the newer versions after 11.5?
>
> It's possible that we should increase a couple of constants used the
> formula -- I'll look into that again. But first I'd like to see if
> we're even investigating the right problem here.
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Julian Backes 2020-01-29 10:23:57 Re: ERROR: too many dynamic shared memory segments
Previous Message Thomas Munro 2020-01-29 09:09:16 Re: ERROR: too many dynamic shared memory segments