Re: ERROR: too many dynamic shared memory segments

From: Nicola Contu <nicola(dot)contu(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>, Alessandro Aste <Alessandro(dot)aste(at)gtt(dot)net>
Subject: Re: ERROR: too many dynamic shared memory segments
Date: 2020-01-21 16:14:15
Message-ID: CAMTZZh1cjdHn_QC5N=s5CPZnB8Kp8rxZu66yTuhK4t_M0O0URQ@mail.gmail.com
Lists: pgsql-general

We also reverted this parameter:

cmdv3=# show max_parallel_workers_per_gather;
max_parallel_workers_per_gather
---------------------------------
2
(1 row)

It was set to 8.
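
For anyone who wants to do the same, one way to revert it cluster-wide
(a reload is enough for this parameter, no restart needed):

ALTER SYSTEM SET max_parallel_workers_per_gather = 2;
SELECT pg_reload_conf();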

On Tue, 21 Jan 2020 at 16:06, Nicola Contu <
nicola(dot)contu(at)gmail(dot)com> wrote:

> Hey Thomas,
> after a few months, we started having this issue again.
> So we reverted the work_mem parameter to 600MB instead of 2GB.
> But the issue is still there: a query hit a segmentation fault, the DB
> went into recovery mode, and our app went read-only for a few minutes.
>
> I understand we can increase max_connections so we can have many more
> segments.
>
> My question is: is there a way to see how many segments we have reached?
> Currently we have max_connections = 220, so by your formula (64 + 2 * 220)
> we are allowed about 500 shared segments.
> We would like to raise max_connections to 300 or 400, but it would be
> great to know beforehand whether that will solve the issue, since
> changing it requires a restart of the service.
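>
> As a sanity check, the limit can be computed with plain SQL from the
> formula above (this is just arithmetic, not a server-side counter):
>
> SELECT 64 + 2 * current_setting('max_connections')::int AS dsm_slot_limit;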
>
> I know you were also talking about redesigning this part of PostgreSQL.
> Do you know if anything has changed in any of the newer versions after
> 11.5?
>
> Thanks a lot,
> Nicola
>
>
>
>
>
> On Thu, 12 Sep 2019 at 01:01, Thomas Munro <
> thomas(dot)munro(at)gmail(dot)com> wrote:
>
>> On Wed, Sep 11, 2019 at 11:20 PM Nicola Contu <nicola(dot)contu(at)gmail(dot)com>
>> wrote:
>> > If the error persists, I will try to revert work_mem.
>> > Thanks a lot
>>
>> Hi Nicola,
>>
>> It's hard to say exactly what the cause of the problem is in your case
>> and how to avoid it, without knowing what your query plans look like.
>> PostgreSQL allows 64 + 2 * max_connections segments to exist at a time,
>> and it needs a number of them that depends on work_mem (in the case of
>> Parallel Hash Join and Parallel Bitmap Index Scan), and also depends
>> on the number of Gather nodes that appear in the plan, which in some
>> unusual cases can result from partitioning.
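>>
>> If you want to eyeball the knobs involved on your system, a minimal
>> sketch (pg_settings is a standard view, nothing version-specific):
>>
>> SELECT name, setting
>> FROM pg_settings
>> WHERE name IN ('max_connections', 'work_mem',
>>                'max_parallel_workers_per_gather', 'max_worker_processes');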
>>
>> I've seen people reach this error by running a lot of parallel
>> queries concurrently. If that's the cause, then you can definitely
>> get some relief by turning work_mem down, or by turning
>> max_connections up (even if you don't actually want to allow more
>> connections, because it feeds into the formula that decides the
>> DSM segment limit). We should probably adjust some of the internal
>> constants to give us more slots, to avoid that problem, as discussed
>> here:
>>
>>
>> https://www.postgresql.org/message-id/flat/CA%2BhUKGL6H2BpGbiF7Lj6QiTjTGyTLW_vLR%3DSn2tEBeTcYXiMKw%40mail.gmail.com
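>>
>> A sketch of both adjustments, with example values only (ALTER SYSTEM
>> writes to postgresql.auto.conf; max_connections only takes effect
>> after a server restart, while work_mem is picked up on reload):
>>
>> ALTER SYSTEM SET max_connections = 400;  -- needs a restart
>> ALTER SYSTEM SET work_mem = '600MB';     -- picked up on reload
>> SELECT pg_reload_conf();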
>>
>> I've also seen people reach this error by somehow coming up with
>> plans that had a very large number of Gather nodes in them,
>> corresponding to partitions; that's probably a bad plan (it'd
>> presumably be better to terminate parallelism higher up in the plan,
>> but these plans do seem to exist in the wild; I don't recall exactly
>> why). I think we need a bit of a redesign so that if there are
>> multiple Gather nodes, they share the same main DSM segment, instead
>> of blowing through this limit.
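>>
>> To check whether your own plans fall into this trap, here is a minimal
>> sketch that counts Gather nodes in a plan; the table name is a
>> placeholder, so substitute one of your real queries:
>>
>> DO $$
>> DECLARE
>>   line text;
>>   gathers int := 0;
>> BEGIN
>>   FOR line IN EXECUTE
>>     'EXPLAIN SELECT count(*) FROM my_partitioned_table'
>>   LOOP
>>     -- matches both "Gather" and "Gather Merge" nodes
>>     IF line LIKE '%Gather%' THEN
>>       gathers := gathers + 1;
>>     END IF;
>>   END LOOP;
>>   RAISE NOTICE 'Gather nodes in plan: %', gathers;
>> END $$;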
>>
>> --
>> Thomas Munro
>> https://enterprisedb.com
>>
>
