Re: Many queries stuck for waiting MultiXactOffsetControlLock on PostgreSQL 9.6.1 replica

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: Rui An <rueian(dot)huang(at)gmail(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: Many queries stuck for waiting MultiXactOffsetControlLock on PostgreSQL 9.6.1 replica
Date: 2021-08-02 04:57:03
Message-ID: 781DEB1E-4322-417B-9EB3-C93F34782D07@yandex-team.ru
Lists: pgsql-bugs

Hi!

> On 1 Aug 2021, at 00:30, Rui An <rueian(dot)huang(at)gmail(dot)com> wrote:
>
> Hi, I recently encountered the problem described in the title on some of my PostgreSQL 9.6.1 hot standby replicas when serving read-only queries.
>
> The problem leaves a lot of PostgreSQL processes stuck waiting for MultiXactOffsetControlLock, and they eventually exhaust the max_connections quota.
>
> For example:
> mydb=# select count(*) from pg_stat_activity where state = 'active' and wait_event = 'MultiXactOffsetControlLock';
> count
> ------
> 956
> (1 row)
>
> I have tried using pg_terminate_backend to kill the stuck processes, but it does not help. They remain stuck waiting for MultiXactOffsetControlLock even though pg_terminate_backend(pid) returns true.
>
> Currently, the only thing I can do to resolve the problem is to restart my replicas.
>
> Can someone help me find out how the problem occurred, and whether there is any other way to resolve it?
>

I highly recommend updating PostgreSQL to 9.6.22. Your system is missing almost 5 years of updates. This may or may not be related to the issue you observe, but still...

I suspect that you are observing the effects of this sleep [0]. Kyotaro Horiguchi wrote a draft patch to fix this [1], but I haven't put in enough effort to make it commit-ready yet.
I think you can mitigate the problem by increasing the size of the MultiXactOffset buffers here [2]. I hope that in the future this setting will be configurable; thread [1] is about just that.
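
As a rough sketch of the kind of change meant by [2]: this assumes the two constants at that location are NUM_MXACTOFFSET_BUFFERS and NUM_MXACTMEMBER_BUFFERS, with 9.6 defaults of 8 and 16. The value 32 below is only an illustrative guess, not a tested recommendation. Since these are compile-time constants, the server would have to be rebuilt and restarted for the change to take effect.

```c
/*
 * src/include/access/multixact.h (REL9_6_STABLE), sketch only.
 * Assumed stock values: NUM_MXACTOFFSET_BUFFERS 8, NUM_MXACTMEMBER_BUFFERS 16.
 */

/* Number of SLRU buffers to use for multixact */
#define NUM_MXACTOFFSET_BUFFERS		32	/* illustrative value; assumed default is 8 */
#define NUM_MXACTMEMBER_BUFFERS		16	/* left at the assumed default */
```

A larger offsets buffer pool keeps more MultiXactOffset pages cached, which may reduce how long the control lock is held for page reads; whether it helps in this particular case would need to be measured on a test replica.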

Thanks!

Best regards, Andrey Borodin.

[0] https://github.com/postgres/postgres/blob/REL9_6_STABLE/src/backend/access/transam/multixact.c#L1289-L1319
[1] https://www.postgresql.org/message-id/flat/20200515.090333.24867479329066911.horikyota.ntt%40gmail.com#855f8bb7205890579a363d2344b4484d
[2] https://github.com/postgres/postgres/blob/REL9_6_STABLE/src/include/access/multixact.h#L32-L33
