Re: Mutex error 22 - Postgres version 14

From: sireesha <sireesha(dot)padmini(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Re: Mutex error 22 - Postgres version 14
Date: 2023-02-09 21:35:51
Message-ID: CAAM4KK3AJT0w0LW3S8sRqYDjV4epoVbuGP4cEm8VAg-OWNFwtA@mail.gmail.com
Lists: pgsql-admin

Hi Tom,

Thanks for the update. Your earlier reply went to my spam folder, so I missed
your inputs.

Here is the detailed information about the error.
There is an additional error, "Error WriteLocking RWLock!35", alongside the
mutex 22 error in the logfile.
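In case it helps: if the "22" is a raw errno value, as Peter wondered in the
quoted discussion below (one of the codes pthread_mutex_lock() can return), a
tiny standalone check shows what it maps to on this platform. This is only an
illustrative sketch I put together, not code from any of the components
involved:

/* Illustrative sketch only: print what error code 22 means on this box.
 * On Linux, 22 is EINVAL ("Invalid argument"), one of the documented
 * failure codes of pthread_mutex_lock(). */
#include <errno.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    int code = 22;                      /* the number from the log message */
    printf("%d = %s\n", code, strerror(code));
    printf("EINVAL = %d\n", EINVAL);    /* confirm it matches EINVAL here */
    return 0;
}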

A description of what you are trying to achieve and what results you expect.
We encountered the mutex 22 error, along with defunct Postgres processes, on a
PostgreSQL 14 database. It's an active/standby setup with repmgr and pgbouncer.

The database went into a hung state with defunct Postgres processes, and we had
to restart the server to make the database operational again.
I have pasted the error messages found in postgresql.log below.

The EXACT PostgreSQL version you are running
PostgreSQL 14.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.4.1
20200928 (Red Hat 8.4.1-1), 64-bit

How you installed PostgreSQL

https://www.postgresql.org/

Changes made to the settings in the postgresql.conf file: see Server
Configuration for a quick way to list them all.

max_wal_size = 1GB
min_wal_size = 80MB
shared_buffers = 10GB
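For reference, these were pulled with a small check of my own, using libpq and
a query along the lines of the one the Server Configuration wiki page
describes; it assumes the usual PG* environment variables for the connection
and is only a sketch:

/* Sketch: list settings whose value differs from the default.
 * Compile with something like: gcc list_settings.c -lpq
 * (header/library paths may vary per platform). */
#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn *conn = PQconnectdb("");   /* uses PGHOST/PGPORT/PGUSER/PGDATABASE */
    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        PQfinish(conn);
        return 1;
    }

    PGresult *res = PQexec(conn,
        "SELECT name, current_setting(name), source "
        "FROM pg_settings WHERE source NOT IN ('default', 'override')");
    if (PQresultStatus(res) != PGRES_TUPLES_OK)
    {
        fprintf(stderr, "query failed: %s", PQerrorMessage(conn));
        PQclear(res);
        PQfinish(conn);
        return 1;
    }

    for (int i = 0; i < PQntuples(res); i++)
        printf("%s = %s (%s)\n", PQgetvalue(res, i, 0),
               PQgetvalue(res, i, 1), PQgetvalue(res, i, 2));

    PQclear(res);
    PQfinish(conn);
    return 0;
}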
Operating system and version

Linux 4.18.0-305.12.1.el8_4.x86_64 x86_64 x86_64 x86_64 GNU/Linux

For questions about any kind of error:
What you were doing when the error happened / how to cause the error.
The database went into a hung state with the errors below in postgresql.log:
2023-01-24 10:29:45.399 PST [912001] LOG: PID 0 in cancel request did not
match any process
Error WriteLocking RWLock!35
2023-01-24 10:31:21.084 PST [9677] WARNING: worker took too long to start;
canceled
2023-01-24 10:32:21.143 PST [9677] WARNING: worker took too long to start;
canceled
2023-01-24 10:33:21.171 PST [9677] WARNING: worker took too long to start;
canceled
2023-01-24 10:34:21.305 PST [9677] WARNING: worker took too long to start;
canceled
2023-01-24 10:35:21.357 PST [9677] WARNING: worker took too long to start;
canceled
2023-01-24 10:36:21.417 PST [9677] WARNING: worker took too long to start;
canceled
2023-01-24 10:37:21.468 PST [9677] WARNING: worker took too long to start;
canceled
2023-01-24 10:38:21.532 PST [9677] WARNING: worker took too long to start;
canceled
We also noticed defunct Postgres processes. We had to restart the server to
make the database operational again.

What program you're using to connect to PostgreSQL
It's an active/passive standby setup with repmgr to maintain the cluster and
pgbouncer to connect to the Postgres database.
Version: pgbouncer 1.15.0
Is there anything remotely unusual in the PostgreSQL server logs?
Apart from the messages pasted above, no abnormal errors were noticed in the
server log.

Regards,
PS

On Wed, Feb 1, 2023 at 4:18 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Peter Geoghegan <pg(at)bowt(dot)ie> writes:
> > On Wed, Feb 1, 2023 at 3:02 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >>> 2023-01-24 02:35:45.833 PST [3424807] LOG: PID 0 in cancel request
> did not
> >>> match any process
> >>> *Error locking mutex 22*
>
> > I wonder if 22 might be EINVAL, which is one possible error code used
> > by pthread_mutex_lock().
>
> Maybe, but we still don't know what's reporting the error.
>
> I tried searching for "Error locking mutex" in Debian Code Search,
> and got several hits, but none of them match this exactly --- the
> ones that offer any additional info present it as a string not a
> number.
>
> regards, tom lane
>
