From: | Joseph Hammerman <joe(dot)hammerman(at)datadoghq(dot)com> |
---|---|
To: | pgsql-admin(at)lists(dot)postgresql(dot)org |
Subject: | Triaging pg_ctl shutdown hang |
Date: | 2021-12-29 15:16:44 |
Message-ID: | CAHs7QM_yx=KhjwHub7PyqvaosTpb9AQxXzHPFx7Pnu+0hvxLaw@mail.gmail.com |
Lists: | pgsql-admin |
Hi pgsql-admin list,
We recently had an incident precipitated by pg_ctl stop -m fast hanging on
PostgreSQL 9.6.22. Two processes would not exit: the postmaster and the
logger process. We had limited visibility into the underlying conditions,
since the server refuses new connections and terminates existing sessions
in fast shutdown mode. Even when we escalated the shutdown to immediate
mode, the processes did not exit.
I’m trying to put together a checklist of data for us to capture so that we
can determine the root cause of the hang if we encounter this issue again.
For example, running echo w > /proc/sysrq-trigger to dump the processes in
uninterruptible sleep, along with their kernel stack traces, to the kernel
log. Is it worth stracing the postmaster process and surviving children?
Does pg_controldata surface any useful data?
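One checklist item could be sketched roughly as below: a per-process snapshot of the kernel-side state from /proc, taken before anything is killed. This is only a sketch, assuming Linux with /proc mounted; capture_proc_state and the PROC_ROOT override are hypothetical names for illustration, not part of PostgreSQL, and reading /proc/PID/stack usually requires root.

```shell
#!/bin/sh
# Sketch: snapshot the kernel-side state of a (possibly hung) process.
# capture_proc_state is a hypothetical helper, not a PostgreSQL tool.
# PROC_ROOT defaults to /proc; overriding it is only meant for dry runs.
capture_proc_state() {
    pid=$1
    outdir=$2
    proc=${PROC_ROOT:-/proc}/$pid

    [ -d "$proc" ] || { echo "no such process: $pid" >&2; return 1; }
    mkdir -p "$outdir" || return 1

    # status: the State line shows D for uninterruptible sleep
    # wchan:  kernel symbol the process is currently sleeping in
    # stack:  kernel stack trace (skipped when it is not readable)
    for f in status wchan stack; do
        if [ -r "$proc/$f" ]; then
            cat "$proc/$f" > "$outdir/$pid.$f" 2>/dev/null
        fi
    done

    # Print the State line for quick triage.
    grep '^State:' "$outdir/$pid.status"
}
```

Usage would be something like capture_proc_state "$(head -1 "$PGDATA/postmaster.pid")" /tmp/pg_hang for the postmaster (the first line of postmaster.pid is its PID), repeated for the logger's PID.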
As a follow-up question, is there a way to obtain an administrative
backdoor, or to leave one open, during a hanging fast shutdown?
Thanks in advance for any clarity or guidance anyone on the list can
provide.
Joe Hammerman