From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Rajeev rastogi <rajeev(dot)rastogi(at)huawei(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Dangling Client Backend Process |
Date: | 2015-10-20 14:58:24 |
Message-ID: | CA+TgmoYUGMn4SQcsA=zScg3kqU1EMiPiRiakgrJd1+eWwMsxKQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Oct 20, 2015 at 12:48 AM, Rajeev rastogi
<rajeev(dot)rastogi(at)huawei(dot)com> wrote:
> On 19 October 2015 21:37, Robert Haas [mailto:robertmhaas(at)gmail(dot)com] Wrote:
>
>>On Sat, Oct 17, 2015 at 4:52 PM, Alvaro Herrera
>><alvherre(at)2ndquadrant(dot)com> wrote:
>>> Andres Freund wrote:
>>>> On 2015-10-14 17:33:01 +0900, Kyotaro HORIGUCHI wrote:
>>>> > If I recall correctly, he was concerned about killing backends
>>>> > running transactions that could be saved. I have some sympathy
>>>> > with that opinion.
>>>>
>>>> I still don't. Leaving backends alive after postmaster has died
>>>> prevents the auto-restart mechanism from working from there on.
>>>> Which means that we'll potentially continue happily after another
>>>> backend has PANICed and potentially corrupted shared memory. Which
>>>> isn't all that unlikely if postmaster isn't around anymore.
>>>
>>> I agree. When postmaster terminates without waiting for all backends
>>> to go away, things are going horribly wrong -- either a DBA has done
>>> something stupid, or the system is misbehaving. As Andres says, if
>>> another backend dies at that point, things are even worse -- the dying
>>> backend could have been holding a critical lwlock, for instance, or it
>>> could have corrupted shared buffers on its way out. It is not
>>> sensible to leave the rest of the backends in the system still trying
>>> to run just because there is no one there to kill them.
>>
>>Yep. +1 for changing this.
>
> It seems many people are in favor of this change.
> I have made changes so that backends exit on postmaster death (once they have finished their work and are waiting for a new command).
> The changes follow the approach explained in my earlier mail, i.e.
> 1. WaitLatchOrSocket, called from the secure_read and secure_write functions, waits on an additional event, WL_POSTMASTER_DEATH.
> 2. It is possible for a command to be read without waiting on the latch. This case is handled by checking the postmaster status after the command is read (i.e. after ReadCommand).
>
> Attached is the patch.
I don't think that proc_exit(1) is the right way to exit here. It's
not very friendly to exit without at least attempting to give the
client a clue about what has gone wrong. I suggest something like
this:
    ereport(FATAL,
            (errcode(ERRCODE_ADMIN_SHUTDOWN),
             errmsg("terminating connection due to postmaster shutdown")));
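To illustrate the overall shape outside the PostgreSQL tree, here is a minimal standalone sketch, not the patch itself. It assumes a plain POSIX poll() in place of WaitLatchOrSocket, a pipe whose write end is held only by the parent in place of the postmaster-death event, and a plain-text line in place of a protocol-level error message. The names wait_client_or_parent_death, notify_client_of_shutdown and the WAIT_* codes are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; these are not PostgreSQL's WL_* flags. */
enum wait_result { WAIT_CLIENT_READY, WAIT_PARENT_DIED, WAIT_ERROR };

/*
 * Block until either the client fd is readable or the parent-death pipe
 * signals EOF. death_fd is assumed to be the read end of a pipe whose write
 * end is held open only by the parent, so it hangs up when the parent exits.
 */
static enum wait_result
wait_client_or_parent_death(int client_fd, int death_fd)
{
    struct pollfd fds[2];

    assert(client_fd >= 0 && death_fd >= 0);

    fds[0].fd = client_fd;
    fds[0].events = POLLIN;
    fds[1].fd = death_fd;
    fds[1].events = POLLIN;

    for (;;)
    {
        fds[0].revents = 0;
        fds[1].revents = 0;

        if (poll(fds, 2, -1) < 0)
        {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return WAIT_ERROR;
        }

        /* Design choice in this sketch: parent death wins over pending client input. */
        if (fds[1].revents & (POLLIN | POLLHUP | POLLERR))
            return WAIT_PARENT_DIED;
        if (fds[0].revents & (POLLIN | POLLHUP | POLLERR))
            return WAIT_CLIENT_READY;
    }
}

/*
 * Best-effort attempt to tell the client why the connection is going away.
 * A real server would send a protocol-level error; this writes a text line.
 * Returns 0 if the whole message was written, -1 otherwise.
 */
static int
notify_client_of_shutdown(int client_fd)
{
    const char *msg = "FATAL: terminating connection due to postmaster shutdown\n";
    size_t      len = strlen(msg);
    ssize_t     written = write(client_fd, msg, len);

    return (written == (ssize_t) len) ? 0 : -1;
}
```

A caller would loop on wait_client_or_parent_death(), read a command on WAIT_CLIENT_READY, and on WAIT_PARENT_DIED call notify_client_of_shutdown() before exiting, so the client gets at least a hint about what happened.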
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company