From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Magnus Hagander <magnus(at)hagander(dot)net> |
Cc: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Better client reporting for "immediate stop" shutdowns |
Date: | 2020-12-22 17:32:00 |
Message-ID: | 615956.1608658320@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Magnus Hagander <magnus(at)hagander(dot)net> writes:
> On Tue, Dec 22, 2020 at 2:29 AM Bharath Rupireddy
> <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>> If I'm correct, quickdie() doesn't access any shared memory because
>> one of the reason we can be in quickdie() is when the shared memory
>> itself is corrupted(the comment down below on why we don't call
>> roc_exit() or atexit() says), in such case, will GetQuitSignalReason()
>> have some problem in accessing the shared memory i.e. + return
>> PMSignalState->sigquit_reason;?
It couldn't really have any problem in physically accessing the field;
we never detach from the main shared memory block. The risk that needs
to be thought about is that shared memory contains garbage --- for
example, imagine that a failing process scribbled in the wrong part of
shared memory before crashing. So the hazard here is that there's a
small chance that sigquit_reason will contain the wrong value, which
would cause the patch to print a misleading message, or more likely
not print anything (since I didn't put a default case in that switch).
That seems fine to me. Also, because the sequence of events would be
(1) failing process scribbles and crashes, (2) postmaster updates
sigquit_reason, (3) other child processes examine sigquit_reason,
it's fairly likely that we'd get the right answer even if the field
got clobbered during (1).
There might be an argument for emitting the "unexpected SIGQUIT"
text if we find garbage in sigquit_reason. Any thoughts about that?
>> AFAIK, errmsg(terminating connection due to administrator command") is
>> emitted when there's no specific reason. But we know exactly why we
>> are terminating in this case, how about having an error message like
>> errmsg("terminating connection due to immediate shutdown request")));
>> ? There are other places where errmsg("terminating connection due to
>> XXXX reasons"); is used. We also log messages when an immediate
>> shutdown request is received errmsg("received immediate shutdown
>> request").
> +1. I definitely think having this message be different can be useful.
OK, will use "terminating connection due to immediate shutdown
request".
> See also the thread about tracking shutdown reasons (connection
> statistics) -- not the same thing, but the same concepts apply.
Hm. I wondered for a bit if that patch could make use of this one
to improve its results. For the specific case of SIGQUIT it seems
moot because we aren't going to let backends send any shutdown
statistics during an emergency stop. But maybe the idea could be
extended to let more-accurate termination reasons be provided in
some other cases.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Pavel Stehule | 2020-12-22 17:34:34 | Re: [HACKERS] [PATCH] Generic type subscripting |
Previous Message | Tom Lane | 2020-12-22 16:57:13 | Re: [HACKERS] [PATCH] Generic type subscripting |