From: | Thom Brown <thom(at)linux(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-committers <pgsql-committers(at)postgresql(dot)org> |
Subject: | Re: pgsql: Perform an immediate shutdown if the postmaster.pid file is remo |
Date: | 2015-10-09 14:56:40 |
Message-ID: | CAA-aLv6U6jJjCONKSaYysx=kgrfBPDQNC8eKr2eiebuHQ6s3GA@mail.gmail.com |
Lists: | pgsql-committers |
On 6 October 2015 at 22:16, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Perform an immediate shutdown if the postmaster.pid file is removed.
>
> The postmaster now checks every minute or so (worst case, at most two
> minutes) that postmaster.pid is still there and still contains its own PID.
> If not, it performs an immediate shutdown, as though it had received
> SIGQUIT.
>
> The original goal behind this change was to ensure that failed buildfarm
> runs would get fully cleaned up, even if the test scripts had left a
> postmaster running, which is not an infrequent occurrence. When the
> buildfarm script removes a test postmaster's $PGDATA directory, its next
> check on postmaster.pid will fail and cause it to exit. Previously, manual
> intervention was often needed to get rid of such orphaned postmasters,
> since they'd block new test postmasters from obtaining the expected socket
> address.
>
> However, by checking postmaster.pid and not something else, we can provide
> additional robustness: manual removal of postmaster.pid is a frequent DBA
> mistake, and now we can at least limit the damage that will ensue if a new
> postmaster is started while the old one is still alive.
>
> Back-patch to all supported branches, since we won't get the desired
> improvement in buildfarm reliability otherwise.
>
> Branch
> ------
> REL9_3_STABLE
>
> Details
> -------
> http://git.postgresql.org/pg/commitdiff/31bc563b9be306623c5e9a52816b432945fa6df9
>
> Modified Files
> --------------
> src/backend/postmaster/postmaster.c | 52 ++++++++++++++++++++------
> src/backend/utils/init/miscinit.c | 70 +++++++++++++++++++++++++++++++++++
> src/include/miscadmin.h | 1 +
> 3 files changed, 112 insertions(+), 11 deletions(-)
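For context, the check the commit message describes (is postmaster.pid still there, and does it still contain our own PID?) can be sketched roughly as below. This is only an illustration and not the code from the commit: the function name lock_file_is_valid is hypothetical, and it assumes nothing beyond the PID being on the first line of the file. The real implementation may well treat some open failures differently.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the function added by the commit).
 * Returns 1 if the lock file at "path" exists and its first line
 * holds expected_pid; returns 0 otherwise.
 */
int
lock_file_is_valid(const char *path, pid_t expected_pid)
{
    FILE   *fp;
    char    line[64];
    char   *end;
    long    file_pid;

    fp = fopen(path, "r");
    if (fp == NULL)
        return 0;               /* file was removed or is unreadable */

    if (fgets(line, sizeof(line), fp) == NULL)
    {
        fclose(fp);
        return 0;               /* file is empty */
    }
    fclose(fp);

    /* The first line is assumed to be the owning process's PID. */
    file_pid = strtol(line, &end, 10);
    if (end == line)
        return 0;               /* first line does not start with a number */

    return file_pid == (long) expected_pid;
}
```

A caller would run this periodically with its own getpid() and begin an immediate shutdown when it returns 0.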
The log contains misleading output after the pid file is removed:
2015-10-09 15:39:32 BST [31507]: [4-1] user=,db=,client= LOG: could not open file "postmaster.pid": No such file or directory
2015-10-09 15:39:32 BST [31507]: [5-1] user=,db=,client= LOG: performing immediate shutdown because data directory lock file is invalid
2015-10-09 15:39:32 BST [31507]: [6-1] user=,db=,client= LOG: received immediate shutdown request
2015-10-09 15:39:32 BST [31556]: [1-1] user=,db=,client= WARNING: terminating connection because of crash of another server process
2015-10-09 15:39:32 BST [31556]: [2-1] user=,db=,client= DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2015-10-09 15:39:32 BST [31556]: [3-1] user=,db=,client= HINT: In a moment you should be able to reconnect to the database and repeat your command.
Is this anything we need to worry about?
--
Thom