Re: pgsql: Perform an immediate shutdown if the postmaster.pid file is remo

From: Thom Brown <thom(at)linux(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-committers <pgsql-committers(at)postgresql(dot)org>
Subject: Re: pgsql: Perform an immediate shutdown if the postmaster.pid file is remo
Date: 2015-10-09 14:56:40
Message-ID: CAA-aLv6U6jJjCONKSaYysx=kgrfBPDQNC8eKr2eiebuHQ6s3GA@mail.gmail.com
Lists: pgsql-committers

On 6 October 2015 at 22:16, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Perform an immediate shutdown if the postmaster.pid file is removed.
>
> The postmaster now checks every minute or so (worst case, at most two
> minutes) that postmaster.pid is still there and still contains its own PID.
> If not, it performs an immediate shutdown, as though it had received
> SIGQUIT.
>
> The original goal behind this change was to ensure that failed buildfarm
> runs would get fully cleaned up, even if the test scripts had left a
> postmaster running, which is not an infrequent occurrence. When the
> buildfarm script removes a test postmaster's $PGDATA directory, its next
> check on postmaster.pid will fail and cause it to exit. Previously, manual
> intervention was often needed to get rid of such orphaned postmasters,
> since they'd block new test postmasters from obtaining the expected socket
> address.
>
> However, by checking postmaster.pid and not something else, we can provide
> additional robustness: manual removal of postmaster.pid is a frequent DBA
> mistake, and now we can at least limit the damage that will ensue if a new
> postmaster is started while the old one is still alive.
>
> Back-patch to all supported branches, since we won't get the desired
> improvement in buildfarm reliability otherwise.
>
> Branch
> ------
> REL9_3_STABLE
>
> Details
> -------
> http://git.postgresql.org/pg/commitdiff/31bc563b9be306623c5e9a52816b432945fa6df9
>
> Modified Files
> --------------
> src/backend/postmaster/postmaster.c | 52 ++++++++++++++++++++------
> src/backend/utils/init/miscinit.c | 70 +++++++++++++++++++++++++++++++++++
> src/include/miscadmin.h | 1 +
> 3 files changed, 112 insertions(+), 11 deletions(-)
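
For readers following along, the check the commit message describes (confirm that postmaster.pid still exists and still names this process) can be sketched roughly as below. This is an illustrative Python sketch, not the C code from the commit. The function name `lock_file_still_valid` is hypothetical, and the handling of transient I/O errors and of an unparseable first line is an assumption. The only fact relied on is that the first line of postmaster.pid holds the postmaster's PID.

```python
def lock_file_still_valid(path, my_pid):
    """Return False only when the lock file is gone or names another PID.

    Hypothetical helper for illustration; not PostgreSQL's implementation.
    """
    try:
        with open(path, "r") as f:
            # The first line of postmaster.pid is the postmaster's PID.
            first_line = f.readline().strip()
    except FileNotFoundError:
        # The file was removed: the caller should shut down immediately.
        return False
    except OSError:
        # Assumption: other I/O errors are treated as transient, so we
        # do not shut down a healthy server because of them.
        return True

    try:
        file_pid = int(first_line)
    except ValueError:
        # Assumption: an unparseable PID counts as an invalid lock file.
        return False

    return file_pid == my_pid
```

A caller would run this on a timer (the commit message says roughly once a minute) and begin an immediate shutdown when it returns False.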

The log contains misleading output after the pid file is removed:

2015-10-09 15:39:32 BST [31507]: [4-1] user=,db=,client= LOG: could not open file "postmaster.pid": No such file or directory
2015-10-09 15:39:32 BST [31507]: [5-1] user=,db=,client= LOG: performing immediate shutdown because data directory lock file is invalid
2015-10-09 15:39:32 BST [31507]: [6-1] user=,db=,client= LOG: received immediate shutdown request
2015-10-09 15:39:32 BST [31556]: [1-1] user=,db=,client= WARNING: terminating connection because of crash of another server process
2015-10-09 15:39:32 BST [31556]: [2-1] user=,db=,client= DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2015-10-09 15:39:32 BST [31556]: [3-1] user=,db=,client= HINT: In a moment you should be able to reconnect to the database and repeat your command.

Is this anything we need to worry about?

--
Thom
