From: | "MauMau" <maumau307(at)gmail(dot)com> |
---|---|
To: | <pgsql-hackers(at)postgresql(dot)org> |
Subject: | [bug fix] "pg_ctl stop" times out when it should respond quickly |
Date: | 2013-12-03 12:45:53 |
Message-ID: | DF2AB03E91D547319F29A21458EA868E@maumau |
Lists: | pgsql-hackers |
Hello,
I've encountered a small bug and fixed it. I guess it occurs on all major
releases; I saw it on 9.2 and 9.4devel. Please review the attached patch
and commit it.
[Problem]
If I mistakenly set listen_addresses to an invalid value, say '-1', and
start the database server, it fails to start as follows. In my environment
(RHEL6 for Intel64), it takes about 15 seconds before postgres prints the
messages. This is OK.
[maumau(at)myhost pgdata]$ pg_ctl -w start
waiting for server to start........................LOG: could not translate host name "-1", service "5450" to address: Temporary failure in name resolution
WARNING: could not create listen socket for "-1"
FATAL: could not create any TCP/IP sockets
stopped waiting
pg_ctl: could not start server
Examine the log output.
[maumau(at)myhost pgdata]$
When I start the server without -w and then try to stop it, "pg_ctl stop"
waits for 60 seconds, times out, and fails. This is the problem. I expected
"pg_ctl stop" to respond quickly with success or failure depending on the
timing.
[maumau(at)myhost pgdata]$ pg_ctl start
server starting
...(a few seconds later)
[maumau(at)myhost ~]$ pg_ctl stop
waiting for server to shut down............................................................... failed
pg_ctl: server does not shut down
HINT: The "-m fast" option immediately disconnects sessions rather than waiting for session-initiated disconnection.
[maumau(at)myhost ~]$
[Cause]
The problem occurs in the sequence below:
1. postmaster creates $PGDATA/postmaster.pid.
2. postmaster tries to resolve the value of listen_addresses to IP
addresses. This took about 15 seconds in my failure scenario.
3. During step 2, pg_ctl sends SIGTERM to postmaster.
4. postmaster terminates immediately without deleting
$PGDATA/postmaster.pid. This is because it hasn't set up its signal handlers
yet, so the default SIGTERM action kills the process before any cleanup runs.
5. "pg_ctl stop" waits in a loop until $PGDATA/postmaster.pid disappears.
But the file does not disappear and it times out.
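The sequence above can be reproduced outside PostgreSQL with a small C
sketch. This is not postmaster code. The function name
pidfile_survives_sigterm, the pipe handshake, and the sleep() that stands in
for the slow name resolution are all assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo, not PostgreSQL code.
 * Returns 1 if the pid file still exists after the child was killed by
 * SIGTERM, 0 if it is gone, -1 if the setup failed.
 */
int pidfile_survives_sigterm(const char *path)
{
    int   ready[2];
    pid_t child;
    char  byte;
    int   survived;

    if (pipe(ready) != 0)
        return -1;

    child = fork();
    if (child < 0)
    {
        close(ready[0]);
        close(ready[1]);
        return -1;
    }

    if (child == 0)
    {
        char buf[32];
        int  len, fd;

        close(ready[0]);
        /* Model "no handler installed yet": SIGTERM keeps its default action. */
        signal(SIGTERM, SIG_DFL);

        /* Step 1: create the pid file. */
        fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0600);
        if (fd < 0)
            _exit(2);
        len = snprintf(buf, sizeof(buf), "%ld\n", (long) getpid());
        if (write(fd, buf, (size_t) len) != len)
            _exit(2);
        close(fd);

        /* Tell the parent that the pid file now exists. */
        if (write(ready[1], "x", 1) != 1)
            _exit(2);

        /* Step 2: stand-in for the slow name resolution. */
        sleep(15);

        /* Cleanup of a normal exit; not reached when SIGTERM arrives first. */
        unlink(path);
        _exit(0);
    }

    close(ready[1]);
    if (read(ready[0], &byte, 1) != 1)
    {
        close(ready[0]);
        waitpid(child, NULL, 0);
        return -1;
    }
    close(ready[0]);

    kill(child, SIGTERM);        /* Step 3: the "pg_ctl stop" side. */
    waitpid(child, NULL, 0);     /* Step 4: the child is gone. */

    /* Step 5: the file a waiter would be polling for is still there. */
    survived = (access(path, F_OK) == 0);
    unlink(path);                /* tidy up after the demo */
    return survived;
}
```

On a POSIX system I would expect a call such as
pidfile_survives_sigterm("/tmp/demo.pid") to return 1, which is the state
"pg_ctl stop" ends up waiting on.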
[Fix]
Make pg_ctl check whether postmaster is still alive while it waits, because
postmaster might have crashed unexpectedly.
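As a rough illustration of the idea (not the attached patch itself), a wait
loop can test both conditions on each iteration. The names process_is_alive,
wait_for_stop, and the wait_result values are hypothetical, and the
one-second polling interval is an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: kill(pid, 0) delivers no signal, it only performs
 * the existence and permission checks.
 */
int process_is_alive(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return 1;
    /* EPERM means the process exists but belongs to another user. */
    return errno == EPERM;
}

enum wait_result { STOPPED_CLEANLY, DIED_LEAVING_PIDFILE, TIMED_OUT };

/*
 * Hypothetical wait loop: poll once per second until the pid file is gone,
 * the process is gone, or timeout_secs iterations have passed.
 */
enum wait_result wait_for_stop(const char *pidfile, pid_t pid, int timeout_secs)
{
    int i;

    for (i = 0; i < timeout_secs; i++)
    {
        /* Normal shutdown: the server removed its pid file. */
        if (access(pidfile, F_OK) != 0)
            return STOPPED_CLEANLY;

        /* Abnormal exit: pid file is stale, so stop waiting right away. */
        if (!process_is_alive(pid))
            return DIED_LEAVING_PIDFILE;

        sleep(1);
    }
    return TIMED_OUT;
}
```

With a check like this the waiter can report a result as soon as the process
disappears instead of running out the full timeout. Note that kill(pid, 0)
also succeeds for an unreaped zombie; that should not matter here as long as
the waiting process is not the parent of the process it is watching.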
Regards
MauMau
Attachment | Content-Type | Size |
---|---|---|
pg_stop_fail.patch | application/octet-stream | 1.6 KB |