From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Justin Pryzby <pryzby(at)telsasoft(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se> |
Subject: | Re: Adding CI to our tree |
Date: | 2022-01-19 20:05:44 |
Message-ID: | 647439.1642622744@sss.pgh.pa.us |
Lists: | pgsql-hackers |
I wrote:
> This test attempt revealed another problem too: the standby never
> shut down, and thus the calling "make" never quit, until I intervened
> manually. I'm not sure why. I see that Cluster::promote uses
> system_or_bail() to run "pg_ctl promote" ... could it be that
> BAIL_OUT causes the normal script END hooks to not get run?
> But it seems like we'd have noticed that long ago.

I failed to reproduce any failure in the promote step, and I now
think I was mistaken and it happened during the standby's initial
start. I can reproduce that very easily by setting PGCTLTIMEOUT
to a second or two; with fsync enabled, it takes the standby more
than that to reach a consistent state. And the cause of that
is obvious: Cluster::start thinks that if "pg_ctl start" failed,
there couldn't possibly be a postmaster running. So it doesn't
bother to update self->_pid, and then the END hook thinks there
is nothing to do.
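
To make the failure mode concrete, here is a toy model in Python. It is not the Perl in the test harness and not the attached patch: `ToyCluster`, `slow_launcher`, `check_after_failure` and `demo` are hypothetical names invented for illustration. The only detail borrowed from PostgreSQL is that the first line of `postmaster.pid` holds the postmaster's PID. The sketch shows how a start step that reports failure, but has in fact launched a server, leaves the recorded PID empty, so the cleanup step concludes there is nothing to stop.

```python
import os
import tempfile


class ToyCluster:
    """Toy stand-in for a test-harness node (hypothetical, for illustration)."""

    def __init__(self, data_dir):
        self.data_dir = data_dir
        self.pid = None  # PID the harness believes is running

    def _pid_file(self):
        return os.path.join(self.data_dir, "postmaster.pid")

    def _refresh_pid(self):
        # Trust the pid file, not the launcher's exit status.
        try:
            with open(self._pid_file()) as f:
                self.pid = int(f.readline().strip())
        except FileNotFoundError:
            self.pid = None

    def start(self, launcher, check_after_failure):
        ok = launcher(self)
        # Buggy behaviour: only look for a server when the launcher succeeded.
        # Fixed behaviour: look for one even after a reported failure.
        if ok or check_after_failure:
            self._refresh_pid()
        return ok

    def teardown(self):
        # Models the END hook: it acts only when a PID was recorded.
        if self.pid is None:
            return "nothing to do"
        os.remove(self._pid_file())  # stands in for stopping the server
        stopped, self.pid = self.pid, None
        return f"stopped {stopped}"


def slow_launcher(node):
    # Simulates a start command that times out after the server was launched:
    # the pid file exists, yet the command reports failure.
    with open(node._pid_file(), "w") as f:
        f.write("12345\n")
    return False


def demo(check_after_failure):
    with tempfile.TemporaryDirectory() as data_dir:
        node = ToyCluster(data_dir)
        node.start(slow_launcher, check_after_failure)
        return node.teardown()
```

Calling `demo(False)` models the behaviour described above (the cleanup step does nothing and the simulated server is left behind), while `demo(True)` models checking for a postmaster even after the start command fails.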
Now, leaving an idle postmaster hanging around isn't a mortal sin,
since it'll go away by itself shortly after the next cycle of
testing does an "rm -rf" on its data directory. But it's ugly,
and conceivably it could cause problems for later testing on
machines with limited shmem or semaphore space.
The attached simple fix gets rid of this problem. Any objections?

regards, tom lane
Attachment | Content-Type | Size |
---|---|---|
check-for-postmaster-even-after-pg_ctl-failure.patch | text/x-diff | 1.5 KB |