From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
Cc: | Peter Eisentraut <peter_e(at)gmx(dot)net>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: The real reason why TAP testing isn't ready for prime time |
Date: | 2015-06-19 15:07:42 |
Message-ID: | 5184.1434726462@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Michael Paquier <michael(dot)paquier(at)gmail(dot)com> writes:
> Now if we look at RewindTest.pm, there is the following code:
>     if ($test_master_datadir)
>     {
>         system
>           "pg_ctl -D $test_master_datadir -s -m immediate stop 2> /dev/null";
>     }
>     if ($test_standby_datadir)
>     {
>         system
>           "pg_ctl -D $test_standby_datadir -s -m immediate stop 2> /dev/null";
>     }
> And I think that the problem is triggered because we are missing a -w
> switch here, meaning that we do not wait until the confirmation that
> the server has stopped, and visibly if stop is slow enough the next
> server to use cannot start because the port is already taken by the
> server currently stopping.
After I woke up a bit more, I remembered that -w is already the default
for "pg_ctl stop", so your diagnosis here is incorrect.
I suspect that the real problem is the arbitrary decision to use -m
immediate. The postmaster would ordinarily wait for its children to
die, but on a slow machine we could perhaps reach the end of its
5-second timeout, whereupon the postmaster would SIGKILL its children
*and exit immediately*. I'm not sure how instantaneous SIGKILL is,
but it seems possible that we could end up trying to start the new
postmaster before all the children of the old one are dead. If the
shmem interlock is working properly that ought to fail.
I wonder whether it's such a good idea for the postmaster to give
up waiting before all children are gone (postmaster.c:1722 in HEAD).
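To make the alternative concrete, here is a minimal sketch in C of a parent that sends SIGKILL and then keeps waiting until every child has actually been reaped, rather than exiting right after signalling. It is not the postmaster's code: the helper names kill_and_reap_children and spawn_sleeping_children are hypothetical, and the sleeping children merely stand in for backends.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from postmaster.c): SIGKILL every child in
 * pids[], then block until each one has been reaped.  Returns the number
 * of children confirmed to have died from SIGKILL.
 */
static int
kill_and_reap_children(const pid_t *pids, int n)
{
	int			reaped = 0;

	/* kill() only queues the signal; the child may still be alive on return */
	for (int i = 0; i < n; i++)
		kill(pids[i], SIGKILL);

	/* Do not return (and so do not exit) until every child is really gone */
	for (int i = 0; i < n; i++)
	{
		int			status;
		pid_t		r;

		do
		{
			r = waitpid(pids[i], &status, 0);
		} while (r < 0 && errno == EINTR);

		if (r == pids[i] && WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL)
			reaped++;
	}
	return reaped;
}

/*
 * Hypothetical helper: fork n children that only sleep, standing in for
 * backends.  Returns the number of children actually started.
 */
static int
spawn_sleeping_children(pid_t *pids, int n)
{
	for (int i = 0; i < n; i++)
	{
		pid_t		pid = fork();

		if (pid < 0)
			return i;
		if (pid == 0)
		{
			for (;;)
				pause();		/* child: idle until killed */
		}
		pids[i] = pid;
	}
	return n;
}
```

A caller would spawn the children, call kill_and_reap_children(), and exit only once it returns. The trade-off is that the parent can block indefinitely if a child is stuck in uninterruptible kernel sleep, which is presumably why a timeout exists in the first place.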
regards, tom lane