PGCTLTIMEOUT in pg_regress, or skink versus the clock

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: PGCTLTIMEOUT in pg_regress, or skink versus the clock
Date: 2016-04-20 22:38:56
Message-ID: 13969.1461191936@sss.pgh.pa.us
Lists: pgsql-hackers

Buildfarm member skink has failed three times recently like this:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2016-04-15%2001%3A20%3A44

the relevant part of that being

pg_regress: postmaster did not respond within 60 seconds
Examine /home/andres/build/buildfarm/REL9_5_STABLE/pgsql.build/src/interfaces/ecpg/test/log/postmaster.log for the reason

where the postmaster log shows nothing particularly surprising,
it's just not reached the ready state yet:

LOG: database system was shut down at 2016-04-15 05:11:18 UTC
FATAL: the database system is starting up
LOG: MultiXact member wraparound protections are now enabled

Now, there are some reasons to suspect that there might be more here than
meets the eye; for one thing, it stretches credulity a bit to believe that
it's only random chance that all three failures are in the 9.5 branch and
all are in the ecpg regression test step. I'm also curious as to why we
see only one "FATAL: the database system is starting up" connection
rejection and not sixty. However, by far the simplest explanation for
this failure is just that the postmaster took more than 60 seconds to
start up; and seeing that skink is running Valgrind and is on an AWS
instance, that's not that much of a stretch of credulity either.

Hence, I am thinking that we missed a bet in commit 2ffa86962077c588
et al, and that pg_regress's hard-wired 60-second start timeout ought to
be overridable from an environment variable just as pg_ctl's timeout is.
It might as well be the same environment variable, so I propose the
attached patch. Note that since the shutdown end of things in pg_regress
uses "pg_ctl stop", that end of it already responds to PGCTLTIMEOUT.
(I could not find any user-facing documentation for pg_regress, so there's
no apparent need for a docs update.)
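
For illustration only, the idea can be sketched as follows. This is not the attached patch; the helper name startup_wait_seconds and the choice to fall back to 60 on an unparsable or non-positive value are assumptions made for the sketch. Only the PGCTLTIMEOUT variable and the 60-second default come from the discussion above.

```c
#include <stdlib.h>

/* The hard-wired default discussed above. */
#define DEFAULT_WAIT_SECONDS 60

/*
 * Hypothetical helper (name is an assumption, not pg_regress code):
 * return the number of seconds to wait for the postmaster to start.
 * PGCTLTIMEOUT overrides the default when it parses to a positive
 * integer; anything else falls back to the default (an assumed policy).
 */
int
startup_wait_seconds(void)
{
	const char *env_wait = getenv("PGCTLTIMEOUT");
	int			seconds;

	if (env_wait == NULL)
		return DEFAULT_WAIT_SECONDS;

	/* atoi() yields 0 for non-numeric input, which we treat as invalid */
	seconds = atoi(env_wait);
	if (seconds <= 0)
		return DEFAULT_WAIT_SECONDS;

	return seconds;
}
```

With something along these lines, a slow animal such as skink could export a larger PGCTLTIMEOUT once in its environment and have both the startup wait and the "pg_ctl stop" step honor it.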

Any objections?

regards, tom lane

Attachment Content-Type Size
honor-PGCTLTIMEOUT-in-pg_regress.patch text/x-diff 2.1 KB
