Re: Explanation for intermittent buildfarm pg_upgradecheck failures

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Explanation for intermittent buildfarm pg_upgradecheck failures
Date:	2015-08-02 16:45:19
Message-ID:	3633.1438533919@sss.pgh.pa.us
Lists:	pgsql-hackers

I wrote:
> unlink("/tmp/.s.PGSQL.5432") = 0
> unlink("postmaster.pid") = 0
> unlink("/tmp/.s.PGSQL.5432.lock") = 0
> exit_group(0) = ?
> +++ exited with 0 +++

> I haven't looked to find out why the unlinks happen in this order, but on
> a heavily loaded machine, it's certainly possible that the process would
> lose the CPU after unlink("postmaster.pid"), and then a new postmaster
> could get far enough to see the socket lock file still there. So that
> would account for low-probability failures in the pg_upgradecheck test,
> which is exactly what we've been seeing.

Further experimentation says that 9.0-9.2 do this in the expected order;
so somebody broke it during 9.3.
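
The race described above can be reproduced in miniature. The following is only a sketch, not PostgreSQL code: it assumes that the "expected order" means postmaster.pid is removed last, it places all three files in a single temporary directory (the real socket files live under /tmp and postmaster.pid in the data directory), and the names OBSERVED_ORDER, SAFE_ORDER and race_windows are hypothetical. After each unlink it checks whether a new postmaster could start (postmaster.pid gone) while the socket lock file is still present.

```python
import os
import tempfile

# Order seen in the strace output quoted above.
OBSERVED_ORDER = [".s.PGSQL.5432", "postmaster.pid", ".s.PGSQL.5432.lock"]
# Assumed "expected" order: the data-directory lock file goes last.
SAFE_ORDER = [".s.PGSQL.5432", ".s.PGSQL.5432.lock", "postmaster.pid"]


def race_windows(unlink_order):
    """Return the files whose removal leaves the race window open.

    The window is open when postmaster.pid is already gone (so a new
    postmaster may start) but the socket lock file still exists.
    """
    windows = []
    with tempfile.TemporaryDirectory() as d:
        # Create empty stand-ins for the three files.
        for name in unlink_order:
            open(os.path.join(d, name), "w").close()
        # Remove them one at a time, checking the state after each step.
        for name in unlink_order:
            os.unlink(os.path.join(d, name))
            pid_gone = not os.path.exists(os.path.join(d, "postmaster.pid"))
            lock_left = os.path.exists(os.path.join(d, ".s.PGSQL.5432.lock"))
            if pid_gone and lock_left:
                windows.append(name)
    return windows


if __name__ == "__main__":
    print("observed order:", race_windows(OBSERVED_ORDER))
    print("safe order:    ", race_windows(SAFE_ORDER))
```

With the observed order, the window opens right after postmaster.pid is removed; with the assumed safe order it never opens, since the socket lock file is already gone by the time a new postmaster could start.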

The lack of a close() on the postmaster socket goes all the way back
though.

regards, tom lane

In response to

Explanation for intermittent buildfarm pg_upgradecheck failures at 2015-08-02 16:30:17 from Tom Lane

Responses

Re: Explanation for intermittent buildfarm pg_upgradecheck failures at 2015-08-02 18:57:27 from Tom Lane
