Re: Possible fix for occasional failures on castoroides etc

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Andres Freund <andres(at)2ndquadrant(dot)com>
Cc:	Dave Page <dpage(at)pgadmin(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Possible fix for occasional failures on castoroides etc
Date:	2014-05-03 18:59:28
Message-ID:	31133.1399143568@sss.pgh.pa.us
Lists:	pgsql-hackers

I wrote:
> Unfortunately, it seems the Solaris implementors didn't read Stevens,
> because it looks to me like they *do* return ECONNREFUSED on accept queue
> overflow. Still, it's hard to see how that would be the issue if we're
> still seeing this failure with only five clients.

Also, after further inspection of the source code, it appears to me that
the kernel's limit on accept queue length is hard-wired at 4096 in
Solaris. So there's basically no way that we're hitting that limit in the
regression tests, and the MAX_CONNECTIONS configuration is irrelevant.

We seem to be left with the race condition theory. In that connection,
this comment in /usr/src/uts/common/io/tl.c is interesting:

* The T_CONN_CON is generated when processing the T_CONN_REQ i.e. before
* a T_CONN_RES is received from the acceptor. This means that a socket
* connect will complete before the peer has called accept.

I'm not sure that explains anything of value, but it's probably unlike any
other implementation, which makes it perhaps relevant. It implies that
the failure is totally unrelated to any server-side behavior; so if it's
possible for us to work around it at all, we'd have to do so client-side.
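
For illustration only, here is a minimal sketch of what a client-side workaround could look like in general terms: retry the connect a few times when it fails with ECONNREFUSED. This is not what libpq or the regression test driver does, and nothing in this thread establishes that a retry would actually cure the failure. The function name connect_with_retry, the attempt count, and the delay are all hypothetical, and the sketch assumes a Unix-domain stream socket.

```python
import errno
import socket
import time


def connect_with_retry(path, attempts=5, delay=0.05):
    """Connect to a Unix-domain socket, retrying only on ECONNREFUSED.

    Hypothetical helper: the name, attempt count, and delay are
    illustrative assumptions, not taken from PostgreSQL.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            sock.connect(path)
            return sock  # caller owns the connected socket
        except OSError as exc:
            sock.close()
            if exc.errno != errno.ECONNREFUSED:
                raise  # any other error is reported immediately
            last_error = exc
            # Back off a little longer after each refused attempt.
            time.sleep(delay * (attempt + 1))
    # Every attempt was refused; surface the last error to the caller.
    raise last_error
```

The obvious cost of such a loop is that a server that is genuinely down takes longer to report as down, so the attempt count would need to stay small.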

regards, tom lane

In response to

Re: Possible fix for occasional failures on castoroides etc at 2014-05-03 17:25:32 from Tom Lane
