Re: Intermittent "make check" failures on hyena

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Intermittent "make check" failures on hyena
Date: 2006-08-06 16:13:51
Message-ID: 44D6153F.5000607@dunslane.net
Lists: pgsql-hackers

Tom Lane wrote:

>I'm noticing that buildfarm member hyena sometimes fails the parallel
>regression tests, for instance
>http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=hyena&dt=2006-07-19%2009:20:00
>
>The symptom is always that one of the tests fails entirely because
>psql couldn't connect:
>
>psql: could not connect to server: Connection refused
> Is the server running locally and accepting
> connections on Unix domain socket "/tmp/.s.PGSQL.55678"?
>
>It's a different test failing in each occurrence. Sometimes there are
>ensuing failures in subsequent tests that expect the side-effects
>of the one that failed, but there's clearly a common cause here.
>
>AFAIK it is not possible for Postgres itself to cause a "connection
>refused" failure --- that's a kernel-level errno. So what's going on
>here? The only idea that comes to mind is that this version of Solaris
>has some very low limit on SOMAXCONN, and when the timing is just so
>it's bouncing connection requests because several of them arrive faster
>than the postmaster can fork off children. Googling suggests that there
>are versions of Solaris with SOMAXCONN as low as 5 :-( ... but other
>pages say that the default is higher, so this theory might be wrong.
>
>What is SOMAXCONN set to on that box, anyway? If it's tiny, I suggest
>you increase SOMAXCONN to something saner, or if you can't, run "make
>check" with MAX_CONNECTIONS=5 added to the make command. Does the
>buildfarm script have provisions for site-local settings of this
>parameter?
>

Yes, it sure does.

This is the box that Sun donated, btw.

I get: ndd /dev/tcp tcp_conn_req_max_q => 128

Is that the Solaris equivalent of SOMAXCONN? That's low, maybe, but not
impossibly low.
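
For anyone following along who hasn't met the mechanism Tom describes: the backlog is the queue depth a server requests in listen(), and the kernel may cap it at its own limit. Connections that arrive before the server calls accept() sit in that queue. Below is a minimal Python sketch of that queueing, not PostgreSQL's code. The names open_listener and queue_then_accept and the socket path are made up for illustration, and what happens once the queue is actually full differs between operating systems, so the sketch stays under the limit.

```python
import os
import socket
import tempfile


def open_listener(path, backlog):
    """Bind a Unix-domain stream socket and request an accept queue of `backlog`."""
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(path)
    srv.listen(backlog)  # the kernel may silently cap this at its own limit
    return srv


def queue_then_accept(n_clients, backlog):
    """Connect n_clients before the server accepts any, then drain the queue."""
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "demo.sock")  # hypothetical socket path
    srv = open_listener(path, backlog)
    clients = []
    accepted = 0
    try:
        for _ in range(n_clients):
            c = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
            clients.append(c)
            c.connect(path)  # returns once queued; nothing has accepted it yet
        for _ in range(n_clients):
            conn, _addr = srv.accept()  # take one pending connection off the queue
            conn.close()
            accepted += 1
    finally:
        for c in clients:
            c.close()
        srv.close()
        os.unlink(path)
        os.rmdir(tmpdir)
    return accepted


if __name__ == "__main__":
    # SOMAXCONN here is the constant Python was built with, not a live kernel setting.
    print("SOMAXCONN as seen by Python:", socket.SOMAXCONN)
    print("accepted:", queue_then_accept(n_clients=3, backlog=8))
```

With three clients and a requested backlog of 8, all three connections queue up and are accepted afterwards. The failure mode Tom suspects would only show up when more requests arrive than the queue can hold before the server gets around to accepting them.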

I don't have root on the box, though. For now I have set MAX_CONNECTIONS
to 8, to provide a modest limit on parallelism. I will see if I can
coordinate with Robert and Josh to increase the OS limits.

Thanks for the diagnosis.

cheers

andrew
