From: | Jason Tishler <Jason(dot)Tishler(at)dothill(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Hiroshi Inoue <Inoue(at)tpf(dot)co(dot)jp>, pgsql-ports(at)postgresql(dot)org |
Subject: | Re: Cygwin PostgreSQL Regression Test Problems (Revisited) |
Date: | 2001-04-02 17:19:17 |
Message-ID: | 20010402131917.C798@dothill.com |
Lists: | pgsql-ports |
Tom,
On Sun, Apr 01, 2001 at 01:57:35PM -0400, Tom Lane wrote:
> Jason Tishler <Jason(dot)Tishler(at)dothill(dot)com> writes:
> > I'm glad that you agree. Please post to the list when the change is in
> > CVS and I will test that this solves the Cygwin regression test (i.e.,
> > psql) hangs.
>
> Done as of yesterday; should be in this morning's snapshot.
Thanks.
> > Actually, the blocking connect() change for Cygwin is obviated by the
> > pqWait() fix. So, I am now no longer recommending making the blocking
> > connect() change for Cygwin. Unless, you do so for other Unixes too.
>
> I made both changes in the hope that the blocking connect change would
> suppress your problem with connection-refused failures. If it does not,
> then we may as well reverse out the fe-connect.c change. Let me know.
With both changes or only the fe-connect.c one, psql does not hang and
displays the following error message when the connection is refused:
psql: connectDBStart() -- connect() failed: Connection refused
Is the postmaster running locally
and accepting connections on Unix socket '/tmp/.s.PGSQL.65432'?
With only the fe-misc.c change, psql does not hang and displays the
following error message when the connection is refused:
psql: PQconnectPoll() -- connect() failed: error 10061
Is the postmaster running locally
and accepting connections on Unix socket '/tmp/.s.PGSQL.65432'?
In both cases there are no hangs; only the error messages differ.
Unfortunately, in the non-blocking case the error message is cryptic.
I tried tracking down error "10061" which comes from getsockopt(), but
I was unsuccessful. Is there any way to improve the readability of this
error message?
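
Note: 10061 is the numeric value of the Winsock constant WSAECONNREFUSED, which fits the "Connection refused" text seen in the blocking case. A minimal sketch of one way to make such a message readable is shown below. It assumes the raw number really is a Winsock code, and the helpers winsock_to_errno() and format_connect_error() are hypothetical names for illustration, not part of libpq.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libpq): translate a raw Winsock
 * error number into the closest POSIX errno value.  Unknown codes are
 * returned unchanged so the caller can still report the raw number.
 */
int
winsock_to_errno(int err)
{
    switch (err)
    {
        case 10035:             /* WSAEWOULDBLOCK */
            return EWOULDBLOCK;
        case 10054:             /* WSAECONNRESET */
            return ECONNRESET;
        case 10060:             /* WSAETIMEDOUT */
            return ETIMEDOUT;
        case 10061:             /* WSAECONNREFUSED */
            return ECONNREFUSED;
        default:
            return err;
    }
}

/*
 * Hypothetical helper: build a message using strerror() text instead
 * of the bare number.  The exact wording of the strerror() text is
 * platform dependent.
 */
void
format_connect_error(int err, char *buf, size_t buflen)
{
    snprintf(buf, buflen, "connect() failed: %s",
             strerror(winsock_to_errno(err)));
}
```

With a mapping like this, the value obtained from getsockopt() could be translated before the message is built, so both the blocking and non-blocking paths would report the same text.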
Also, the blocking connect change did *not* fix the connection refused
(spurious) regression test failures. So this change should probably be
backed out.
> > I'm wondering whether it makes sense to add a simple connection retry
> > policy as suggested above by Hiroshi?
>
> I do not think it is appropriate for libpq to do that.
When I made my suggestion above, I was concerned that maybe libpq was not
the right layer to be implementing connection policies and that possibly
psql was the better place.
> For one thing, where would you stop --- why exactly two tries?
This was another one of my concerns.
> > 2. Change the backlog parameter to listen() in src/backend/libpq/pqcomm.c
> > to a number that will "ensure" that the parallel_schedule version of the
> > regression test does not generate connection refused conditions. Note
> > that I'm not even sure this will really work on all (or any) platforms.
>
> We already use SOMAXCONN which is supposed to be defined by the system
> as the maximum allowed queue depth. If Cygwin fails to define it, or
> defines it as something less than it should be, then we might consider
> installing a Cygwin-specific hack to redefine SOMAXCONN.
Cygwin defines SOMAXCONN to be 5. However, winsock.h defines it to be 5
while winsock2.h defines it to be 0x7fffffff. So, I'm not sure what
the real Cygwin (i.e., Windows) maximum is.
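
Since the value differs between headers, a quick way to see which one a given build actually picks up is to print the macro from the same header the program includes. The following is only a sketch, assuming a POSIX-style <sys/socket.h>; the function names compiled_somaxconn() and report_somaxconn() are made up for illustration. It shows the compile-time constant only, not the queue depth the operating system really honors.

```c
#include <stdio.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: return the SOMAXCONN value this translation
 * unit was compiled with.
 */
int
compiled_somaxconn(void)
{
    return SOMAXCONN;
}

/*
 * Hypothetical helper: write the compiled-in value to the given
 * stream, in decimal and hex, so values like 5 and 0x7fffffff are
 * easy to tell apart.
 */
void
report_somaxconn(FILE *out)
{
    fprintf(out, "SOMAXCONN = %d (0x%x)\n",
            compiled_somaxconn(), (unsigned int) compiled_somaxconn());
}
```

Calling report_somaxconn(stdout) from a small test program built with the same compiler flags as the backend would show the backlog value that pqcomm.c passes to listen().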
> However Hiroshi says later that he already tried this.
Even if it worked, this would have just pushed the problem further out
instead of really fixing it.
> I'm inclined to think
> that Cygwin simply has a problem with servicing concurrent connection
> requests, perhaps even before the alleged SOMAXCONN value is reached.
You meant Windows. Right? :,)
In summary, I feel that the fe-connect.c change should be backed out so
that Cygwin will be consistent with other UNIXes. I also hope that the
non-blocking connection failure message can be made more readable and
that make check will not generate spurious failure messages under Cygwin
on slow machines.
Thanks,
Jason
--
Jason Tishler
Director, Software Engineering Phone: +1 (732) 264-8770 x235
Dot Hill Systems Corp. Fax: +1 (732) 264-8798
82 Bethany Road, Suite 7 Email: Jason(dot)Tishler(at)dothill(dot)com
Hazlet, NJ 07730 USA WWW: http://www.dothill.com