Quick Links

Re: Re: [doc fix] PG10: wroing description on connect_timeout when multiple hosts are specified

From:	Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Re: [doc fix] PG10: wroing description on connect_timeout when multiple hosts are specified
Date:	2017-05-16 01:59:52
Message-ID:	CAB7nPqRqgs+2hW_qK5NfGyU7LU_CPN6R6EZo24TZ+aFyb07b8w@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, May 16, 2017 at 3:13 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> FWIW, I think the position most of us were taking is that this feature
> is meant to retry transport-level connection failures, not cases where
> we successfully make a connection to a server and then it rejects our
> login attempt. I would classify a timeout as a transport-level failure
> as long as it occurs before we got any server response --- if it happens
> during the authentication protocol, that's less clear. But it might not
> be very practical to distinguish those two cases.
>
> In short, +1 for retrying on timeout during connection, and I'm okay with
> retrying a timeout during authentication if it's not practical to treat
> that differently.

Sensible argument here. It could happen that a server is unresponsive,
for example in split-brains (?).

I have been playing a bit with the patch.

+ *
+ * Returns -1 on failure, 0 if the socket is readable/writable, 1 if
it timed out.
*/
pqWait is internal to libpq, so we are free to set up what we want
here. Still I think that we should be consistent with what
pqSocketCheck returns:
* >0 means that the socket is readable/writable, counting things.
* 0 is for timeout.
* -1 on failure.

+ int ret = 0;
+ int timeout = 0;
The declaration of "ret" should be internal in the for(;;) loop.

+ /* Attempt connection to the next host, starting the
connect_timeout timer */
+ pqDropConnection(conn, true);
+ conn->addr_cur = conn->connhost[conn->whichhost].addrlist;
+ conn->status = CONNECTION_NEEDED;
+ finish_time = time(NULL) + timeout;
+ }
I think that it would be safer to not set finish_time if
conn->connect_timeout is NULL. I agree that your code works because
pqWaitTimed() will never complain on timeout reached if finish_time is
-1. That's for robustness sake.

The docs say that for connect_timeout:
<para>
Maximum wait for connection, in seconds (write as a decimal integer
string). Zero or not specified means wait indefinitely. It is not
recommended to use a timeout of less than 2 seconds.
This timeout applies separately to each connection attempt.
For example, if you specify two hosts and both of them are unreachable,
and <literal>connect_timeout</> is 5, the total time spent waiting for a
connection might be up to 10 seconds.
</para>

It seems to me that this implies that if a timeout occurs on the first
connection, the counter is reset, which is what this patch is doing.
So we are all set.
--
Michael

In response to

Re: Re: [doc fix] PG10: wroing description on connect_timeout when multiple hosts are specified at 2017-05-15 18:13:56 from Tom Lane

Responses

Re: Re: [doc fix] PG10: wroing description on connect_timeout when multiple hosts are specified at 2017-05-16 08:24:01 from Tsunakawa, Takayuki
Re: Re: [doc fix] PG10: wroing description on connect_timeout when multiple hosts are specified at 2017-05-16 15:47:42 from Robert Haas
Re: Re: [doc fix] PG10: wroing description on connect_timeout when multiple hosts are specified at 2017-05-17 02:49:07 from Tsunakawa, Takayuki

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Tom Lane	2017-05-16 02:06:57	Re: Obsolete sentence in CREATE SUBSCRIPTION docs
Previous Message	Masahiko Sawada	2017-05-16 01:23:00	Obsolete sentence in CREATE SUBSCRIPTION docs