Re: [Pgbuildfarm-members] Submission failures: 500 read timeout

From: Marti Raudsepp <marti(at)juffo(dot)org>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PGBuildFarm <pgbuildfarm-members(at)pgfoundry(dot)org>
Subject: Re: [Pgbuildfarm-members] Submission failures: 500 read timeout
Date: 2014-09-22 09:15:16
Message-ID: CABRT9RD4Wst9XDiWNZUzB-2_h5EVtxSWM_0hSiLRktSEcga9Qg@mail.gmail.com
Lists: buildfarm-members

On Mon, Sep 15, 2014 at 7:15 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> I have turned on request timing in the web logs. It looks like these status
> uploads are typically taking 1 to 2 seconds to process. So I suspect it's
> client-related.

Well, I managed to capture only one packet dump of this happening, on
2014-09-18 11:51:36 EEST. The problem seems to have disappeared; did
the configuration change on the server side? Or maybe it's just that
fewer commits have been pushed recently. If anyone is interested, I can
send the dump privately.

I'm no expert on TCP, but it's conceivably a bug in the TCP stack. I'd
like to collect a few more samples before bothering any networking
people with it. Here's my understanding of what happened:

11:51:36.433 First packet of HTTP POST request is sent
(data being sent)
11:51:39.254 Last packet of POST body
11:51:39.494 buildfarm responds with a SACK which, I believe,
indicates a dropped packet
(3 minutes pass silently)
11:54:38.010 My end sends a FIN, probably a timeout on client side,
closing the socket
11:54:38.212 buildfarm responds with another SACK, repeating the missing packet
(3 seconds, some retransmits occur for the missing data)
11:54:41.215 My end sends a RST (probably timeout because remote
didn't have time to acknowledge the FIN yet)
11:54:41.236 Remote responds with "HTTP 200 OK", before it could have
received my RST, but my local end no longer sees it because the
connection is already reset.
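
To make the gaps in the timeline above easier to see, here is a minimal Python sketch that computes the delay between consecutive events. The timestamps are copied from the timeline; the short labels and the `gaps` helper are only illustrative names, not anything taken from the packet dump itself.

```python
from datetime import datetime

# Timestamps copied from the timeline above (local time, EEST).
# The labels are shortened paraphrases, used only for display.
EVENTS = [
    ("11:51:36.433", "first packet of POST sent"),
    ("11:51:39.254", "last packet of POST body"),
    ("11:51:39.494", "buildfarm SACK"),
    ("11:54:38.010", "client FIN"),
    ("11:54:38.212", "buildfarm SACK again"),
    ("11:54:41.215", "client RST"),
    ("11:54:41.236", "buildfarm HTTP 200 OK"),
]


def gaps(events):
    """Return (seconds since previous event, label) for each event after the first."""
    times = [datetime.strptime(ts, "%H:%M:%S.%f") for ts, _ in events]
    return [
        (round((cur - prev).total_seconds(), 3), label)
        for prev, cur, (_, label) in zip(times, times[1:], events[1:])
    ]


for delta, label in gaps(EVENTS):
    print(f"+{delta:8.3f}s  {label}")
```

The long silent stretch is the gap between the first SACK and the client FIN, just under three minutes.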

If my reading of RFC 2018 (SACK) is right, the sender (my end) must
retransmit data after receiving a SACK packet if the missing data isn't
acknowledged within the retransmit timeout. But no retransmit happened
for 3 minutes. I don't know whether the receiver (buildfarm) should
retransmit its SACK or not, but that only happened after it had
received the FIN packet.
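
As a rough illustration of what a SACK tells the sender, the toy model below computes which byte ranges are still unacknowledged given a cumulative ACK and a list of SACK blocks. This is only a sketch: the function name, its parameters, and the byte numbers in the example are hypothetical, sequence-number wraparound is ignored, and it is not how any real TCP stack is implemented.

```python
def unacked_holes(cum_ack, sack_blocks, snd_nxt):
    """Return byte ranges [start, end) within [cum_ack, snd_nxt) that no SACK block covers.

    cum_ack     -- highest cumulatively acknowledged byte (hypothetical value)
    sack_blocks -- list of (left_edge, right_edge) pairs reported by the receiver
    snd_nxt     -- one past the last byte the sender has transmitted
    """
    holes = []
    pos = cum_ack
    for left, right in sorted(sack_blocks):
        if pos >= snd_nxt:
            break
        if left > pos:
            # Bytes between the current position and this block were never reported.
            holes.append((pos, min(left, snd_nxt)))
        pos = max(pos, right)
    if pos < snd_nxt:
        # Tail of the send window not covered by any block.
        holes.append((pos, snd_nxt))
    return holes


# Made-up numbers: one segment lost near the start of a 5000-byte upload.
print(unacked_holes(1000, [(2448, 5000)], 5000))
```

In this model, the ranges returned are the data a sender would be expected to retransmit once its retransmit timer expires.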

Regards,
Marti
