Re: Premature timeout on git.postgresql.org?

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, pgsql-www(at)lists(dot)postgresql(dot)org
Subject: Re: Premature timeout on git.postgresql.org?
Date: 2022-02-21 11:01:33
Message-ID: CABUevEwP4GqCOOf2bKgx5xguGj2wMO+0QN1bazz8Es9HN==Stw@mail.gmail.com
Lists: pgsql-www

On Sun, Feb 20, 2022 at 7:35 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > On Sun, Feb 20, 2022 at 08:31:23AM -0800, Adrian Klaver wrote:
> >> For a point of reference I just cloned on a Linode instance in Fremont CA.
> >> For the Postgres repo I got 2.25 MiB/s. Trying some other project repos I
> >> got 25-27 MiB/s
>
> > My transfer number fluctuated from 800kB to 2.2MB.
>
> Yeah, I'd supposed that the el-cheapo ethernet dongle I'm using on
> florican's host was the source of the crummy performance, but it is not,
> or at least not all of it: there is something rotten between here and
> Amsterdam. Trying it now on my primary workstation, I can clone from
> the github mirror at a more or less respectable speed:

GitHub does have a global CDN distributing things, though, and it's
very likely the postgres entries were cached locally :)

<snip>

> The 1.4MB/s average seen here hides a very variable speed: like
> Bruce, I saw a speed around 700/800 kB/s to start, and then it
> gradually ramped up to something over 2MB/s.
>
> Now in fairness, the ping time to github.com from here is circa 10ms
> while the ping time to gothos.postgresql.org is circa 100ms, so
> I'd not really expect equivalent performance ... but this seems
> seriously bad.

It is definitely worse than I would've thought -- maybe my view is
clouded by it being fast enough from Europe.

<snip>

> The second try got the same results as yesterday:

Actually, they don't seem to be the same results as yesterday AFAICT?

Yesterday it failed under "Compressing objects", which is before it
even tries to send anything across the network, which would indicate a
host issue on the git server (which is... weird). This time it's
failing in "Receiving objects", which is when it's actually, well,
receiving objects. It probably failed just as it started, so it is
likely to be the same underlying problem.

> $ time git clone https://git.postgresql.org/git/postgresql.git pgsql-pg
> Cloning into 'pgsql-pg'...
> remote: Enumerating objects: 14782, done.
> remote: Counting objects: 100% (14782/14782), done.
> remote: Compressing objects: 100% (9581/9581), done.
> [ I hit return a few extra times to capture state, last printout was this: ]
> Receiving objects: 99% (876395/879421), 285.62 MiB | 1.19 MiB/s
> error: RPC failed; curl 92 HTTP/2 stream 3 was not closed cleanly before end of the underlying stream
> fetch-pack: unexpected disconnect while reading sideband packet
> fatal: early EOF
> fatal: fetch-pack: invalid index-pack output
> 323.67 real 0.00 user 0.11 sys
>
> (The instantaneous transfer speed varied from circa 1.3MB/s to
> as low as a couple hundred kB/s.)
>
> I doubt there's much we can do at the project level about the
> poor transfer speed of the transatlantic link. However, the

We'll check in with our hosting provider on this topic and see if they
have something to say.

I did double-check with a run from a US server myself just now, and it
completes in about 2 minutes, so it's much faster than yours -- but
much slower than from Europe. I also collected some metrics during this
time, and there is close to zero load on the server itself, so the
slowdown must be somewhere in transit.

> fact that it's failing outright looks to me to be due to a
> timeout of 5 or possibly 6 minutes that's breaking the "sideband"
> connection. That probably is within our control.

Yeah, that should be, if it is on our server.

However, I'm not really sure that's where the problem is. I can get
through an artificially slow clone from the US west coast:

$ time trickle -d 250 git clone https://git.postgresql.org/git/postgresql.git
trickle: Could not reach trickled, working independently: No such file or directory
Cloning into 'postgresql'...
remote: Enumerating objects: 14820, done.
remote: Counting objects: 100% (14820/14820), done.
remote: Compressing objects: 100% (9619/9619), done.
remote: Total 879459 (delta 9993), reused 6929 (delta 5125), pack-reused 864639
Receiving objects: 100% (879459/879459), 289.21 MiB | 258.00 KiB/s, done.
Resolving deltas: 100% (754341/754341), done.

real 20m17.918s
user 2m22.406s
sys 0m4.184s

So it seems that taking 20 minutes to transfer it is... OK. (Per the
server logs, about 18 of those 20 minutes were in the single https call
that timed out for you.)
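
As a rough sanity check on those numbers, the sketch below works out how
long the pack transfer alone should take at the rates git printed. It
assumes the printed rate holds steady for the whole transfer, which it
does not (git shows a recent rate, not a true average), so treat the
results as ballpark figures only. The function name is made up for
illustration.

```python
def transfer_seconds(size_mib: float, rate_kib_per_s: float) -> float:
    """Seconds needed to move size_mib at a steady rate_kib_per_s."""
    return size_mib * 1024 / rate_kib_per_s


# Throttled clone in this message: 289.21 MiB at 258.00 KiB/s.
throttled = transfer_seconds(289.21, 258.00)

# Failed clone quoted above: 285.62 MiB received, last shown rate 1.19 MiB/s.
failed = transfer_seconds(285.62, 1.19 * 1024)

print(f"throttled clone: {throttled / 60:.1f} min of pure transfer")
print(f"failed clone:    {failed / 60:.1f} min of pure transfer")
```

The throttled figure lands close to the roughly 18 minutes the server
logs attribute to the single https call. The failed clone's figure is
only an estimate from the last rate git printed, to be compared with its
323.67 s of wall-clock time.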

Any chance there's a network device on your local network that might
have a 5 minute timeout?

And -- can you send me (offlist) exactly which IP these requests would
be coming in from, and run one more test today? I'll see if I can
find more details in the more verbose logging that's turned on now.
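
For that extra test, client-side tracing may help line things up with
the server logs. The sketch below is only a suggestion and not something
already used in this thread: it builds a `git clone` command with the
standard `GIT_TRACE` and `GIT_CURL_VERBOSE` environment variables set,
and can optionally force HTTP/1.1 through git's `http.version` setting,
since the reported error mentions an HTTP/2 stream. The destination
directory, log file name, and function names are placeholders.

```python
import os
import subprocess

REPO_URL = "https://git.postgresql.org/git/postgresql.git"


def build_traced_clone(url=REPO_URL, dest="pgsql-trace", force_http1=False):
    """Return (argv, env) for a git clone with client-side tracing enabled."""
    argv = ["git"]
    if force_http1:
        # Assumption: comparing against HTTP/1.1 helps tell whether the
        # failure is specific to HTTP/2.
        argv += ["-c", "http.version=HTTP/1.1"]
    # --progress keeps the progress meter even when stderr is a file.
    argv += ["clone", "--progress", url, dest]
    env = dict(os.environ, GIT_TRACE="1", GIT_CURL_VERBOSE="1")
    return argv, env


def run_traced_clone(log_path="clone-trace.log", **kwargs):
    """Run the clone, writing git's trace output (stderr) to log_path."""
    argv, env = build_traced_clone(**kwargs)
    with open(log_path, "wb") as log:
        return subprocess.run(argv, env=env, stderr=log).returncode


# Show the command only; call run_traced_clone() to actually clone.
argv, _ = build_traced_clone(force_http1=True)
print(" ".join(argv))
```

Running it once with `force_http1=False` and once with `True` would show
whether the disconnect happens at the same point in both cases.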

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
