Re: Issue with pg_basebackup v.11

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Ninad Shah <nshah(dot)postgres(at)gmail(dot)com>
Cc: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Issue with pg_basebackup v.11
Date: 2021-10-22 16:09:55
Message-ID: 2715482.1634918995@sss.pgh.pa.us
Lists: pgsql-general

Ninad Shah <nshah(dot)postgres(at)gmail(dot)com> writes:
> What I observed is that it takes a couple of hours between below 2 lines.

> 115454656/1304172127 kB (8%), 0/1 tablespace
> (...atastaging/base/115868/154220.2)
> pgbasebackup: could not read COPY data: could not receive data from server:
> Connection timed out

We have heard reports of network connections dropping while pg_basebackup
is busy doing something disk-intensive such as fsync'ing. The apparent
2-hour delay here does not mean that pg_basebackup was out to lunch for
2 hours; more likely that reflects the TCP timeout delay before the kernel
realizes that the connection is lost. The actual blame probably resides
with some firewall or router that has a short timeout for idle
connections.

I'd try turning on fairly aggressive TCP keepalive settings for the
connection, say keepalives_idle=30 or so.
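
As a rough illustration of the suggestion above, the keepalive settings can be passed to pg_basebackup as libpq connection parameters through its -d option. The sketch below only builds and prints the command; it does not run it. Only keepalives_idle=30 comes from this message. The keepalives_interval and keepalives_count values, the host name, the user name, and the target directory are assumptions chosen for the example and should be adjusted to the environment.

```python
import shlex


def keepalive_conninfo(host, user, idle=30, interval=10, count=3):
    """Build a libpq conninfo string with client-side TCP keepalives enabled.

    idle=30 follows the suggestion in the message; interval and count are
    illustrative assumptions, not recommended values.
    Values containing spaces would need libpq quoting, which this sketch omits.
    """
    params = {
        "host": host,
        "user": user,
        "keepalives": 1,                  # enable TCP keepalives (libpq default)
        "keepalives_idle": idle,          # seconds of inactivity before the first probe
        "keepalives_interval": interval,  # seconds between unanswered probes
        "keepalives_count": count,        # unanswered probes before the connection is dropped
    }
    return " ".join(f"{key}={value}" for key, value in params.items())


def basebackup_command(conninfo, target_dir):
    """Return the pg_basebackup argument list; -P keeps the progress report on."""
    return ["pg_basebackup", "-d", conninfo, "-D", target_dir, "-P"]


# Hypothetical host, user, and path, for illustration only.
conninfo = keepalive_conninfo("primary.example.com", "replicator")
print(shlex.join(basebackup_command(conninfo, "/var/lib/postgresql/backup")))
```

With probes going out every 30 seconds of idle time, a firewall or router with a short idle timeout should keep seeing traffic on the connection, and a connection that really is dead should be noticed within roughly idle + interval * count seconds instead of after a couple of hours.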

regards, tom lane
