Re: Dropped connections with pg_basebackup

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Francisco Reyes <lists(at)natserv(dot)net>, pgsql-general(at)postgresql(dot)org
Subject: Re: Dropped connections with pg_basebackup
Date: 2015-09-24 20:32:07
Message-ID: 56045DC7.7060002@aklaver.com
Lists: pgsql-general

On 09/24/2015 12:57 PM, Francisco Reyes wrote:
> Have an existing setup of 9.3 servers. Replication has been rock solid,
> but recently the circuits between data centers were upgraded and
> pg_basebackup now seems to fail often when setting up streaming
> replication. What used to take 10+ hours now only took 68 minutes, but
> had to do many retries. Many attempts fail within minutes while others
> go to 90% or higher and then drop. The reason we are doing a sync is
> because we have to swap data centers every so often for compliance. So I
> had to swap master and slave.
>
> Calling pg_basebackup like this:
> pg_basebackup -P -R -X s -h <HostName> -D <Folder> -U replicator
>
> The error we keep having is:
> Sep 23 13:36:32 <HostName> postgres[16804]: [11-1] 2015-09-23 13:36:32
> EDT <IP> [unknown] replicator LOG: SSL error: bad write retry
> Sep 23 13:36:32 <HostName> postgres[16804]: [12-1] 2015-09-23 13:36:32
> EDT <IP> [unknown] replicator LOG: SSL error: bad write retry

Seems to be an SSL problem, so how is your SSL set up on the servers?
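
For reference, a minimal sketch of one way to collect the server-side SSL settings with psql so they can be posted to the list. The function name show_ssl_settings and the PSQL override variable are hypothetical, made up for this illustration, and connecting to the "postgres" database is an assumption. The parameter names are the ones that exist in 9.3; ssl_renegotiation_limit is not present in newer releases.

```shell
# Hypothetical helper (not part of PostgreSQL): print SSL-related settings.
# PSQL may be overridden for testing; it defaults to plain psql.
show_ssl_settings() {
    host="$1"
    for setting in ssl ssl_ciphers ssl_renegotiation_limit; do
        printf '%s = ' "$setting"
        # -At: unaligned, tuples-only output, so only the value is printed
        ${PSQL:-psql} -h "$host" -d postgres -At -c "SHOW $setting;"
    done
}
```

It would be run once against each server, for example "show_ssl_settings <HostName>", and the two outputs compared.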

> Sep 23 13:36:32 <HostName> postgres[16804]: [13-1] 2015-09-23 13:36:32
> EDT <IP> [unknown] replicator FATAL: connection to client lost
> Sep 23 13:36:32 <HostName> postgres[16972]: [9-1] 2015-09-23 13:36:32
> EDT <IP> [unknown] replicator LOG: could not receive data from client:
> Connection reset by peer
>
> I have been working with the network team and we have even been actively
> monitoring the line, and running ping, as the replication is set up. At
> the point the connection reset by peer error happens, we don't see any
> issue with the network and ping doesn't show an issue at that point in
> time.
>
> The issue also happened on another set of machines and likewise, had to
> retry many times before pg_basebackup would do the initial sync. Once
> the initial sync is set, replication is fine.
>
> I tried both "-X s" (stream) and "-X f" (fetch) and both fail often.
>
> Any ideas what may be going on?
>
>

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com
