From: | Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> |
---|---|
To: | Francisco Reyes <lists(at)natserv(dot)net>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: Dropped connections with pg_basebackup |
Date: | 2015-09-24 20:32:07 |
Message-ID: | 56045DC7.7060002@aklaver.com |
Lists: | pgsql-general |
On 09/24/2015 12:57 PM, Francisco Reyes wrote:
> Have an existing setup of 9.3 servers. Replication has been rock solid,
> but recently the circuits between data centers were upgraded and
> pg_basebackup now seems to fail often when setting up streaming
> replication. What used to take 10+ hours now only took 68 minutes, but
> had to do many retries. Many attempts fail within minutes while others
> go to 90% or higher and then drop. The reason we are doing a sync is
> because we have to swap data centers every so often for compliance. So I
> had to swap master and slave.
>
> Calling pg_basebackup like this:
> pg_basebackup -P -R -X s -h <HostName> -D <Folder> -U replicator
>
> The error we keep having is:
> Sep 23 13:36:32 <HostName> postgres[16804]: [11-1] 2015-09-23 13:36:32
> EDT <IP> [unknown] replicator LOG: SSL error: bad write retry
> Sep 23 13:36:32 <HostName> postgres[16804]: [12-1] 2015-09-23 13:36:32
> EDT <IP> [unknown] replicator LOG: SSL error: bad write retry
Seems to be an SSL problem, so how is your SSL set up on the servers?
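As a starting point for answering that, here is a minimal sketch for dumping the SSL-related settings from each server so they can be compared. The function name `show_ssl_settings`, the `postgres` role and the `postgres` database are assumptions for illustration, not something from the original report. `ssl`, `ssl_ciphers` and `ssl_renegotiation_limit` are settings that exist in 9.3; whether any of them is involved here is not established.

```shell
# Hypothetical helper: print SSL-related settings for one server.
# Assumes psql is on PATH and that the role used may connect and run SHOW.
show_ssl_settings() {
    host="$1"
    for setting in ssl ssl_ciphers ssl_renegotiation_limit; do
        # -A (unaligned) and -t (tuples only) print just the value
        psql -h "$host" -U postgres -d postgres -At -c "SHOW $setting;"
    done
}
```

Run it as `show_ssl_settings <HostName>` against both the master and the standby and compare the output.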
> Sep 23 13:36:32 <HostName> postgres[16804]: [13-1] 2015-09-23 13:36:32
> EDT <IP> [unknown] replicator FATAL: connection to client lost
> Sep 23 13:36:32 <HostName> postgres[16972]: [9-1] 2015-09-23 13:36:32
> EDT <IP> [unknown] replicator LOG: could not receive data from client:
> Connection reset by peer
>
> I have been working with the network team and we have even been actively
> monitoring the line, and running ping, as the replication is set up. At
> the point the connection reset by peer error happens, we don't see any
> issue with the network and ping doesn't show an issue at that point in
> time.
>
> The issue also happened on another set of machines and likewise, had to
> retry many times before pg_basebackup would do the initial sync. Once
> the initial sync is set, replication is fine.
>
> I tried both "-X s" (stream) and "-X f" (fetch) and both fail often.
>
> Any ideas what may be going on?
>
>
--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com