From: | Sherrylyn Branchaw <sbranchaw(at)gmail(dot)com> |
---|---|
To: | Francisco Reyes <lists(at)natserv(dot)net> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Dropped connections with pg_basebackup |
Date: | 2015-09-24 20:29:56 |
Message-ID: | CAB_myF5CSiJftwyGT+kNQSuTNfU7Qt7cL6p-jrjfV09mrsjreg@mail.gmail.com |
Lists: | pgsql-general |
I'm assuming based on the "SSL error" that you have ssl set to 'on'. What's
your ssl_renegotiation_limit? The default is 512MB, but setting it to 0 has
solved problems for a number of people on this list, including myself.
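For reference, a minimal sketch of that change in postgresql.conf on the server that pg_basebackup connects to. It assumes a 9.3 server with ssl = on, as in the original report; the comments are illustrative and not taken from the reporter's configuration.

```
# postgresql.conf (on the server being backed up)
ssl = on                        # assumed already enabled, given the "SSL error" log lines
ssl_renegotiation_limit = 0     # default is 512MB; 0 disables SSL renegotiation
```

You can check the current value with "SHOW ssl_renegotiation_limit;" in psql. As far as I know, the new value applies to new connections after a configuration reload (for example "SELECT pg_reload_conf();"), so a restart should not be needed, but verify that against the documentation for your version.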
Sherrylyn
On Thu, Sep 24, 2015 at 3:57 PM, Francisco Reyes <lists(at)natserv(dot)net> wrote:
> We have an existing setup of 9.3 servers. Replication has been rock solid,
> but recently the circuits between data centers were upgraded and
> pg_basebackup now seems to fail often when setting up streaming
> replication. What used to take 10+ hours took only 68 minutes this time,
> but it needed many retries. Many attempts fail within minutes, while others
> get to 90% or higher and then drop. We are doing a sync because we have to
> swap data centers every so often for compliance, so I had to swap master
> and slave.
>
> Calling pg_basebackup like this:
> pg_basebackup -P -R -X s -h <HostName> -D <Folder> -U replicator
>
> The error we keep having is:
> Sep 23 13:36:32 <HostName> postgres[16804]: [11-1] 2015-09-23 13:36:32 EDT
> <IP> [unknown] replicator LOG: SSL error: bad write retry
> Sep 23 13:36:32 <HostName> postgres[16804]: [12-1] 2015-09-23 13:36:32 EDT
> <IP> [unknown] replicator LOG: SSL error: bad write retry
> Sep 23 13:36:32 <HostName> postgres[16804]: [13-1] 2015-09-23 13:36:32 EDT
> <IP> [unknown] replicator FATAL: connection to client lost
> Sep 23 13:36:32 <HostName> postgres[16972]: [9-1] 2015-09-23 13:36:32 EDT
> <IP> [unknown] replicator LOG: could not receive data from client:
> Connection reset by peer
>
> I have been working with the network team, and we have even been actively
> monitoring the line and running ping while the replication is being set up.
> At the point where the "Connection reset by peer" error happens, we don't
> see any issue with the network, and ping doesn't show a problem either.
>
> The issue also happened on another set of machines, where we likewise had
> to retry many times before pg_basebackup completed the initial sync. Once
> the initial sync is done, replication is fine.
>
> I tried both "-X s" (stream) and "-X f" (fetch) and both fail often.
>
> Any ideas what may be going on?