Streaming Replication Networking Best Practices?

From: Don Seiler <don(at)seiler(dot)us>
To: pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Streaming Replication Networking Best Practices?
Date: 2018-05-14 16:11:40
Message-ID: CAHJZqBAFCNnyaZGmyv8gR280=gXnh=ajnrom9SSrCSsFMzXv=w@mail.gmail.com
Lists: pgsql-admin

Postgres 9.6.6. Primary has a local (HA) replica and a remote (DR) replica.

I've done a couple of big data purges the past few weeks. This past weekend
I ran a DELETE & VACUUM that took all of 8 minutes. The local replica kept
up just fine, but the remote replica lagged and broke streaming replication
after just a few minutes. We have WAL archives sent via NetApp mirroring to
back that up.

However, I'd like to know if there are any optimal networking settings on
the host or network that we may be missing. My manager says that the circuit
between data centers was only 60% utilized at its peak.

In the past I've tried increasing wal_keep_segments, which keeps the WAL
files available for streaming, but the fact remains that they stream very
slowly, so the lag ends up worse than if we had just fallen back to the
archives every 30 minutes or so.
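
To put rough numbers on that trade-off, here is a back-of-the-envelope
sketch in Python. It assumes the default 16 MiB WAL segment size and treats
the retention setting (wal_keep_segments in 9.6) as the only thing keeping
WAL around on the primary, which is a simplification. The function names
and every rate in it are hypothetical, not measurements from our
environment.

```python
# Rough estimate of how long a standby can lag before it falls off the WAL
# retained on the primary. All inputs are hypothetical illustrations.

MIB = 1024 * 1024
SEGMENT_BYTES = 16 * MIB  # default WAL segment size (assumed unchanged)


def retained_wal_bytes(wal_keep_segments):
    """Minimum WAL volume kept on the primary for streaming standbys."""
    return wal_keep_segments * SEGMENT_BYTES


def seconds_until_streaming_breaks(wal_keep_segments, generate_bps, stream_bps):
    """Seconds of sustained load before the lag exceeds the retained WAL.

    generate_bps: WAL bytes per second produced on the primary.
    stream_bps:   WAL bytes per second actually reaching the standby.
    Returns None when the standby keeps up (lag does not grow).
    """
    growth_bps = generate_bps - stream_bps
    if growth_bps <= 0:
        return None
    return retained_wal_bytes(wal_keep_segments) / growth_bps


if __name__ == "__main__":
    # Hypothetical: 256 segments retained, 40 MiB/s generated during the
    # purge, 8 MiB/s making it across the inter-datacenter link.
    print(seconds_until_streaming_breaks(256, 40 * MIB, 8 * MIB))
```

With those made-up figures the standby would fall off the retained WAL
after roughly two minutes. Raising the retention only delays that point; it
does nothing for the throughput of the stream itself.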

I have no basis for this other than my previous experience with Oracle
physical standbys, but I would think that streaming replication should be
able to push more than it seems to be doing in my prod environment. The
fact that the local replica keeps up just fine without breaking streaming
replication tells me that the problem is in the cross-datacenter circuit,
not in postgres recovery performance.

If anyone has any advice on host networking setup, tuning or testing, I'd
love to hear it.

Thanks,
Don.

--
Don Seiler
www.seiler.us
