From: | Flavio Henrique Araque Gurgel <fhagur(at)gmail(dot)com> |
---|---|
To: | Don Seiler <don(at)seiler(dot)us> |
Cc: | pgsql-admin <pgsql-admin(at)postgresql(dot)org> |
Subject: | Re: Streaming Replication Networking Best Practices? |
Date: | 2018-05-14 16:17:39 |
Message-ID: | CAGHTAeN+KU8vAS5f8M6HiHE8xb-3FFe0pG4rNvuxpJ5G3hw77Q@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-admin |
Em seg, 14 de mai de 2018 às 18:12, Don Seiler <don(at)seiler(dot)us> escreveu:
> Postgres 9.6.6. Primary has a local (HA) replica and a remote (DR) replica.
>
> I've done a couple of big data purges the past few weeks. This past
> weekend I ran a DELETE & VACUUM that took all of 8 minutes. The local
> replica kept up just fine, but the remote replica lagged and broke
> streaming replication after just a few minutes. We have WAL archives sent
> via NetApp mirroring to back that up.
>
> However I'd like to know if there are any optimal networking settings on
> the host or network that we maybe missing. My manager says that the circuit
> between data centers was only 60% utilized at its peak.
>
> In the past I've tried increasing wal_keep_files, which keeps the WAL
> files available for streaming but the fact remains that they stream very
> slowly so the lag just gets worse than if we fell back to archives every 30
> minutes or so.
>
> I have no basis for this other than my previous experience with Oracle
> physical standbys, but I would think that streaming replication should be
> able to push more than it seems to be doing in my prod environment. The
> fact that the local replica keeps up just fine without breaking streaming
> replication tells me that the problem is in the cross-datacenter circuit,
> not in postgres recovery performance.
>
> If anyone has any advice on host networking setup, tuning or testing, I'd
> love to hear it.
>
If you're running 9.6, you can use replication slots to avoid to mess with
wal_keep_segments [1]
Be aware that not only network bandwidth and latency are responsable for
that behaviour, wal_receiver, disk write capability on your standby can be
bottlenecks too.
It happens to me in local networks with 10Gbps capable hardware.
[1]
https://www.postgresql.org/docs/9.6/static/warm-standby.html#STREAMING-REPLICATION-SLOTS
Flavio Gurgel
From | Date | Subject | |
---|---|---|---|
Next Message | Don Seiler | 2018-05-14 16:22:23 | Re: Streaming Replication Networking Best Practices? |
Previous Message | Don Seiler | 2018-05-14 16:11:40 | Streaming Replication Networking Best Practices? |