Re: Streaming Replication Networking Best Practices?

From: Flavio Henrique Araque Gurgel <fhagur(at)gmail(dot)com>
To: Don Seiler <don(at)seiler(dot)us>
Cc: pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Streaming Replication Networking Best Practices?
Date: 2018-05-14 16:17:39
Message-ID: CAGHTAeN+KU8vAS5f8M6HiHE8xb-3FFe0pG4rNvuxpJ5G3hw77Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Em seg, 14 de mai de 2018 às 18:12, Don Seiler <don(at)seiler(dot)us> escreveu:

> Postgres 9.6.6. Primary has a local (HA) replica and a remote (DR) replica.
>
> I've done a couple of big data purges the past few weeks. This past
> weekend I ran a DELETE & VACUUM that took all of 8 minutes. The local
> replica kept up just fine, but the remote replica lagged and broke
> streaming replication after just a few minutes. We have WAL archives sent
> via NetApp mirroring to back that up.
>
> However I'd like to know if there are any optimal networking settings on
> the host or network that we maybe missing. My manager says that the circuit
> between data centers was only 60% utilized at its peak.
>
> In the past I've tried increasing wal_keep_files, which keeps the WAL
> files available for streaming but the fact remains that they stream very
> slowly so the lag just gets worse than if we fell back to archives every 30
> minutes or so.
>
> I have no basis for this other than my previous experience with Oracle
> physical standbys, but I would think that streaming replication should be
> able to push more than it seems to be doing in my prod environment. The
> fact that the local replica keeps up just fine without breaking streaming
> replication tells me that the problem is in the cross-datacenter circuit,
> not in postgres recovery performance.
>
> If anyone has any advice on host networking setup, tuning or testing, I'd
> love to hear it.
>

If you're running 9.6, you can use replication slots to avoid to mess with
wal_keep_segments [1]
Be aware that not only network bandwidth and latency are responsable for
that behaviour, wal_receiver, disk write capability on your standby can be
bottlenecks too.
It happens to me in local networks with 10Gbps capable hardware.

[1]
https://www.postgresql.org/docs/9.6/static/warm-standby.html#STREAMING-REPLICATION-SLOTS

Flavio Gurgel

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Don Seiler 2018-05-14 16:22:23 Re: Streaming Replication Networking Best Practices?
Previous Message Don Seiler 2018-05-14 16:11:40 Streaming Replication Networking Best Practices?