Re: getting pg_basebackup to use remote destination

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Chuck Martin <clmartin(at)theombudsman(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: getting pg_basebackup to use remote destination
Date: 2019-01-03 20:46:37
Message-ID: 20190103204636.GV2528@tamriel.snowman.net
Lists: pgsql-general

Greetings Chuck,

* Chuck Martin (clmartin(at)theombudsman(dot)com) wrote:
> Using iperf, the transfer speed between the two servers (from the main to
> the standby) was 938 Mbits/sec. If I understand the units correctly, it is
> close to what it can be.

That does look like the rate it should be going at, but it should only
take about 2 hours to copy 750GB at that rate.
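
As a rough check on that estimate, here is a minimal sketch of the
arithmetic. It assumes the 750GB figure means decimal gigabytes and that
the iperf rate is sustained for the whole copy; the function name is
only illustrative.

```python
def transfer_hours(size_gb: float, rate_mbit_s: float) -> float:
    """Hours needed to move size_gb (decimal gigabytes) at rate_mbit_s (megabits/s)."""
    size_bits = size_gb * 1e9 * 8            # gigabytes -> bits
    seconds = size_bits / (rate_mbit_s * 1e6)  # bits / (bits per second)
    return seconds / 3600


hours = transfer_hours(750, 938)
print(f"{hours:.2f} hours")  # prints "1.78 hours"
```

Protocol overhead and disk speed are ignored here, so the real figure
would be somewhat higher, but still nowhere near days.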

How much WAL does this system generate though...? If you're generating
a very large amount then it's possible the WAL streaming is actually
clogging up the network and causing the rate of copy on the data files
to be quite slow. You'd have to be generating quite a bit of WAL
though.
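
One way to put a number on the WAL rate is to sample
pg_current_wal_lsn() twice (PostgreSQL 10 and later; older releases call
it pg_current_xlog_location()) and turn the difference into a rate. The
sketch below only does the conversion. The two LSN values and the 60
second interval are made-up samples, not figures from this system, and
the helper names are hypothetical.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert an LSN such as '2A/10000000' into an absolute byte position."""
    high, low = lsn.split("/")
    # An LSN is a 64-bit position printed as two 32-bit hex halves.
    return (int(high, 16) << 32) + int(low, 16)


def wal_rate_mbit_s(lsn_start: str, lsn_end: str, elapsed_s: float) -> float:
    """Average WAL generation rate in megabits/s between two LSN samples."""
    delta_bytes = lsn_to_bytes(lsn_end) - lsn_to_bytes(lsn_start)
    return delta_bytes * 8 / elapsed_s / 1e6


# Hypothetical samples taken 60 seconds apart (1 GiB of WAL in between).
rate = wal_rate_mbit_s("2A/10000000", "2A/50000000", 60)
print(f"{rate:.1f} Mbit/s of WAL")
```

Comparing that figure against the 938 Mbits/sec link shows how much of
the network the WAL stream could be taking.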

> Your earlier suggestion was to do the pg_basebackup locally and rsync it
> over. Maybe that would be faster. At this point, it is saying it is 6%
> through, over 24 hours after being started.

For building out a replica, I'd tend to use my backups anyway instead of
using pg_basebackup. Provided you have good backups and reasonable WAL
retention, restoring a backup and then letting it replay WAL from the
archive until it can catch up with the primary works very well. If you
have a very high rate of WAL then you might consider taking a full
backup and then taking an incremental backup (which is much faster, and
reduces the amount of WAL required to only what is generated between the
start of the incremental backup and the point at which the replica has
caught up with the primary).

There are a few different backup tools out there which can do parallel
backup and in-transit compression, which loads up the primary's CPUs
with processes doing compression but should reduce the overall time if
the bottleneck is the network.

Thanks!

Stephen
