Re: pgbackrest: backup from standby

From: Craig James <cjames(at)emolecules(dot)com>
To: David Steele <david(at)pgmasters(dot)net>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: pgbackrest: backup from standby
Date: 2018-08-21 01:42:22
Message-ID: CAFwQ8rf+9hO-NEyRfRLA1zVmvW0aHf5xp74J3NR76iV4tsQdmA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Hi David,

On Mon, Aug 20, 2018 at 12:36 PM, David Steele <david(at)pgmasters(dot)net> wrote:

> On 8/20/18 12:12 PM, Craig James wrote:
> > The pgbackrest documentation has a section on backing up from a standby,
> > but I'm having trouble figuring it out. I think it's just a terminology
> > problem: I can't figure out which configuration directives go on which
> > server.
> >
> > To begin with, I already have streaming replication running:
> >
> > Primary server (live database): "radon"
> >
> > [emolecules]
> > db-path=/var/lib/postgresql/9.6/main
> >
> > [global]
> > repo-path=/pg_archive/pgbackrest
> > retention-full=10
> > backup-user=postgres
> > log-level-file=detail
> >
> >
> > Hot-standby server: "standby"
> >
> > [emolecules]
> > db-path=/var/lib/postgresql/9.6/main
> > recovery-option=standby_mode=on
> >
> > [global]
> > backup-host=radon.emolecules.com <http://radon.emolecules.com>
> > backup-user=postgres
>
> ...
> It's possible to backup/archive to multiple repos but the configuration
> is complex. I recommend that you migrate your pgBackRest repo to S3
> first. That goes like:
>
> 1) Sync the pgbackrest repo to S3 (using s3fs)
> 2) Stop archiving by setting archive_command=false
> 3) Sync the pgbackrest repo again to catch any files that were missed
> the first time.
> 4) Reconfigure for S3 and enable archiving.
>

Just to make it clear what's going on. Black is what we have now. Red is
what I was hoping to add. Blue is added when we switch to AWS and shut off
the current servers.

primary ----+-----> NFS archive |
| | | local (non-AWS) servers
| +-----> Standby #1 |
|
V
Standby #2 +-----> S3 Archive |
| | Amazon AWS & S3
+-----> New Standby |

The goal is getting from here to there, while keeping the web site going
and orders flowing, while never having less than one hot-standby server and
one full archive at all times.

As you can see from my primitive diagram, the primary server is also
backing up to a local hot-standby server and to an NFS archive.

One concern: If you recall from a previous pgbackrest bug we encountered,
we have hundreds of thousands of objects in the database, many of which are
very small. So that's hundreds of thousands of files that must be copied to
S3. My understanding is that there is significant round-trip and
file-creation latency for every S3 file. Presumably that latency will be
minimized between AWS-to-S3 activity, but could be bad from
some-other-site-to-S3 (i.e. from our current primary server).

That's one of the reasons I was hoping to backup from the standby server
(AWS) to S3 rather than from the primary server to S3.

>
> > There are sections in the user guide on /Backup from a Standby
> > <https://pgbackrest.org/user-guide.html#standby-backup>/ and on /S3
> > Support <https://pgbackrest.org/user-guide.html#s3-support>/, but I
> > can't sort out how to combine them. Apparently I need to define
> > pg1-host/pg1-path, and pg2-host/pg2-path, then a stanza for S3. But does
> > all this go in the pgbackrest.conf of the primary ("radon" in my case),
> > or on the standby, or somehow split across both?
>
> You have an asymmetrical configuration here since there is not backup
> server and shown in the pgBackRest docs. That makes configuration quite
> a bit harder and make failover harder.
>

It's AWS, so it's easy to spin up a small server to just be a repo manager.
But I worry that it couldn't handle the existing standby server since all
the traffic would have to go local-->AWS-->local to get from the primary
over to the current standby server.

I'm open to any and all suggestions.

One possibility: Does pgbackrest support chained standby servers?

primary ----> standby #1 ----> standby #2

If so, we could do this:

1. Set up standby #1 and #2 on AWS, but no S3 yet
2. Prepare config for S3, but don't deploy
3. Failover to standby #1, making it the primary
4. Start the S3 backup

This would give us a short time with no archive/PITR capability, but we
could do it during a maintenance window when no orders are expected, and
(hopefully) the S3 archive would finish in less than 24 hours.

Thanks very much for your advice!
Craig

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message pavan95 2018-08-21 05:46:04 Re: How to enable Logon Trigger in postgres?
Previous Message Jerry Sievers 2018-08-20 20:57:41 Re: temp_file_limit parameter ignored?