Re: pg_basebackup + incremental base backups

From: Christopher Pereira <kripper(at)imatronix(dot)cl>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: pg_basebackup + incremental base backups
Date: 2020-05-24 18:36:01
Message-ID: 03f63a68-14be-978a-ccc8-177a84f115a5@imatronix.cl
Lists: pgsql-general


> We've contemplated adding support for something like this to pgbackrest,
> since all the pieces are there, but there hasn't been a lot of demand
> for it and it kind of goes against the idea of having a proper backup
> solution, really.. It'd also create quite a bit of load on the primary
> to checksum all the files to do the comparison against what's on the
> replica that you're trying to update, so not something you'd probably
> want to do a lot more than necessary.

Ok, we want to use pgbackrest to *rebuild a standby that has fallen
behind* (where pg_rewind won't work). After reading the docs, we believe
we should use this setup:

a) Primary host: primary cluster

b) Repository host: needed for rebuilding the standby (with PITR as a
bonus).

c) Standby host: standby cluster
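
To make the layout concrete, here is a minimal sketch of how these three
hosts might map onto pgbackrest configuration. The stanza name (main), the
host names (pg-primary, pg-standby, repository), the data directory path and
the replication user are all placeholders assumed for illustration, not
values from our installation, and the option set is deliberately incomplete.

```ini
# --- repository host: /etc/pgbackrest/pgbackrest.conf (assumed location) ---
[global]
# Where backups and archived WAL are stored on the repository host
repo1-path=/var/lib/pgbackrest
repo1-retention-full=2

[main]
# The repository host pulls backups from the primary
pg1-host=pg-primary
pg1-path=/var/lib/postgresql/12/main

# --- primary host: /etc/pgbackrest/pgbackrest.conf ---
[global]
# The primary pushes WAL to, and is backed up by, the repository host
repo1-host=repository

[main]
pg1-path=/var/lib/postgresql/12/main

# --- standby host: /etc/pgbackrest/pgbackrest.conf ---
[global]
repo1-host=repository

[main]
pg1-path=/var/lib/postgresql/12/main
# Written into the recovery settings on restore so the standby streams
# from the primary once it has caught up from the repository
recovery-option=primary_conninfo=host=pg-primary user=replicator
```

With something like this in place, the standby would presumably be rebuilt
with a standby-type delta restore (pgbackrest --stanza=main --delta
--type=standby restore), which is the step the questions below are about.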

Some questions:

1) The standby will use streaming replication and will stay in sync until
someday something funny happens and both standby and repository get out
of sync with the primary.
Now, to rebuild the standby, we will first have to create a new backup,
transferring the data from *primary -> repository*, right?
Wouldn't this also have a load impact on the primary cluster?

2) Section 17.3 of the user guide explains how to create a
"pg-standby host" that replicates the data *from the repository host*.
Section 17.4 explains how to set up Streaming Replication to
replicate the data *from the primary host*.
Do 17.3 and 17.4 work together, so that the data is first *replicated from
the repository* and then *streamed from the primary*?

3) Before being able to rebuild the standby cluster, would we first need
to update the backup on the repository (backup from primary ->
repository) in order for streaming replication to work (from primary ->
standby)?

4) Once the backup on the repository is ready, what are the chances that
streaming replication from primary to standby won't work because they
got out of sync again?

5) Could we just work with 2 hosts (primary and standby) instead of 3?
FAQ section 8 says the repository shouldn't be on the same host as the
standby and having it on the primary doesn't make much sense because if
the primary host is down we won't have access to the backup.

Ideally, we would keep the repository on the standby host and take
good care of the configuration. What exactly do we need to watch out for
to make this setup safe?

I'm afraid I don't understand the pgbackrest design very well, or how
to use it efficiently to rebuild a standby cluster that got out of sync.
