Re: pg_basebackup + delta base backups

From: Christopher Pereira <kripper(at)imatronix(dot)cl>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: pg_basebackup + delta base backups
Date: 2020-05-27 03:00:52
Message-ID: fff1774f-1d15-2425-c576-61760538743d@imatronix.cl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 26-May-20 10:20, Stephen Frost wrote:
> [...]
>
> "out of sync" is a bit of an odd concept, but having a replica fall
> behind a long way is certainly something that can happen and may require
> a rebuild from a backup (or from a new sync off of the primary in some
> other way, as you suggest below). In a situation where there's async
> replication happening and you promote a replica to take over, that's
> definitely a case where you might also have to rebuild the former
> primary.

Hi Stepehen,

Yes, a common case with async streaming is when primary (A) goes down
and replica is promoted as a new master (B).
Then A comes back and has some data that was not streamed to B so
pg_rewind is useless.

I wonder if there is some option to just discard this branched data from
A in order to start as a new replica.

I noticed that pg_rewind is useless even when both DBs are identical
(according to pg_dumpall | md5sum).

> [...]
>> As you said, all the pieces are there and it would be quite easy to write a
>> new "pg_basebackup_delta" script that could be executed on the standby host
>> to:
>>
>> 1) setup a pgBackRest repo on the primary host (via SSH)
>>
>> 2) create a backup on the primary host (via SSH)
>>
>> 3) do a delta restore on the standby
>>
>> Even when the repository on the primary host is only created temporarily
>> (and require double storage, resources, etc), it may still be worth
>> considering the traffic that can be saved by doing a delta restore on a
>> standby host in a different region, right?
> So... There's actually a way to do this with pgbackrest, but it doesn't
> support the delta capability.

If I understood correctly the method you described, you were basically
doing a "backup" between A (primary) and B (repo) and in such a way the
repo is then compatible with the pg_data structure, but without delta
support (ie. transfering the whole database)?

Delta support is critical for VLDBs, so I see two alternatives to
replace pg_basebackup with pgbackrest to rebuild a replica:

1) Create a temporary repo on the primary

2) Create a temporary repo on the replica

All configurations would be undone after the replica has been rebuilt
and both alternatives would be using delta over the wire.
In your opinion, which alternative is better considering network traffic?

Thanks,

Christopher

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Olivier Gautherot 2020-05-27 07:12:11 Re: Install PostgreSQL on windows 10 home 64-bit machine
Previous Message Tim Cross 2020-05-27 00:14:46 Re: GPG signing