From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: trying again to get incremental backup |
Date: | 2023-10-04 19:33:27 |
Message-ID: | CA+TgmoZxPpWU11+yQCHr8obHK5nAyb96KpN25bk4PdX51dUFFQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Sep 28, 2023 at 6:22 AM Jakub Wartak
<jakub(dot)wartak(at)enterprisedb(dot)com> wrote:
> all those basic tests had GOOD results. Please find attached. I'll try
> to schedule some more realistic (in terms of workload and sizes) test
> in a couple of days + maybe have some fun with cross-backup-and
> restores across standbys.
That's awesome! Thanks for testing! This can definitely benefit from
any amount of beating on it that people wish to do. It's a complex,
delicate area that risks data loss.
> If that is still an area open for discussion: wouldn't it be better to
> just specify LSN as it would allow resyncing standby across major lag
> where the WAL to replay would be enormous? Given that we had
> primary->standby where standby would be stuck on some LSN, right now
> it would be:
> 1) calculate backup manifest of desynced 10TB standby (how? using
> which tool?) - even if possible, that means reading 10TB of data
> instead of just putting a number, isn't it?
> 2) backup primary with such incremental backup >= LSN
> 3) copy the incremental backup to standby
> 4) apply it to the impaired standby
> 5) restart the WAL replay
Hmm. I wonder if this would even be a safe procedure. I admit that I
can't quite see a problem with it, but sometimes I'm kind of dumb.
> Also maybe it's too early to ask, but wouldn't it be nice if we could
> have an future option in pg_combinebackup to avoid double writes when
> used from restore hosts (right now we need to first to reconstruct the
> original datadir from full and incremental backups on host hosting
> backups and then TRANSFER it again and on target host?). So something
> like that could work well from restorehost: pg_combinebackup
> /tmp/backup1 /tmp/incbackup2 /tmp/incbackup3 -O tar -o - | ssh
> dbserver 'tar xvf -C /path/to/restored/cluster - ' . The bad thing is
> that such a pipe prevents parallelism from day 1 and I'm afraid I do
> not have a better easy idea on how to have both at the same time in
> the long term.
I don't think it's too early to ask for this, but I do think it's too
early for you to get it. ;-)
--
Robert Haas
EDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Nathan Bossart | 2023-10-04 19:50:59 | Re: --sync-method isn't documented to take an argument |
Previous Message | a.rybakina | 2023-10-04 19:19:59 | Re: POC, WIP: OR-clause support for indexes |