| From: | Stephen Frost <sfrost(at)snowman(dot)net> | 
|---|---|
| To: | Paul Förster <paul(dot)foerster(at)gmail(dot)com> | 
| Cc: | James Sewell <james(dot)sewell(at)jirotech(dot)com>, "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org> | 
| Subject: | Re: Safe switchover | 
| Date: | 2020-07-13 16:00:53 | 
| Message-ID: | 20200713160052.GV12375@tamriel.snowman.net | 
| Lists: | pgsql-general | 
Greetings,
* Paul Förster (paul(dot)foerster(at)gmail(dot)com) wrote:
> > On 13. Jul, 2020, at 17:47, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > 
> > Sure, Patroni will handle the failover fine- but that's not what I was
> > referring to.  If the server crashes and you have no idea why or what
> > happened, I would strongly recommend against using pg_rewind to rebuild
> > it to be a replica as there's no validation happening- you might
> > failover to it much later and, if you're lucky, discover quickly that
> > some blocks had gotten corrupted or if you're unlucky not discover until
> > much later that something was corrupted when the crash happened.  Using
> > initdb -k is good, but PG is only going to check the block when it goes
> > to read it, which might not be until much later especially on a system
> > that's been rebuilt as a replica.
> 
> I see your point, yet, I'm not sure how pgbackrest could protect us from such a situation.
A pgbackrest delta restore will scan the entire data directory and
verify every file matches the last backup, or it'll replace the file
with what was in the backup that's being used.  If there's an error
during any of that, the restore will fail.
That re-validation of the entire data directory is a pretty huge
difference compared to how pg_rewind works.
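
To make the idea concrete, here is a minimal Python sketch of that verify-or-replace loop. It is only an illustration of the behaviour described above, not pgbackrest's actual code: `delta_restore`, `sha1_of`, the `manifest` mapping and the flat `repo` directory layout are all hypothetical names invented for the example, and SHA-1 is just an assumed choice of hash.

```python
import hashlib
import shutil
from pathlib import Path


def sha1_of(path: Path) -> str:
    """Return the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def delta_restore(manifest: dict[str, str], repo: Path, pgdata: Path) -> list[str]:
    """Make every file listed in the manifest match the backup.

    manifest maps a relative path to the checksum recorded at backup time
    (a hypothetical structure for this sketch). Returns the relative paths
    that had to be rewritten from the repo.
    """
    rewritten = []
    for rel, expected in sorted(manifest.items()):
        target = pgdata / rel
        # Every file is read and hashed, so silent corruption is noticed now
        # rather than whenever PostgreSQL next happens to read the block.
        if target.is_file() and sha1_of(target) == expected:
            continue  # already identical to the backup, leave it alone
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(repo / rel, target)  # pulled from the repo, not the primary
        if sha1_of(target) != expected:
            # Any error fails the whole restore instead of being ignored.
            raise RuntimeError(f"checksum mismatch after restoring {rel}")
        rewritten.append(rel)
    return rewritten
```

The point of the sketch is that unchanged files cost only a read and a hash, while anything that differs from the backup is replaced, and a failure anywhere aborts the run.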
> > This seems like an independent question and I'm not really sure what is
> > meant here by 'reinit it with Patroni'.
> 
> reinit basically deletes the replica database cluster and triggers a new full copy of the primary. You can either "patronictl reinit" or kill patroni, rm -r ${PGDATA}, and start patroni. This is basically the same.
Ah, yes, if you rebuild the replica from a backup (or from the primary),
then sure, that's pretty similar to the pgbackrest delta restore, except
that when using delta restore we're only rewriting files that have a
different SHA checksum after being scanned, and we're pulling from the
backup repo anything that's needed and not putting load on the primary.
> > I agree that it'd be good to have -k on by default.
> 
> so, now, we're two. :-) Anyone else? ;-)
There have been a few discussions on -hackers about this; that'd probably
be the place to discuss it further.
Thanks,
Stephen