Re: 'replication checkpoint has wrong magic' on the newly cloned replicas

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Alex Kliukin <oleksii(at)fastmail(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: 'replication checkpoint has wrong magic' on the newly cloned replicas
Date: 2017-11-29 19:23:49
Message-ID: CAOuzzgqeJfq=049Pikii7GjS6DFpo6j9Z6TYSo6V9eyMaOi-YA@mail.gmail.com
Lists: pgsql-admin

Greetings,

On Wed, Nov 29, 2017 at 14:12 Alex Kliukin <oleksii(at)fastmail(dot)com> wrote:

>
> On 29. Nov 2017, at 19:44, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>
> Greetings,
>
> On Wed, Nov 29, 2017 at 13:33 Alex Kliukin <oleksii(at)fastmail(dot)com> wrote:
>
>>
>> On 29. Nov 2017, at 18:52, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>>
>> Greetings,
>>
>> On Wed, Nov 29, 2017 at 12:41 Oleksii Kliukin <oleksii(at)fastmail(dot)com>
>> wrote:
>>
>>> Hi Stephen,
>>>
>>> > On 29. Nov 2017, at 15:54, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>>> >
>>> > Greetings,
>>> >
>>> > * Alex Kliukin (alexk(at)hintbits(dot)com) wrote:
>>> >> The cloning itself is done by copying a compressed image via ssh,
>>> >> running the
>>> >> following command from the replica:
>>> >>
>>> >> """ssh {master} 'cd {master_datadir} && tar -lcp --exclude "*.conf" \
>>> >> --exclude "recovery.done" \
>>> >> --exclude "pacemaker_instanz" \
>>> >> --exclude "dont_start" \
>>> >> --exclude "pg_log" \
>>> >> --exclude "pg_xlog" \
>>> >> --exclude "postmaster.pid" \
>>> >> --exclude "recovery.done" \
>>> >> * | pigz -1 -p 4' | pigz -d -p 4 | tar -xpmUv -C
>>> >> {slave_datadir}"""
>>> >>
>>> >> The WAL archiving starts before the copy starts, as the script that
>>> >> clones the
>>> >> replica checks that WAL archiving is running before cloning.
>>> >
>>> > Maybe you're doing it and haven't mentioned it, but you have to use
>>> > pg_start/stop_backup
>>>
>>> Sorry for not mentioning it, as it seemed obvious, but we are calling
>>> pg_start_backup and pg_stop_backup at the right time.
>>
>>
>> Ah, not something I can assume, heh.
>>
>> Then it depends on which version of PG and if you’re able to run
>> start/stop on the replica or not. If you can’t run it on the replica and
>> have to run it on the primary (prior to 9.6) then you need to make sure to
>> wait for things to happen on the primary and for that to be replicated
>> before you can start.
>>
>>
>> We are using exclusive backups from the master. First, the script checks
>> that WAL files are shipped to the NFS, where the replica expects to find
>> them (we check the md5 checksum of the file in order to make sure that the
>> NFS actually delivers the file that the master has archived). Then
>> pg_start_backup runs on the master and its status is checked. On success,
>> the copy command runs. When the copy command finishes, pg_stop_backup is
>> executed. Once pg_stop_backup finishes successfully, replica configuration
>> files (postgresql.conf, pg_hba.conf, pg_ident.conf) are linked from their
>> location in the repository and the replica is started.
>>
>
> No, you must wait until the replica has moved forward far enough and you
> have to copy the backup_label file from the primary as well, otherwise PG
> won’t realize you’re doing a backup-based recovery
>
>
>
> Are you talking about the exclusive base backup from the master (the
> master being the source for the backup)?
>

Hrmpf. I could have sworn there was a comment somewhere that you were
backing up from the replica, not the primary.

If you’re doing pg_start/stop_backup on the primary *and* copying the files
from the primary, then that’s much better.

> At least the backup label is written by pg_start_backup to the data
> directory and is being copied together with the data directory. The
> necessary WAL files are archived once pg_stop_backup returns, and the
> replica cannot move anywhere in recovery without being started.
>

Ok, yes, if you’re getting the backup_label and it’s included in the copy
of the data directory then that’s reasonable.
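
For illustration only, the ordering discussed above (start the backup on the primary, copy the data directory from the primary, stop the backup, then confirm that backup_label came along with the copy) might be sketched roughly as follows. The host name, paths, backup label, and both helper names are hypothetical, the tar command is deliberately simplified compared to the one quoted earlier in the thread, and pg_start_backup/pg_stop_backup are the pre-PostgreSQL-15 function names used in this discussion.

```python
import os


def exclusive_clone_plan(master, master_datadir, replica_datadir, label="clone"):
    """Return the ordered shell steps for an exclusive base backup.

    Everything runs against the primary: pg_start_backup, the file copy,
    and pg_stop_backup. This only builds the command strings; it does not
    execute anything.
    """
    copy = (
        f"ssh {master} 'cd {master_datadir} && tar -cp "
        f"--exclude pg_xlog --exclude postmaster.pid .' "
        f"| tar -xp -C {replica_datadir}"
    )
    return [
        f"psql -h {master} -At -c \"SELECT pg_start_backup('{label}')\"",
        copy,
        f"psql -h {master} -At -c \"SELECT pg_stop_backup()\"",
    ]


def has_backup_label(datadir):
    """True if the copied data directory contains a backup_label file."""
    return os.path.isfile(os.path.join(datadir, "backup_label"))
```

A cloning script could refuse to start the replica when has_backup_label() returns False for the copied directory, since without that file PostgreSQL will not treat the startup as a backup-based recovery.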

>> This is a fairly typical procedure, which, I believe, is also well
>> described in the docs.
>>
>
> Please provide a link to where that is because if that’s the case then we
> need to correct it or remove it. This is absolutely not safe without
> additional checks being done and various other magic happening (like
> copying the backup_label off the primary where it’s created).
>
>
>
> https://www.postgresql.org/docs/current/static/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP-EXCLUSIVE
>
> It has been there for years and I don’t think there is anything terribly
> wrong there.
>

I had somehow understood you to be copying the files off the replica and
not the primary, though I’m not entirely clear why now.

>> If you’re on 9.6 and using non-exclusive backup, you need to be sure to
>> capture the contents of the stop backup and write it into backup_label
>> before you start the system back up.
>>
>>
>> We don’t use non-exclusive backups at all.
>>
>
> All the more likely that your procedure is causing more corruption than
> you realize then.
>
>
> How do exclusive backups make the procedure more prone to corruption?
>

An exclusive backup can’t be run on a replica, which is what I was getting
at with the above comment.
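
For comparison, the non-exclusive case on 9.6 mentioned in the quoted text could look something like the sketch below. It assumes the caller has already run SELECT * FROM pg_stop_backup(false) in the same session that started the backup and passes in the labelfile and spcmapfile columns it returned; the helper name and the sample values are hypothetical.

```python
import os


def write_backup_label(datadir, labelfile, spcmapfile=""):
    """Write the text returned by pg_stop_backup(false) into a copied datadir.

    labelfile goes to backup_label; spcmapfile goes to tablespace_map, but
    only when it is not empty. Returns the path of the backup_label file.
    """
    if not labelfile.strip():
        raise ValueError("pg_stop_backup returned an empty label file")
    label_path = os.path.join(datadir, "backup_label")
    with open(label_path, "w") as f:
        f.write(labelfile)
    if spcmapfile.strip():
        with open(os.path.join(datadir, "tablespace_map"), "w") as f:
            f.write(spcmapfile)
    return label_path
```

This has to happen before the copied system is started; otherwise it will not know where recovery needs to begin.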

> Seriously, again, this is not easy to get right, especially when you’re
> doing things that weren’t explicitly documented and supported. Using
> existing tools from those versed in why the processes used are safe, and
> who have written lots of tests to verify that they are safe, is really the
> recommendation that you should take away from this.
>
>
> I believe what we are doing is rather simple and well documented by the
> link above.
>

If you’re copying it all from the primary and using start/stop backup, then
yes.

> At least with 9.6 there’s proper documentation on how to run a
> non-exclusive backup on a replica properly and if you very carefully follow
> the procedure then you may get it right, but you will still want to test
> extensively.
>
>
> We are not doing non-exclusive backups from the replica.
>

Apologies for the confusion. Offhand (and off my phone) probably best if I
don’t try to guess further at what the issue is that you’re running into.
:)

Thanks!

Stephen
