Re: 'replication checkpoint has wrong magic' on the newly cloned replicas

From: Andres Freund <andres(at)anarazel(dot)de>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Alex Kliukin <alexk(at)hintbits(dot)com>, pgsql-admin(at)postgresql(dot)org
Subject: Re: 'replication checkpoint has wrong magic' on the newly cloned replicas
Date: 2017-11-30 00:41:07
Message-ID: 20171130004107.j2ve7qn2ene7fqhm@alap3.anarazel.de
Lists: pgsql-admin

Hi,

On 2017-11-29 20:22:43 -0300, Alvaro Herrera wrote:
> Alex Kliukin wrote:
>
> > 2017-11-15 13:15:46.673 CET,,,15154,,5a0c2ff1.3b32,5,,2017-11-15
> > 13:15:45 CET,,0,PANIC,XX000,"replication checkpoint has wrong magic
> > 5714534 instead of 307747550",,,,,,,,,""
>
> Uhh ... I had never heard of this "replication checkpoint" thing.

It contains information about how far logical-replication-like solutions
have replayed from other systems.

> It is part of replication origins feature, which is fairly new stuff
> (see src/backend/replication/logical/origin.c). I'd bet this problem
> is related to a bug in logical replication "origins" code rather than
> any procedural problems in your base-backup taking setup ...

Possible.

What's the max_replication_slots setting (which also limits how many
replication origins can be tracked)? Is the system receiving logical
replication data? Could you describe the setup a bit? Is there any chance
the system has been running without fsync for part of the time? Could you
attach both a corrupt and a non-corrupt state file?
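
For a first look at the two files before attaching them, something like
the following Python sketch may help. It only reads the leading four
bytes and compares them with the expected magic from the PANIC message
(307747550, i.e. 0x1257DADE). It assumes the magic is the first field in
the file and is stored in the native byte order of the machine that wrote
it; the function names are made up for this example.

```python
import struct

# Expected magic as reported in the PANIC message: 307747550 == 0x1257DADE.
REPLICATION_STATE_MAGIC = 0x1257DADE


def read_magic(path):
    """Return the first four bytes of `path` as an unsigned 32-bit integer.

    Assumption: the magic is the first field of the file and was written in
    native byte order by a machine with the same endianness as this one.
    """
    with open(path, "rb") as f:
        raw = f.read(4)
    if len(raw) < 4:
        raise ValueError(f"{path}: only {len(raw)} bytes, file is truncated")
    (magic,) = struct.unpack("=I", raw)
    return magic


def has_valid_magic(path):
    """True if the file starts with the expected replication state magic."""
    return read_magic(path) == REPLICATION_STATE_MAGIC
```

Run from the data directory, `read_magic("pg_logical/replorigin_checkpoint")`
on the broken replica would be expected to return the 5714534 from the log.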

It's a bit weird to see such an error because normally the state file's
just written to a temporary file and then renamed into place,
overwriting the old file.
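
The write-then-rename pattern mentioned above can be sketched roughly as
follows. This is a generic Python illustration for POSIX systems, not the
PostgreSQL code; the function name and the temporary-file prefix are
invented for the example.

```python
import os
import tempfile


def write_file_atomically(path, data):
    """Replace `path` with `data` (bytes) via a temporary file and rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the rename stays
    # within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # push the contents to stable storage
        os.replace(tmp_path, path)   # atomically swap the new file into place
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Flush the directory entry so the rename itself survives a crash.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

With this pattern a reader that opens the target path should see either
the complete old contents or the complete new ones. Without the fsync
calls that guarantee no longer holds across a crash, hence the fsync
question above.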

> I wonder if there is some truncation of the 0x1257DADE value that
> produces the 5714534 value you're seeing -- something related to a
> pg_logical/replorigin_checkpoint file being written partially while the
> backup is being taken.
>
> Another point towards not including pg_logical/ contents when taking a
> base backup, I guess ...

You'd cause corruption if logical replication is in use, so no, please
don't.
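
As for the truncation guess quoted above, the two numbers from the PANIC
message can be compared in hex with a few lines of Python. This is just
arithmetic on the logged values; the helper name is made up, and "simple
truncation" here only means keeping the low or the high 8, 16 or 24 bits.

```python
EXPECTED = 307747550   # expected magic from the PANIC message
OBSERVED = 5714534     # value actually read, from the same message


def simple_truncations(value):
    """Return the low and high 8/16/24-bit pieces of a 32-bit value."""
    pieces = set()
    for nbits in (8, 16, 24):
        pieces.add(value & ((1 << nbits) - 1))   # keep the low nbits
        pieces.add(value >> (32 - nbits))        # keep the high nbits
    return pieces


print(f"expected: 0x{EXPECTED:08X}")   # expected: 0x1257DADE
print(f"observed: 0x{OBSERVED:08X}")   # observed: 0x00573266
print("simple truncation:", OBSERVED in simple_truncations(EXPECTED))
```

The observed value is not one of those pieces, so a plain bit truncation
of the magic does not explain it. That alone does not rule out a partially
written or partially copied file.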

- Andres
