Re: pg_rewind fails after failover, 'invalid record length'

From: Stuart Bishop <stuart(at)stuartbishop(dot)net>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: pg_rewind fails after failover, 'invalid record length'
Date: 2017-02-16 08:50:48
Message-ID: CADmi=6MXwic7Wf0oh2DGVZj5T-BdfENMCK2W9FnipqZgAtN7Ww@mail.gmail.com
Lists: pgsql-bugs

On 16 February 2017 at 09:58, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:
> On Wed, Feb 15, 2017 at 7:02 PM, Stuart Bishop <stuart(at)stuartbishop(dot)net> wrote:
>> I have a test case with 3 PostgreSQL 9.5.5 servers, one master and two
>> hot standbys using standard streaming replication from the master.
>> wal_log_hints is not enabled, but all systems initialized to use
>> checksums.
>
> The version of pg_rewind in Postgres 9.6 is able to handle timeline
> switches, which allows far more flexibility, not the one of 9.5. If
> the standby that has been promoted was the most advanced one, there is
> actually no need to run pg_rewind on the second standby.

Hmm. Ok.

This is for automation, and I was hoping to cover the race condition
where one or more of the standbys is still able to replicate from the
doomed master at the time of promotion (which standby is the most
advanced might change between the time I measure the timelines and
the time I restart the remaining standbys pointing to the new master).
I think this means I need a second round of restarts: restart all the
standbys with no primary_conninfo or restore_command, and only then
measure which node is the most advanced. Or is it enough to call
pg_xlog_replay_pause(), so that it doesn't matter if a standby receives
more WAL from the doomed master as long as it doesn't replay it?
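
To make the measurement step concrete, here is a minimal sketch of the
"which standby is the most advanced" comparison. It assumes each
standby's WAL location has already been read as text (for example with
pg_last_xlog_receive_location() or pg_last_xlog_replay_location() on
9.5). The standby names, the sample locations, and both helper functions
are hypothetical, and it does not settle which of the two locations is
the right one to compare.

```python
from typing import Dict


def parse_lsn(lsn: str) -> int:
    """Convert a textual WAL location such as '0/30000F8' to an integer.

    The text form is two hexadecimal numbers separated by a slash:
    the high 32 bits, then the low 32 bits.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def most_advanced(positions: Dict[str, str]) -> str:
    """Return the name of the standby with the highest WAL location."""
    return max(positions, key=lambda name: parse_lsn(positions[name]))


# Hypothetical sample values, as each standby might report them.
positions = {
    "standby1": "0/A000000",
    "standby2": "0/10000000",
}

print(most_advanced(positions))
```

The locations are converted to integers because comparing them as
strings gives the wrong answer: "0/A000000" sorts after "0/10000000" as
text, although it is the smaller location.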

(Or I could skip the promote step altogether and just restart one of
the standbys as master with no timeline switch. But if the doomed
master is still able to ship its WAL files, it could corrupt my backups
and be a worse problem.)

--
Stuart Bishop <stuart(at)stuartbishop(dot)net>
http://www.stuartbishop.net/
