Re: BUG #14171: Wrong FSM file after switching hot standby to master

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Timofei Dynikov <timofeid(at)outlook(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #14171: Wrong FSM file after switching hot standby to master
Date: 2016-06-03 14:01:45
Message-ID: CAB7nPqRGbous8bpLRskypXLpFJdte-Psjz_PbV51OQnSaiNBjQ@mail.gmail.com
Lists: pgsql-bugs

On Fri, Jun 3, 2016 at 7:09 PM, Timofei Dynikov <timofeid(at)outlook(dot)com> wrote:
>> Date: Thu, 2 Jun 2016 07:42:32 -0700 andres(at)anarazel(dot)de wrote:
>> If there was a restart involved, it seems unlikely that that'll be
>> relevant. Timofei, do I understand correctly that the problem persists
>> across restarts?
>
> Yes, problem persists across restarts. We can resolve problem only by
> performing VACUUM FULL or delete inconsistent FSM file.

Pacemaker removes recovery.conf and then restarts the node at
failover, so in this case the node goes through crash recovery on the
same timeline. I recall seeing cases in 9.4.4 where a relation file was
truncated when crash recovery began; that got fixed in 9.4.5.
The environment where this happened made it hard to compile PostgreSQL
to reproduce it, but I tentatively diagnosed it as a side effect of
be25a08, which e118555 fixed afterwards. At least I did not see
anything else between 9.4.4 and 9.4.5 that could have been the origin
of the problem. It showed up in the same way on a small table, one
that had no more than 5 tuples. Those were removed quite frequently,
so the table was empty most of the time; however, when crash recovery
began it had some records.

Could you update to at least 9.4.5 and see if the problem goes away?
We may well have another problem hidden here.
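
To confirm which minor release a node is actually running before and
after the update, here is a minimal sketch in Python. It is not part of
the original report: it only takes a version string such as the one
returned by SHOW server_version (fetching it from the server is left
out) and compares it with 9.4.5. The names MIN_FIXED, parse_version and
at_least are made up for illustration, and the comparison is only
meaningful for the 9.4 branch discussed here.

```python
import re

# Minimum 9.4 minor release suggested in this thread (assumption: the
# caller only cares about the 9.4 branch; other branches are not covered).
MIN_FIXED = (9, 4, 5)


def parse_version(text):
    """Return the leading dotted-numeric part of a version string as ints."""
    match = re.match(r"\s*(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError("no version number found in %r" % (text,))
    return tuple(int(part) for part in match.group(1).split("."))


def at_least(text, minimum=MIN_FIXED):
    """True if the version in `text` is the same as or newer than `minimum`."""
    return parse_version(text) >= minimum


if __name__ == "__main__":
    # Hypothetical sample strings, not output captured from a real server.
    for sample in ("9.4.4", "9.4.5"):
        print(sample, at_least(sample))
```

Comparing tuples of integers rather than raw strings avoids ordering
mistakes such as "9.4.10" sorting before "9.4.5".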
--
Michael
