Re: BUG #14171: Wrong FSM file after switching hot standby to master

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: timofeid(at)outlook(dot)com
Cc: PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: BUG #14171: Wrong FSM file after switching hot standby to master
Date: 2016-06-02 07:42:35
Message-ID: CAB7nPqSNDW7fdb5gc3004Y-kHHU0L+uLyyjyaNDkaUy95QF+pQ@mail.gmail.com
Lists: pgsql-bugs

On Wed, Jun 1, 2016 at 10:48 PM, <timofeid(at)outlook(dot)com> wrote:
> We have an installation of Postgres 9.4.4 (PostgreSQL 9.4.4 on
> x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat
> 4.4.7-11), 64-bit) on RHEL 6.6. The DB is installed on 2 nodes: 1 node is
> master, the other node is a hot standby (streaming replication). The DB is
> monitored by the pacemaker pgsql agent.

You surely want to update to 9.4.8 first. You are missing many bug fixes.

> Sometimes we have trouble with FSM files, in the following case:
> • the master instance is switched to another node (failover or switchover)
> under high load
> • The hot standby node restarts and runs as master successfully.
> • After that we sometimes get FSM files pointing to non-existent blocks in
> the table, so subsequent insert operations on such tables fail with an error
> message like the following: 'could not read block XX in file "base/YYYY/ZZZZZ"'.
> The issue can be resolved by either deleting the incorrect FSM file (while
> the database is stopped) or performing VACUUM FULL on the affected table. The
> problem is usually observed on relatively small tables (e.g. up to 30
> blocks) which are often cleaned out (having most rows deleted).
> Has anybody already faced such behavior? What can be the root cause of such
> problems? Are there any recommendations on how to avoid them?

Andres, do you think that c6ff84b0 can help here? Those symptoms look
rather similar to some missing invalidation messages on the standby.
--
Michael
