Re: PostgreSQL with BDR - PANIC: could not create replication identifier checkpoint

From: Martín Marqués <martin(at)2ndquadrant(dot)com>
To: Cameron Smith <csmith(at)stereodllc(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: PostgreSQL with BDR - PANIC: could not create replication identifier checkpoint
Date: 2016-05-20 00:38:06
Message-ID: 59fb58ad-2e7d-bffb-92d2-cc6d67978e74@2ndquadrant.com
Lists: pgsql-general

On 19/05/16 at 16:15, Cameron Smith wrote:
> I'd agree: most likely a file system problem. Is there any hope that this file could be re-built?
>
> My current plan is to use bdr_part_by_node_names to remove the failing node and then rebuild it from a fresh backup (and probably on a new server).

I think the most sensible plan is to remove the node from the BDR
group with bdr.bdr_part_by_node_names(), maybe clean up the
bdr.bdr_nodes table (some won't be happy with me suggesting this :)),
remove the data directory on the failed node and rejoin with
bdr_init_copy.
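
As a rough sketch of the first step, assuming BDR 1.x and a failed node
named 'node_b' (a placeholder, use your own node name), the part would
be run from a healthy node that stays in the group:

```sql
-- Run on a surviving node, not on the failed one.
-- 'node_b' is a hypothetical name for the node being removed.
SELECT bdr.bdr_part_by_node_names(ARRAY['node_b']);

-- Inspect what the remaining nodes now record for each member
-- before deciding whether any manual cleanup is needed.
SELECT node_name, node_status
FROM bdr.bdr_nodes;
```

Only once the part has been recorded on the other nodes would I wipe
the data directory on the failed machine and run bdr_init_copy there.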

I'd also follow Christoph's suggestions and check that you have a sane
file-system configuration.

Also check whether you ended up with a damaged disk (run some stress
tests on the hardware).

If this is in production (not a toy installation) I would suggest
replacing the disks altogether.

Regards,

--
Martín Marqués http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
