Re: [BDR] Node Join Question

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: "Wayne E(dot) Seguin" <wayneeseguin(at)gmail(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: [BDR] Node Join Question
Date: 2015-05-12 08:31:00
Message-ID: CAMsr+YEVFW5eEf1Gxb198UpBdiQSf-AASQac21mHUoA643nbNA@mail.gmail.com
Lists: pgsql-general

On 12 May 2015 at 14:36, Wayne E. Seguin <wayneeseguin(at)gmail(dot)com> wrote:

> Also,
>
> Is there a way to remove these things from the init target node easier?
>
> d= p=504 a=ERROR: 55000: previous init failed, manual cleanup is required
> d= p=504 a=DETAIL: Found bdr.bdr_nodes entry for bdr
> (6147869128174526660,1,16908,) with state=i in remote bdr.bdr_nodes
> d= p=504 a=HINT: Remove all replication identifiers and slots
> corresponding to this node from the init target node then drop and recreate
> this database and try again
>

Now that we have SQL-level join it'd probably make sense to provide a
cleanup function for failed node joins. At this point there's no such
function.

Take note of the node identity given in the error as it corresponds to the
replication identifier name and slot name.

On the join target node, drop the replication slot for the failed node:

    SELECT pg_drop_replication_slot(slot_name)
    FROM pg_replication_slots
    WHERE slot_name =
          bdr.bdr_format_slot_name('6147869128174526660',1,16908);
where the sysid, timeline ID and database OID are those given in the error.
You must run this from the target node's database, as it'll only consider
slots for the current database.
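
Before dropping anything, it may help to list the candidate slots and confirm which one belongs to the failed node. A minimal sketch, assuming the BDR slots on this node have names starting with "bdr" (an assumption about the naming scheme, not something stated above):

```sql
-- Run while connected to the target node's BDR database.
-- Lists the slots that belong to the current database so the one for the
-- failed node can be confirmed before it is dropped.
SELECT slot_name, plugin, database, active
FROM pg_replication_slots
WHERE database = current_database()
  AND slot_name LIKE 'bdr%';  -- assumed name prefix
```

A slot that still shows active = t is in use by a connection and cannot be dropped until that connection ends.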

Then look up the replication identifier for the failed node in
pg_catalog.pg_replication_identifier and drop it with

    SELECT pg_replication_identifier_drop(...)

There isn't an equivalent of bdr.bdr_format_slot_name for replication
identifiers; I'll look at adding one. In the meantime, look it up visually
or write a simple function to format the string.
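
One way to do the visual lookup is to filter the catalog on the sysid from the error message. This is only a sketch: it relies on the replication identifier name containing the sysid (which follows from the note above that the node identity corresponds to the identifier name), and it casts the whole row to text so that no catalog column names have to be assumed.

```sql
-- Run on the join target node.
-- The whole-row cast to text avoids depending on the catalog's column names,
-- which are not given above.
SELECT r.*
FROM pg_catalog.pg_replication_identifier AS r
WHERE r::text LIKE '%6147869128174526660%';  -- sysid taken from the error
```

If more than one row comes back, compare the timeline ID and database OID embedded in each name against the values in the error before dropping anything.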

Then delete the bdr.bdr_nodes entry for the failed-to-join node and any
bdr.bdr_connections entries for it.
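
The catalog cleanup might look like the sketch below. The column names (node_sysid, node_timeline, node_dboid and their conn_ counterparts) are assumptions about the BDR catalog layout and should be checked against the actual table definitions first; the literal values are the ones from the error above.

```sql
-- Run on the join target node, in the BDR database.
BEGIN;

-- Remove connection entries first in case they reference the node entry.
-- Column names are assumed; verify them before running.
DELETE FROM bdr.bdr_connections
WHERE conn_sysid = '6147869128174526660'
  AND conn_timeline = 1
  AND conn_dboid = 16908;

DELETE FROM bdr.bdr_nodes
WHERE node_sysid = '6147869128174526660'
  AND node_timeline = 1
  AND node_dboid = 16908;

-- Replace COMMIT with ROLLBACK for a dry run that only reports row counts.
COMMIT;
```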

You *must* drop and re-create the database on the failed-to-join node,
making a new blank db (preferably from template0).
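
A minimal sketch of that last step, using a placeholder database name ("mydb" is hypothetical; substitute the real one):

```sql
-- Run on the failed-to-join node while connected to a different database
-- (for example "postgres"). Neither statement can run inside a transaction
-- block, and DROP DATABASE fails while other sessions are connected to it.
DROP DATABASE mydb;                       -- "mydb" is a placeholder name
CREATE DATABASE mydb TEMPLATE template0;  -- start from the pristine template
```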
