bdr_init_copy fails when starting 2nd BDR node

From: John Casey <john(dot)casey(dot)rtp(at)icloud(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: bdr_init_copy fails when starting 2nd BDR node
Date: 2014-12-30 04:51:05
Message-ID: 007701d023ec$38ebdce0$aac396a0$@icloud.com
Lists: pgsql-general

I've been having issues while attempting to begin BDR replication. If I set
up the main node and then run bdr_init_copy, it always fails on the second
node, as shown below.

postgres$ rm -Rf $PGDATA

postgres$ echo db_password | pg_basebackup -X stream -h main_node_ip -p 5432 -U username -D $PGDATA

postgres$ cp $HOME/backup/postgresql.conf $PGDATA

postgres$ bdr_init_copy -U username -D $PGDATA

bdr_init_copy: starting...

Assigning new system identifier: 6098464173726284030...

Creating primary replication slots...

Creating restore point...

Could not connect to the remote server: could not connect to server: No such file or directory
	Is the server running locally and accepting
	connections on Unix domain socket "/tmp/.s.PGSQL.5432"?

If I start both servers directly with pg_ctl, using a configuration set up
for replication, I get the following error on the main node:

LOG: starting background worker process "bdr (6098483684958107256,1,16384,): dr: apply"

CONTEXT: slot "bdr_16384_6098483684958107256_1_16384__", output plugin "bdr", in the startup callback

ERROR: data stream ended

LOG: worker process: bdr (6098483684958107256,1,16384,): dr: apply (PID 6294) exited with exit code 1

... and I get the following error on the second node:

ERROR: bdr output plugin: slot creation rejected, bdr.bdr_nodes entry for local node (sysid=6098483778037269710, timelineid=1, dboid=16384): status='i', bdr still starting up: applying initial dump of remote node

HINT: Monitor pg_stat_activity and the logs, wait until the node has caught up

CONTEXT: slot "bdr_16384_6098483684958107256_1_16384__", output plugin "bdr", in the startup callback

LOG: could not receive data from client: Connection reset by peer

These errors keep repeating indefinitely.
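
As an aside for reading the log lines above: the slot name in the CONTEXT
lines appears to pack several identifiers into one underscore-separated
string, and its numbers match the (sysid, timeline, dboid) tuple in the
worker name. A minimal sketch for splitting such a name apart follows. The
function name parse_bdr_slot and the field labels are hypothetical; in
particular, reading the first field as the local database OID is an
assumption based on these logs, not on BDR documentation.

```shell
#!/usr/bin/env bash
# Split a slot name of the shape seen in the logs above:
#   bdr_<dboid>_<sysid>_<timeline>_<dboid>__<suffix>
# The field labels printed below are assumptions inferred from the log lines.
parse_bdr_slot() {
    local slot=$1
    local rest=${slot#bdr_}               # drop the leading "bdr_"
    [ "$rest" != "$slot" ] || return 1    # name did not start with "bdr_"
    case $rest in
        *__*) ;;                          # needs the "__" separator
        *) return 1 ;;
    esac
    local head=${rest%%__*}               # part before the first "__"
    local suffix=${rest#*__}              # part after it (empty in the logs)
    local IFS=_
    set -- $head                          # split the head on underscores
    [ "$#" -eq 4 ] || return 1
    printf 'local_dboid=%s remote_sysid=%s remote_timeline=%s remote_dboid=%s suffix=%s\n' \
        "$1" "$2" "$3" "$4" "$suffix"
}

# Only run the demo when executed directly, not when sourced.
if [ "${BASH_SOURCE[0]}" = "$0" ]; then
    parse_bdr_slot "bdr_16384_6098483684958107256_1_16384__"
fi
```

The extracted sysid can then be compared against the sysid that each node
reports about itself in its own error messages, which helps tell which node
a given slot refers to.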

I have gotten this working off and on, but I keep running into this issue.
I am on CentOS 6.5. Each server can run psql against the databases on the
other node when not configured for replication, so it is not a connectivity
or firewall issue. I have installed BDR from the beta2 RPM and have also
built rc1 (bdr stable) from source.

Any ideas?
