Re: repmgr won't update witness after failover

From: Aviel Buskila <aviel33(at)gmail(dot)com>
To: Jony Cohen <jony(dot)cohenjo(at)gmail(dot)com>
Cc: PostgreSQL General <pgsql-general(at)postgresql(dot)org>, Martín Marqués <martin(at)2ndquadrant(dot)com>
Subject: Re: repmgr won't update witness after failover
Date: 2015-08-21 05:25:52
Message-ID: CAB3=tTGy2DCiceEC8=i9ZkYOTm_3OGx-Wf5HJtDKggoruHVj_g@mail.gmail.com
Lists: pgsql-general

Hey,
Thanks for the reply; it helped me very much.

Kind Regards,
Aviel Buskila.
On 17 Aug 2015 08:49, "Jony Cohen" <jony(dot)cohenjo(at)gmail(dot)com> wrote:

> Hi,
> The clone command only copies the data from node2 to node1. You also need to
> register the node with the `force` option to overwrite its old record (as if
> you were building a new replica node).
> See:
>
> https://github.com/2ndQuadrant/repmgr#converting-a-failed-master-to-a-standby
>
> Regards,
> - Jony
>
>
> On Sun, Aug 16, 2015 at 3:19 PM, Aviel Buskila <aviel33(at)gmail(dot)com> wrote:
>
>> Hey,
>> I think I know what the problem is.
>> After the first failover, when I clone the old master to be a standby with
>> the 'repmgr standby clone' command, nothing seems to update the
>> repl_nodes table with the new standby in my cluster, so on the next failover
>> repmgrd fails to find a standby to fail over to.
>>
>> I confirmed this by manually updating the repl_nodes
>> table after the clone so that the old master is now a standby database.
>>
>> Now my question is:
>> At what point, after I issue 'repmgr standby clone', is the repl_nodes
>> table supposed to be updated with the new standby server?
>>
>> Best regards,
>> Aviel Buskila
>>
>>
>>
>> 2015-08-16 12:11 GMT+03:00 Aviel Buskila <aviel33(at)gmail(dot)com>:
>>
>>> hey,
>>>
>>> I have set up the configuration all over again. The contents of
>>> 'repl_nodes' before the failover are:
>>>
>>>  id | type    | upstream_node_id | cluster      | name  | conninfo                                       | priority | active
>>> ----+---------+------------------+--------------+-------+------------------------------------------------+----------+--------
>>>   1 | master  |                  | cluster_name | node1 | host=node1 dbname=repmgr port=5432 user=repmgr |      100 | t
>>>   2 | standby |                1 | cluster_name | node2 | host=node2 dbname=repmgr port=5432 user=repmgr |      100 | t
>>>   3 | witness |                  | cluster_name | node3 | host=node3 dbname=repmgr port=5499 user=repmgr |      100 | t
>>>
>>>
>>> repmgrd is started on node2 and node3 (standby and witness). Now, when I
>>> kill the postgres master process, I can see the following messages in the
>>> repmgrd log:
>>>
>>> [WARNING] connection to master has been lost, trying to recover... 60 seconds before failover decision
>>> [WARNING] connection to master has been lost, trying to recover... 50 seconds before failover decision
>>> [WARNING] connection to master has been lost, trying to recover... 40 seconds before failover decision
>>> [WARNING] connection to master has been lost, trying to recover... 30 seconds before failover decision
>>> [WARNING] connection to master has been lost, trying to recover... 20 seconds before failover decision
>>> [WARNING] connection to master has been lost, trying to recover... 10 seconds before failover decision
>>>
>>>
>>> and then, when it tried to elect node2 for promotion, it showed the
>>> following messages:
>>>
>>> [DEBUG] connecting to: 'host=node2 user=repmgr dbname=repmgr fallback_application_name='repmgr''
>>> [WARNING] unable to determine a valid master server; waiting 10 seconds to retry...
>>> [ERROR] unable to determine a valid master node, terminating...
>>> [INFO] repmgrd terminating..
>>>
>>>
>>>
>>> what am I doing wrong?
>>>
>>>
>>> On 14/08/15 at 04:14, Aviel Buskila wrote:
>>> > Hey,
>>> > Yes, I did, and it still won't fail back.
>>>
>>> Can you send over the output of "repmgr cluster show" before and after
>>> the failover process?
>>>
>>> Please also send the output of SELECT * FROM repmgr_schema.repl_nodes; after
>>> the failover (replace repmgr_schema with the schema name you have configured).
>>>
>>> Also, which version of repmgr are you running?
>>>
>>> > 2015-08-13 16:23 GMT+03:00 Jony Vesterman Cohen <jony(dot)cohenjo(at)gmail(dot)com>:
>>> >
>>> >> Hi, did you make the old master follow the new one using repmgr?
>>> >>
>>> >> It doesn't update itself automatically...
>>> >> From the looks of it repmgr thinks you have 2 masters - the old one
>>> >> offline and the new one online.
>>>
>>> Regards,
>>>
>>> --
>>> Martín Marqués http://www.2ndQuadrant.com/
>>> PostgreSQL Development, 24x7 Support, Training & Services
>>>
>>
>
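
To make the diagnosis in this thread concrete, here is a minimal Python sketch of why cloning without re-registering leaves no failover candidate. It is an illustration only, not repmgrd's actual election logic: the `Node` class and the helpers `failover_candidates` and `force_register_as_standby` are hypothetical names, the rows are a simplified version of the repl_nodes table quoted above, and the post-failover state (node1 still recorded as an inactive master) is an assumption based on Jony's remark that repmgr sees two masters.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    """One row of a simplified repl_nodes table (columns taken from the thread)."""
    id: int
    type: str                        # 'master', 'standby' or 'witness'
    upstream_node_id: Optional[int]  # None when the column is empty
    name: str
    active: bool


def failover_candidates(nodes: List[Node], master_id: int) -> List[Node]:
    """Return active standbys recorded as following the given master.

    Hypothetical simplification for illustration; repmgrd's real logic differs.
    """
    return [
        n for n in nodes
        if n.type == "standby" and n.active and n.upstream_node_id == master_id
    ]


def force_register_as_standby(nodes: List[Node], node_id: int, upstream_id: int) -> List[Node]:
    """Model the effect of re-registering a node: its old record is overwritten."""
    return [
        replace(n, type="standby", upstream_node_id=upstream_id, active=True)
        if n.id == node_id else n
        for n in nodes
    ]


# Assumed state after the first failover: node2 promoted, node1's record stale.
after_failover = [
    Node(1, "master", None, "node1", False),
    Node(2, "master", None, "node2", True),
    Node(3, "witness", None, "node3", True),
]

# Cloning alone copies data but leaves the table untouched: no candidate.
print(failover_candidates(after_failover, master_id=2))  # []

# Re-registering node1 overwrites its record, so a candidate now exists.
after_register = force_register_as_standby(after_failover, node_id=1, upstream_id=2)
print([n.name for n in failover_candidates(after_register, master_id=2)])  # ['node1']
```

On a real cluster the record is changed by registering the cloned node with the `force` option, as Jony describes, rather than by editing the table by hand.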
