Re: Recovery - New Slave PostgreSQL 9.2

From: "drum(dot)lucas(at)gmail(dot)com" <drum(dot)lucas(at)gmail(dot)com>
To: John Scalia <jayknowsunix(at)gmail(dot)com>
Cc: Shreeyansh Dba <shreeyansh2014(at)gmail(dot)com>, Ian Barwick <ian(at)2ndquadrant(dot)com>, "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Recovery - New Slave PostgreSQL 9.2
Date: 2016-01-09 21:48:23
Message-ID: CAE_gQfXdBVUXdJqP5r-r77LDOdk+c31vBi_qChphBYPFP5_c4Q@mail.gmail.com
Lists: pgsql-admin

Hi John,

> First, when you built the slave server, I'm assuming you used pg_basebackup
> and if you did, did you specify -X s in your command?

Yep. I ran the pg_basebackup into the new slave from ANOTHER SLAVE...
ssh postgres@slave1 'pg_basebackup --pgdata=- --format=tar
--label=bb_master --progress --host=localhost --port=5432
--username=replicator --xlog | pv --quiet --rate-limit 100M' | tar -x
--no-same-owner

*-X = --xlog*

On my new slave, I've got all the WAL archives. (The master copies the WAL
files there all the time...)
ls /var/lib/pgsql/9.2/wal_archive:
0000000200000C6A0000002D
0000000200000C6A0000002E

but not the ones the error complains about:
../wal_archive/0000000400000C68000000C8` not found
../wal_archive/00000005.history` not found
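
As a side note, the first 8 hex digits of a WAL segment file name are the
timeline ID, so the names above can be compared directly. A minimal sketch,
assuming plain POSIX sh; the function name wal_fields is made up for
illustration and is not a PostgreSQL tool:

```shell
#!/bin/sh
# Split a 24-hex-digit WAL segment file name into its three 8-digit fields:
# timeline ID, log file number, and segment number.
# (wal_fields is a hypothetical helper name, not part of PostgreSQL.)
wal_fields() {
    name=$1
    # Reject anything containing a non-hex character (e.g. *.history files).
    case $name in
        *[!0-9A-Fa-f]*) echo "not a WAL segment name: $name" >&2; return 1 ;;
    esac
    # A segment name is exactly 24 characters long.
    if [ "${#name}" -ne 24 ]; then
        echo "not a WAL segment name: $name" >&2
        return 1
    fi
    tli=$(printf '%s' "$name" | cut -c1-8)
    log=$(printf '%s' "$name" | cut -c9-16)
    seg=$(printf '%s' "$name" | cut -c17-24)
    # The 0x prefix makes printf read the timeline field as hexadecimal.
    printf 'timeline=%d log=%s seg=%s\n' "0x$tli" "$log" "$seg"
}

wal_fields 0000000200000C6A0000002D
wal_fields 0000000400000C68000000C8
```

Run against those two names, this shows that the segments in my archive are
on timeline 2, while the segment being requested is on timeline 4.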

Remember that I'm trying to set up cascading replication. (It was working with
another slave, but that server went down and I'm trying to set up a new one.)

> I would suggest, in spite of the 2TB size, rebuilding the standby
> servers with a proper pg_basebackup.

I've already run pg_basebackup more than once, and I always get the
same error... :(
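
One thing that can be checked after each rebuild is which timeline the new
data directory is actually on. pg_controldata prints a "Latest checkpoint's
TimeLineID" line; below is a rough sketch for pulling that value out, assuming
POSIX sh and awk. The helper name controldata_timeline is hypothetical, and
the data directory path in the comment is a guess that should be adjusted:

```shell
#!/bin/sh
# Print the timeline ID found in pg_controldata output read from stdin.
# Exits non-zero if no "Latest checkpoint's TimeLineID" line is present.
# (controldata_timeline is a hypothetical helper name.)
controldata_timeline() {
    awk -F: '/^Latest checkpoint.s TimeLineID/ {
        gsub(/[ \t]/, "", $2)   # strip the padding around the value
        print $2
        found = 1
    }
    END { exit !found }'
}

# Intended use on the standby (the data directory path is an assumption):
#   pg_controldata /var/lib/pgsql/9.2/data | controldata_timeline
```

Running it on both the upstream server and the new slave would show whether
the two really disagree about the timeline, as the FATAL message says.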

Is there anything else, guys? Please help, hehehe

Lucas Possamai

kinghost.co.nz
<http://forum.kinghost.co.nz/memberlist.php?mode=viewprofile&u=2&sid=e999f8370385657a65d41d5ff60b0b38>

On 10 January 2016 at 10:33, John Scalia <jayknowsunix(at)gmail(dot)com> wrote:

> Hi,
>
> I'm a little late to this thread, but in looking at the errors you
> originally posted, two things come to mind:
>
> First, when you built the slave server, I'm assuming you used
> pg_basebackup and if you did, did you specify -X s in your command?
>
> Second, the missing history file isn't an issue, in case you're unfamiliar
> with this. However, yeah, the missing WAL segment is, as well as the bad
> timeline error. Is that missing segment still on your primary? You know
> you could just copy it manually to your standby and start from that. As far
> as the timeline error, that's disturbing to me as it's claiming the primary
> is actually a failed over standby. AFAIK, that's the main if not only way
> transaction timelines increment.
>
> I would suggest, in spite of the 2TB size, rebuilding the standby
> servers with a proper pg_basebackup.
> --
> Jay
>
> Sent from my iPad
>
> On Jan 9, 2016, at 2:19 PM, "drum(dot)lucas(at)gmail(dot)com" <drum(dot)lucas(at)gmail(dot)com>
> wrote:
>
> Hi, thanks for your reply... I've been working on this problem for 20h =(
>
> *# cat postgresql.conf | grep synchronous_standby_names*
> #synchronous_standby_names = '' - It's commented out
>
> *# cat postgresql.conf | grep application_name*
> log_line_prefix = '%m|%p|%q[%c]@%r|%u|%a|%d '
> ( %a = application name )
>
> I can't resync the whole DB again, because it has 2TB of data :(
>
> Is there anything else I can do?
> Thank you
>
>
>
> Lucas Possamai
>
> kinghost.co.nz
> <http://forum.kinghost.co.nz/memberlist.php?mode=viewprofile&u=2&sid=e999f8370385657a65d41d5ff60b0b38>
>
> On 10 January 2016 at 04:22, Shreeyansh Dba <shreeyansh2014(at)gmail(dot)com>
> wrote:
>
>>
>>
>> On Sat, Jan 9, 2016 at 3:28 PM, drum(dot)lucas(at)gmail(dot)com <
>> drum(dot)lucas(at)gmail(dot)com> wrote:
>>
>>> My recovery.conf was already like that!
>>> I was already doing it that way.. I still have the problem =\
>>>
>>> Is there anything I can do?
>>>
>>>
>>>
>>> Lucas Possamai
>>>
>>> kinghost.co.nz
>>> <http://forum.kinghost.co.nz/memberlist.php?mode=viewprofile&u=2&sid=e999f8370385657a65d41d5ff60b0b38>
>>>
>>> On 9 January 2016 at 22:53, Shreeyansh Dba <shreeyansh2014(at)gmail(dot)com>
>>> wrote:
>>>
>>>>
>>>> Hi Lucas,
>>>>
>>>> Yes, now recovery.conf looks good.
>>>> Hope this solves your problem.
>>>>
>>>>
>>>> Thanks and regards,
>>>> ShreeyanshDBA Team
>>>> Shreeyansh Technologies
>>>> www.shreeyansh.com
>>>>
>>>>
>>>>
>>>> On Sat, Jan 9, 2016 at 3:07 PM, drum(dot)lucas(at)gmail(dot)com <
>>>> drum(dot)lucas(at)gmail(dot)com> wrote:
>>>>
>>>>> Hi there!
>>>>>
>>>>> Yep, it's correct:
>>>>> It looks like you have a setup A (Master) ---> B (Replica) ---> C
>>>>> (Replica, base backup from Replica B)
>>>>>
>>>>> Master (A): 192.168.100.1
>>>>> Slave1 (B): 192.168.100.2
>>>>> Slave2 (C): 192.168.100.3
>>>>>
>>>>> My recovery.conf in slave2(C) is:
>>>>>
>>>>> restore_command = 'exec nice -n 19 ionice -c 2 -n 7 ../../bin/restore_wal_segment.bash "../wal_archive/%f" "%p"'
>>>>> archive_cleanup_command = 'exec nice -n 19 ionice -c 2 -n 7 ../../bin/pg_archivecleaup_mv.bash -d "../wal_archive" "%r"'
>>>>> recovery_target_timeline = 'latest'
>>>>> standby_mode = on
>>>>> primary_conninfo = 'host=192.168.100.2 port=5432 user=replicator application_name=replication_slave02'
>>>>>
>>>>> So it seems right to me... Is that what you mean?
>>>>>
>>>>> Thanks
>>>>>
>>>>>
>>>>> Lucas Possamai
>>>>>
>>>>> kinghost.co.nz
>>>>> <http://forum.kinghost.co.nz/memberlist.php?mode=viewprofile&u=2&sid=e999f8370385657a65d41d5ff60b0b38>
>>>>>
>>>>> On 9 January 2016 at 22:25, Shreeyansh Dba <shreeyansh2014(at)gmail(dot)com>
>>>>> wrote:
>>>>>
>>>>>> On Sat, Jan 9, 2016 at 8:29 AM, drum(dot)lucas(at)gmail(dot)com <
>>>>>> drum(dot)lucas(at)gmail(dot)com> wrote:
>>>>>>
>>>>>>> ** NOTE: I ran the pg_basebackup from another STANDBY SERVER. Not
>>>>>>> from the MASTER*
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Lucas Possamai
>>>>>>>
>>>>>>> kinghost.co.nz
>>>>>>> <http://forum.kinghost.co.nz/memberlist.php?mode=viewprofile&u=2&sid=e999f8370385657a65d41d5ff60b0b38>
>>>>>>>
>>>>>>> On 9 January 2016 at 15:28, drum(dot)lucas(at)gmail(dot)com <
>>>>>>> drum(dot)lucas(at)gmail(dot)com> wrote:
>>>>>>>
>>>>>>>> Still trying to solve the problem...
>>>>>>>> Can anyone help, please?
>>>>>>>>
>>>>>>>> Lucas
>>>>>>>>
>>>>>>>>
>>>>>>>> Lucas Possamai
>>>>>>>>
>>>>>>>> kinghost.co.nz
>>>>>>>> <http://forum.kinghost.co.nz/memberlist.php?mode=viewprofile&u=2&sid=e999f8370385657a65d41d5ff60b0b38>
>>>>>>>>
>>>>>>>> On 9 January 2016 at 14:45, drum(dot)lucas(at)gmail(dot)com <
>>>>>>>> drum(dot)lucas(at)gmail(dot)com> wrote:
>>>>>>>>
>>>>>>>>> Sure... Here's the full information:
>>>>>>>>>
>>>>>>>>> http://superuser.com/questions/1023770/new-postgresql-slave-server-error-timeline
>>>>>>>>>
>>>>>>>>> recovery.conf:
>>>>>>>>>
>>>>>>>>> restore_command = 'exec nice -n 19 ionice -c 2 -n 7 ../../bin/restore_wal_segment.bash "../wal_archive/%f" "%p"'
>>>>>>>>> archive_cleanup_command = 'exec nice -n 19 ionice -c 2 -n 7 ../../bin/pg_archivecleaup_mv.bash -d "../wal_archive" "%r"'
>>>>>>>>> recovery_target_timeline = 'latest'
>>>>>>>>> standby_mode = on
>>>>>>>>> primary_conninfo = 'host=192.168.100.XX port=5432 user=replicator application_name=replication_new_slave'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Lucas Possamai
>>>>>>>>>
>>>>>>>>> kinghost.co.nz
>>>>>>>>> <http://forum.kinghost.co.nz/memberlist.php?mode=viewprofile&u=2&sid=e999f8370385657a65d41d5ff60b0b38>
>>>>>>>>>
>>>>>>>>> On 9 January 2016 at 14:37, Ian Barwick <ian(at)2ndquadrant(dot)com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> On 16/01/09 9:23, drum(dot)lucas(at)gmail(dot)com wrote:
>>>>>>>>>> > Hi all!
>>>>>>>>>> >
>>>>>>>>>> > I've done the pg_basebackup from the live server to a new slave server...
>>>>>>>>>> >
>>>>>>>>>> > I've recovered the WAL files, but now that I've configured it to replicate from the master (recovery.conf) I get this error:
>>>>>>>>>> >
>>>>>>>>>> > ../wal_archive/0000000400000C68000000C8` not found
>>>>>>>>>> > ../wal_archive/00000005.history` not found
>>>>>>>>>> >
>>>>>>>>>> > FATAL: timeline 2 of the primary does not match recovery target timeline 1
>>>>>>>>>>
>>>>>>>>>> Can you post the contents of your recovery.conf file, suitably
>>>>>>>>>> anonymised if necessary?
>>>>>>>>>>
>>>>>>>>>> Regards
>>>>>>>>>>
>>>>>>>>>> Ian Barwick
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>>> Hi Lucas,
>>>>>>
>>>>>> I followed your question and reproduced the same error:
>>>>>>
>>>>>> cp: cannot stat `/pgdata/arch/00000003.history': No such file or directory
>>>>>> 2016-01-09 14:11:42 IST FATAL: timeline 1 of the primary does not match recovery target timeline 2
>>>>>>
>>>>>> It looks like you have a setup A (Master) ---> B (Replica) ---> C
>>>>>> (Replica, base backup from Replica B)
>>>>>>
>>>>>> It seems you have reused the recovery.conf (written to replicate from
>>>>>> master to slave) for the new replica C, and there is a high probability
>>>>>> that the primary connection info in C's recovery.conf was not changed
>>>>>> (it should be Replica B's connection info).
>>>>>>
>>>>>> During testing, providing B's connection info in C's recovery.conf
>>>>>> resolved the issue.
>>>>>>
>>>>>> Please verify the primary_conninfo parameter in recovery.conf
>>>>>> (C replica); that might resolve your problem.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks and regards,
>>>>>> ShreeyanshDBA Team
>>>>>> Shreeyansh Technologies
>>>>>> www.shreeyansh.com
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>> Hi Lucas,
>>
>> It looks like the application_name parameter set in recovery.conf may
>> not match.
>> Please compare the synchronous_standby_names value set in the
>> postgresql.conf of Replica C with the value used as application_name
>> in recovery.conf.
>>
>> Also, check whether async replication works without
>> application_name in the recovery.conf of Replica C, and check the status in
>> the pg_stat_replication view.
>>
>>
>> Thanks and regards
>> ShreeyanshDBA Team
>> Shreeyansh Technologies
>> www.shreeyansh.com
>>
>
>
