Re: how to decrease the promotion time when performing a multiple failovers.....

From: Shay Cohavi <cohavisi(at)gmail(dot)com>
To: Vladimir Borodin <root(at)simply(dot)name>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: how to decrease the promotion time when performing a multiple failovers.....
Date: 2016-01-01 16:32:45
Message-ID: CAMFxmHGiTboqBGxGE-LB=HyLJzkD3BZWexPkqT8QR=vtT==aGg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin pgsql-general

Thanks for the replay.....
Can you please explain the long failover according to the log:

*for example:*

*[2015-12-12 20:35:10.769 IST] LOG: trigger file found:
/home/postgres/databases/fabrix/trigger*
*[2015-12-12 20:35:10.769 IST] FATAL: terminating walreceiver process due
to administrator command*
*scp: /home/postgres/archive/0000009400000002000000DC: No such file or
directory*
*[2015-12-12 20:35:10.893 IST] LOG: record with zero length at 2/DC000168*
*[2015-12-12 20:35:10.893 IST] LOG: redo done at 2/DC000100*
*scp: /home/postgres/archive/0000009400000002000000DC: No such file or
directory*
*scp: /home/postgres/archive/0000009300000002000000DC: No such file or
directory*
*scp: /home/postgres/archive/0000009200000002000000DC: No such file or
directory*
*.*
*.*
*.*

*scp: /home/postgres/archive/0000009100000002000000DC: No such file or
directory*
*scp: /home/postgres/archive/0000009000000002000000DC: No such file or
directory*
*scp: /home/postgres/archive/00000095.history: No such file or directory*
*[2015-12-12 20:35:11.801 IST] LOG: selected new timeline ID: 149*
*[2015-12-12 20:35:11.931 IST] LOG: restored log file "00000094.history"
from archive*
*[2015-12-12 20:35:12.173 IST] LOG: archive recovery complete*
*[2015-12-12 20:35:12.181 IST] LOG: database system is ready to accept
connections*
*[2015-12-12 20:35:12.181 IST] LOG: autovacuum launcher started*

*And how the recovery_target_timeline = 'latest' should minimize the
promotion time??*

Thanks,
ShayC

On Fri, Jan 1, 2016 at 12:55 PM, Vladimir Borodin <root(at)simply(dot)name> wrote:

>
> 1 янв. 2016 г., в 9:29, Shay Cohavi <cohavisi(at)gmail(dot)com> написал(а):
>
> *Hi,*
> *I have postgresql 9.3 setup with 2 nodes (active/standby with streaming
> replication & continuos archiving).*
> *I have created 2 failover & failback script in order to perform a
> switchover between the DB servers:*
> *1. failover - create a trigger file in order to promote the new primary.*
> *2. failback - perform a base backup as mentions in :*
> * a. start backup on the primary.*
> * b. stop the failed node .*
> didn't delete the DB directory on the failed node
> * c. performing rsync between the nodes.*
> * d.stopping the backup on the primary.*
> * e.performing rsync on the pg_xlog.*
> * f. creating a recovery.conf*
>
> * standby_mode = 'on'*
> * primary_conninfo = 'host=10.50.1.153 port=5432 user=usr
> password=pass'*
> * restore_command = 'scp 10.50.1.153:/home/postgres/archive/%f %p'*
> * trigger_file = '/home/postgres/databases/fabrix/trigger'*
> * archive_cleanup_command = 'ssh 10.50.1.153
> /home/postgres/pg_utils/archive_cleanup.sh %r'*
>
> * g. starting the failed node as secondary.*
>
> *the switchover method:*
> *1. stop the primary node.*
> *2. promote the secondary node (failover.sh).*
> *3. perform failback on the failed node.*
> *4. start the failed node.*
>
> *this method works great! *
>
>
> *but if I perform multiple switchovers (>20), each time the new primary
> gets promoted (trigger file) - it takes longer because it searches the
> timelines on the archive. *
> *is there any way to prevent the multiple 'scp' archive commands which
> makes the promotion longer! *
>
> *for example:*
>
> *[2015-12-12 20:35:10.769 IST] LOG: trigger file found:
> /home/postgres/databases/fabrix/trigger*
> *[2015-12-12 20:35:10.769 IST] FATAL: terminating walreceiver process due
> to administrator command*
> *scp: /home/postgres/archive/0000009400000002000000DC: No such file or
> directory*
> *[2015-12-12 20:35:10.893 IST] LOG: record with zero length at 2/DC000168*
> *[2015-12-12 20:35:10.893 IST] LOG: redo done at 2/DC000100*
> *scp: /home/postgres/archive/0000009400000002000000DC: No such file or
> directory*
> *scp: /home/postgres/archive/0000009300000002000000DC: No such file or
> directory*
> *scp: /home/postgres/archive/0000009200000002000000DC: No such file or
> directory*
> *.*
> *.*
> *.*
>
> *scp: /home/postgres/archive/0000009100000002000000DC: No such file or
> directory*
> *scp: /home/postgres/archive/0000009000000002000000DC: No such file or
> directory*
> *scp: /home/postgres/archive/00000095.history: No such file or directory*
> *[2015-12-12 20:35:11.801 IST] LOG: selected new timeline ID: 149*
> *[2015-12-12 20:35:11.931 IST] LOG: restored log file "00000094.history"
> from archive*
> *[2015-12-12 20:35:12.173 IST] LOG: archive recovery complete*
> *[2015-12-12 20:35:12.181 IST] LOG: database system is ready to accept
> connections*
> *[2015-12-12 20:35:12.181 IST] LOG: autovacuum launcher started*
>
> *this could take for a least 1 min.....or more.*
>
> *is there any way to skip the timeline searching in order to decrease the
> promotion?*
>
>
> You should add recovery_target_timeline = 'latest' to your recovery.conf
> [0].
>
> [0] http://www.postgresql.org/docs/9.3/static/warm-standby.html
> <...>
> If you plan to have multiple standby servers for high availability
> purposes, set recovery_target_timeline to latest, to make the standby
> server follow the timeline change that occurs at failover to another
> standby.
> <...>
>
>
>
> *Thanks,*
> *ShayC*
>
>
>
> --
> May the force be with you…
> https://simply.name
>
>

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Achilleas Mantzios 2016-01-04 07:28:12 Re: pg_upgrade 9.0 -> 9.3 general questions : things to watch out for
Previous Message Vladimir Borodin 2016-01-01 10:55:40 Re: how to decrease the promotion time when performing a multiple failovers.....

Browse pgsql-general by date

  From Date Subject
Next Message Dorian Hoxha 2016-01-01 17:05:43 Re: Happy New Year
Previous Message Melvin Davidson 2016-01-01 14:51:31 Re: Happy New Year