| From: | "Kumar, Devesh" <devesh(dot)kumar(at)cmegroup(dot)com> |
|---|---|
| To: | Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> |
| Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
| Subject: | Re: DETAIL: pg_rewind: servers diverged at WAL location 0/9000000 on timeline 1 |
| Date: | 2024-04-29 10:25:42 |
| Message-ID: | CACMEH=4UG9_VGefOiwizOqrmhrNaSipNwiAQKcvh-5if5BmQGg@mail.gmail.com |
| Lists: | pgsql-bugs |
Hello Laurenz,
Thanks for the response. The details are below:
Primary repmgr.conf Details
[image: image.png]
Secondary repmgr.conf Details
[image: image.png]
Failover steps:
We stopped the PostgreSQL service on the primary server, and repmgrd
automatically failed over to the standby, promoting it to the new primary.
The status after failover is shown below:
[image: image.png]
Failback steps:
1. We executed a checkpoint on the new primary (originally the standby).
2. We ran the following node rejoin command with --dry-run on the original
primary, to check whether it is eligible to rejoin:
repmgr node rejoin -f /opt/postgresql/15.6/bin/repmgr.conf -d
'host=10.29.97.241 port=5432 user=repmgr dbname=repmgr' --force-rewind
--config-files=postgresql.conf,postgresql.local.conf,pg_hba.conf -v
--dry-run
NOTICE: rejoin target is node "d-dba-pg-rnh9" (ID: 2)
INFO: replication connection to the rejoin target node was successful
INFO: local and rejoin target system identifiers match
DETAIL: system identifier is 7360952088605465701
NOTICE: pg_rewind execution required for this node to attach to rejoin
target node 2
DETAIL: rejoin target server's timeline 2 forked off current database
system timeline 1 before current recovery point 0/9000028
INFO: prerequisites for using pg_rewind are met
INFO: file "postgresql.conf" would be copied to
"/tmp/repmgr-config-archive-d-dba-pg-0ptt/postgresql.conf"
WARNING: specified file "/pgresdata101/data/postgresql.local.conf" not
found, skipping
INFO: file "pg_hba.conf" would be copied to
"/tmp/repmgr-config-archive-d-dba-pg-0ptt/pg_hba.conf"
INFO: pg_rewind would now be executed
DETAIL: pg_rewind command is:
/opt/postgresql/pg/bin/pg_rewind -D '/pgresdata101/data'
--source-server='host=10.29.97.241 port=5432 user=repmgr dbname=repmgr
connect_timeout=2'
INFO: prerequisites for executing NODE REJOIN are met
3. We executed the node rejoin command without --dry-run:
repmgr node rejoin -f /opt/postgresql/15.6/bin/repmgr.conf -d
'host=10.29.97.241 port=5432 user=repmgr dbname=repmgr' --force-rewind
--config-files=postgresql.conf,postgresql.local.conf,pg_hba.conf -v
NOTICE: using provided configuration file
"/opt/postgresql/15.6/bin/repmgr.conf"
DEBUG: server version number is: 150000
DEBUG: set_config():
SET synchronous_commit TO 'local'
DEBUG: get_primary_node_id():
SELECT node_id FROM repmgr.nodes WHERE type =
'primary' AND active IS TRUE
DEBUG: get_node_record():
SELECT n.node_id, n.type, n.upstream_node_id, n.node_name, n.conninfo,
n.repluser, n.slot_name, n.location, n.priority, n.active, n.config_file,
'' AS upstream_node_name, NULL AS attached FROM repmgr.nodes n WHERE
n.node_id = 2
NOTICE: rejoin target is node "d-dba-pg-rnh9" (ID: 2)
DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr
host=10.29.97.241 port=5432 fallback_application_name=repmgr
options=-csearch_path="
DEBUG: set_config():
SET synchronous_commit TO 'local'
DEBUG: get_recovery_type(): SELECT pg_catalog.pg_is_in_recovery()
DEBUG: get_node_record():
SELECT n.node_id, n.type, n.upstream_node_id, n.node_name, n.conninfo,
n.repluser, n.slot_name, n.location, n.priority, n.active, n.config_file,
'' AS upstream_node_name, NULL AS attached FROM repmgr.nodes n WHERE
n.node_id = 1
DEBUG: local timeline: 1; rejoin target timeline: 2
DEBUG: get_timeline_history():
TIMELINE_HISTORY 2
DEBUG: local tli: 1; local_xlogpos: 0/9000028; follow_target_history->tli:
1; follow_target_history->end: 0/9000000
NOTICE: pg_rewind execution required for this node to attach to rejoin
target node 2
DETAIL: rejoin target server's timeline 2 forked off current database
system timeline 1 before current recovery point 0/9000028
DEBUG: guc_set():
SELECT true FROM pg_catalog.pg_settings WHERE name = 'full_page_writes'
AND setting = 'off'
DEBUG: guc_set():
SELECT true FROM pg_catalog.pg_settings WHERE name = 'wal_log_hints' AND
setting = 'on'
INFO: prerequisites for using pg_rewind are met
DEBUG: using archive directory "/tmp/repmgr-config-archive-d-dba-pg-0ptt"
DEBUG: copying "postgresql.conf" to
"/tmp/repmgr-config-archive-d-dba-pg-0ptt/postgresql.conf"
WARNING: specified file "/pgresdata101/data/postgresql.local.conf" not
found, skipping
DEBUG: copying "pg_hba.conf" to
"/tmp/repmgr-config-archive-d-dba-pg-0ptt/pg_hba.conf"
INFO: 2 files copied to "/tmp/repmgr-config-archive-d-dba-pg-0ptt"
NOTICE: executing pg_rewind
DETAIL: pg_rewind command is "/opt/postgresql/pg/bin/pg_rewind -D
'/pgresdata101/data' --source-server='host=10.29.97.241 port=5432
user=repmgr dbname=repmgr connect_timeout=2'"
DEBUG: executing:
/opt/postgresql/pg/bin/pg_rewind -D '/pgresdata101/data'
--source-server='host=10.29.97.241 port=5432 user=repmgr dbname=repmgr
connect_timeout=2' 2>/tmp/repmgr_command.wgVGPS
DEBUG: result of command was 1 (256)
DEBUG: local_command(): output returned was:
pg_rewind: servers diverged at WAL location 0/9000000 on timeline 1
pg_rewind: error: could not open file
"/pgresdata101/data/pg_wal/000000010000000000000008": No such file or
directory
pg_rewind: error: could not find previous WAL record at 0/802B668
ERROR: pg_rewind execution failed
DETAIL: pg_rewind: servers diverged at WAL location 0/9000000 on timeline 1
pg_rewind: error: could not open file
"/pgresdata101/data/pg_wal/000000010000000000000008": No such file or
directory
pg_rewind: error: could not find previous WAL record at 0/802B668
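For reference, the segment file named in the error can be derived from the WAL location that pg_rewind reports. The following is a minimal sketch, assuming the default 16 MB wal_segment_size; the function names (parse_lsn, wal_segment_name, needs_rewind) are illustrative only and are not part of PostgreSQL or repmgr.

```python
# Assumption: default wal_segment_size of 16 MB (not confirmed for this cluster).
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_lsn(lsn: str) -> int:
    """Convert an LSN such as '0/802B668' into a 64-bit integer."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_segment_name(timeline: int, lsn: str,
                     segment_size: int = WAL_SEGMENT_SIZE) -> str:
    """Return the name of the WAL segment file that contains the given LSN."""
    segno = parse_lsn(lsn) // segment_size
    # Number of segments that fit in one 4 GB "xlogid" range.
    segments_per_xlogid = 0x100000000 // segment_size
    return "%08X%08X%08X" % (
        timeline,
        segno // segments_per_xlogid,
        segno % segments_per_xlogid,
    )


def needs_rewind(local_lsn: str, fork_lsn: str) -> bool:
    """True if the local node wrote WAL past the point where the timelines forked."""
    return parse_lsn(local_lsn) > parse_lsn(fork_lsn)


# Values taken from the log output above.
print(wal_segment_name(1, "0/802B668"))        # 000000010000000000000008
print(needs_rewind("0/9000028", "0/9000000"))  # True
```

With the values from the log, the location 0/802B668 on timeline 1 falls in segment 000000010000000000000008, which is the file pg_rewind could not open in pg_wal, while the divergence point 0/9000000 is in the following segment.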
___________________________
*DEVESH KUMAR*
Database Admin I – India
M: +91 6366843695
devesh(dot)kumar(at)cmegroup(dot)com <firstname(dot)lastname(at)cmegroup(dot)com>
Address: Tridib Building Block B 5th Floor
Bagmane Tech Park CV Raman Nagar,
Bengaluru, 560093, IN
www.cmegroup.com
On Mon, Apr 29, 2024 at 3:37 PM Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
wrote:
> On Sat, 2024-04-27 at 00:36 +0530, Kumar, Devesh wrote:
> > Currently we are working on setting up replication and testing failover
> > scenarios and failback. During our testing, failover is getting successful.
> > During Failback, when we are reverting the original primary instance as the
> > new standby, we are getting pg_rewind errors. Kindly can someone check and
> > let us know.
> >
> > pg_rewind: servers diverged at WAL location 0/9000000 on timeline 1
> > pg_rewind: error: could not open file "/pgresdata101/data/pg_wal/000000010000000000000008": No such file or directory
> > pg_rewind: error: could not find previous WAL record at 0/802B668
>
> You should show the exact commands used for failover and failback.
>
> Yours,
> Laurenz Albe
>