Re: BUG #14109: pg_rewind fails to update target control file in one scenario

From: John Lumby <johnlumby(at)hotmail(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: pgsql bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #14109: pg_rewind fails to update target control file in one scenario
Date: 2016-04-25 13:48:35
Message-ID: COL131-W5403F755BF3290F32B1E62A3620@phx.gbl
Lists: pgsql-bugs

Thanks Michael,

After the pg_rewind in the scenario I described,

1) on System B (new Primary) I see

Sat Apr 23 14:19:18 EDT 2016

control file indicates
last check point WAL id : 0000000C00000009000000A3

 client_addr |         backend_start         |  state  | sent_location | write_location | flush_location | replay_location
-------------+-------------------------------+---------+---------------+----------------+----------------+-----------------
 10.19.0.1   | 2016-04-23 18:19:50.812509+00 | startup | 9/A30000D0    | 9/A30000D0     | 9/A30000D0     | 9/A30000D0

2) whereas on System A, after pg_rewind, I see

Sat Apr 23 14:19:54 EDT 2016

control file indicates
last check point WAL id : 0000000B00000009000000A3

pg_last_xlog_receive_location() and pg_last_xlog_replay_location() indicate

 pg_last_xlog_receive_location | pg_last_xlog_replay_location
-------------------------------+------------------------------
 9/A3000000                    | 9/A30000D0
(1 row)

Note the difference in timeline (0000000C on B versus 0000000B on A),

and then, as I described, no WAL is replicated from B to A.
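
To make the timeline difference concrete, here is a minimal Python sketch that splits the two WAL file names above into their fields. It is not part of pg_rewind or PostgreSQL. It assumes the usual 24-hex-digit naming (8 digits each for timeline, log id, and segment number) and the default 16MB segment size, and the function names are invented for illustration.

```python
# Illustrative only: decode a WAL segment file name into its three fields.
# Assumes 24 hex digits: timeline (8), log id (8), segment number (8).

DEFAULT_SEGMENT_SIZE = 16 * 1024 * 1024  # assumed default of 16MB


def parse_wal_segment_name(name):
    """Return (timeline, log_id, segment_no) for a WAL segment file name."""
    if len(name) != 24:
        raise ValueError("expected 24 hex digits, got %r" % name)
    timeline = int(name[0:8], 16)
    log_id = int(name[8:16], 16)
    segment_no = int(name[16:24], 16)
    return timeline, log_id, segment_no


def segment_start_lsn(log_id, segment_no, segment_size=DEFAULT_SEGMENT_SIZE):
    """Format the LSN at which the segment starts, in the usual X/X notation."""
    offset = segment_no * segment_size
    return "%X/%X" % (log_id, offset)


system_b = parse_wal_segment_name("0000000C00000009000000A3")
system_a = parse_wal_segment_name("0000000B00000009000000A3")

print("System B timeline:", system_b[0])  # 12
print("System A timeline:", system_a[0])  # 11
print("Same log id and segment:", system_b[1:] == system_a[1:])  # True
print("Segment starts at:", segment_start_lsn(system_a[1], system_a[2]))  # 9/A3000000
```

So both control files name the same segment (starting at 9/A3000000), but System A is still on timeline 11 while System B is on timeline 12.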

Did you try this scenario yourself? I hope you agree it is a bug.
I will defer to you on which part of the code is the true cause,
but to me it looks very much as though pg_rewind ought to update the control file in this scenario.
Doing that certainly does fix it.
If not that, then what?

Cheers, John

----------------------------------------
> Date: Mon, 25 Apr 2016 16:23:58 +0900
> Subject: Re: [BUGS] BUG #14109: pg_rewind fails to update target control file in one scenario
> From: michael(dot)paquier(at)gmail(dot)com
> To: johnlumby(at)hotmail(dot)com
> CC: pgsql-bugs(at)postgresql(dot)org
>
> On Mon, Apr 25, 2016 at 4:25 AM, <johnlumby(at)hotmail(dot)com> wrote:
>> However, what I believe *is* needed is to update the target control file
>> with the new timeline and other information from the source.
>
> No, this is incorrect. There is no need to update the control file of
> a node that has not been rewound, and pg_rewind should not mess up
> with that if there is no divergence point between the target and the
> source nodes or it would update the minimum recovery point of a node
> without real need to do so. It should be able to join back the cluster
> depending on its initial shutdown state (when you shut down systemA).
> What are the logs of your system A telling you regarding its startup
> state?
> --
> Michael
