From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
Cc: | masao(dot)fujii(at)oss(dot)nttdata(dot)com, soumyadeep2007(at)gmail(dot)com, hlinnaka(at)iki(dot)fi, pgsql-hackers(at)postgresql(dot)org, jyih(at)vmware(dot)com, kyeap(at)vmware(dot)com |
Subject: | Re: PITR promote bug: Checkpointer writes to older timeline |
Date: | 2021-03-22 00:07:19 |
Message-ID: | YFfft/IZVoHK90Vy@paquier.xyz |
Lists: | pgsql-hackers |
On Thu, Mar 18, 2021 at 12:56:12PM +0900, Michael Paquier wrote:
> I was looking at uses of ThisTimeLineID in the wild, and could not
> find it getting checked or used actually in backend-side code that
> involved the WAL reader facility. Even if it brings confidence, it
> does not mean that it is not used somewhere :/
I have been working on that over the last couple of days, and applied
a fix down to 10. One thing that I did not like in the test was the
use of compare() to check if the contents of the WAL segment before
and after the timeline jump remained the same, as this would have been
unstable with any concurrent activity. Instead, I have added a phase
at the end of the test with an extra checkpoint and recovery triggered
once, which is enough to reproduce the PANIC reported at the top of
the thread.
I'll look into clarifying the use of ThisTimeLineID within those
WAL reader callbacks, because this is really bug-prone in the long
term... This requires some coordination with the recent work aimed at
adding some logical decoding support in standbys, though.
--
Michael