Re: pg_rewind WAL segments deletion pitfall

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>, bungina(at)gmail(dot)com
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>, cyberdemn(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Subject: Re: pg_rewind WAL segments deletion pitfall
Date: 2023-08-23 09:04:18
Message-ID: 41e7c31b85aad03a4a2cd14daad31acd@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

On 2023-08-22 14:32, Michael Paquier wrote:
Thanks for your review!

> On Fri, Aug 18, 2023 at 03:40:57PM +0900, torikoshia wrote:
>> Thanks for the patch, I've marked this as ready-for-committer.
>>
>> BTW, this issue can be considered a bug, right?
>> I think it would be appropriate to provide backpatch.
>
> Hmm, I agree that there is a good argument in back-patching as we have
> the WAL files between the redo LSN and the divergence LSN, but
> pg_rewind is not smart enough to keep them around. If the archives of
> the primary were not able to catch up, the old primary is as good as
> kaput, and restore_command won't help here.

True.
I also imagine that in the typical failover scenario where the target
cluster was shut down soon after the divergence and pg_rewind was
executed without much time, we can avoid this kind of 'requested WAL
segment has already removed' error by preventing pg_rewind from deleting
necessary WALs.

> I don't like much this patch. While it takes correctly advantage of
> the backward record read logic from SimpleXLogPageRead() able to
> handle correctly timeline jumps, it creates a hidden dependency in the
> code between the hash table from filemap.c and the page callback.
> Wouldn't it be simpler to build a list of the segment names using the
> information from WALOpenSegment and build this list in
> findLastCheckpoint()? Also, I am wondering if we should be smarter
> with any potential conflict handling between the source and the
> target, rather than just enforcing a FILE_ACTION_NONE for all these
> files. In short, could it be better to copy the WAL file from the
> source if it exists there?
>
> + /*
> + * Some entries (WAL segments) already have an action assigned
> + * (see SimpleXLogPageRead()).
> + */
> + if (entry->action == FILE_ACTION_UNDECIDED)
> + entry->action = decide_file_action(entry);
>
> This change makes me a bit uneasy, per se my previous comment with the
> additional code dependencies.
>
> I think that this scenario deserves a test case. If one wants to
> emulate a delay in WAL archiving, it is possible to set
> archive_command to a command that we know will fail, for instance.
> --
> Michael

Bungina, are you going to respond to these comments?

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message kaijian xu 2023-08-23 11:40:32 Re: BUG #18065: An error occurred when attempting to add a column of type "vector" to a table named "vector".
Previous Message Amit Kapila 2023-08-23 08:46:32 Re: BUG #18055: logical decoding core on AllocateSnapshotBuilder()

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2023-08-23 09:30:52 Re: EBCDIC sorting as a use case for ICU rules
Previous Message vignesh C 2023-08-23 08:57:11 Re: persist logical slots to disk during shutdown checkpoint