From: | Alexander Kukushkin <cyberdemn(at)gmail(dot)com> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>, bungina(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
Subject: | Re: pg_rewind WAL segments deletion pitfall |
Date: | 2023-08-23 11:44:52 |
Message-ID: | CAFh8B=k2DUwruBA3djsMf3b_wfqeN092iqaDcZYJ4PAgnne2KA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs pgsql-hackers |
Hi,
On Tue, 22 Aug 2023 at 07:32, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
>
> I don't like much this patch. While it takes correctly advantage of
> the backward record read logic from SimpleXLogPageRead() able to
> handle correctly timeline jumps, it creates a hidden dependency in the
> code between the hash table from filemap.c and the page callback.
> Wouldn't it be simpler to build a list of the segment names using the
> information from WALOpenSegment and build this list in
> findLastCheckpoint()?
I think the first version of the patch more or less did that. Not
necessarily a list, but a hash table of WAL file names that we want to
keep. But Kyotaro Horiguchi didn't like it and suggested creating entries
in the filemap.c hash table instead.
But, I agree, doing it directly from the findLastCheckpoint() makes the
code easier to understand.
> Also, I am wondering if we should be smarter
> with any potential conflict handling between the source and the
> target, rather than just enforcing a FILE_ACTION_NONE for all these
> files. In short, could it be better to copy the WAL file from the
> source if it exists there?
>
Before the switchpoint these files are supposed to be the same on the old
primary, new primary, and also in the archive. Also, if there is a
restore_command postgres will fetch the same file from the archive even if
it already exists in pg_wal, which effectively discards all pg_rewind
efforts on copying WAL files.
>
> + /*
> + * Some entries (WAL segments) already have an action assigned
> + * (see SimpleXLogPageRead()).
> + */
> + if (entry->action == FILE_ACTION_UNDECIDED)
> + entry->action = decide_file_action(entry);
>
> This change makes me a bit uneasy, per se my previous comment with the
> additional code dependencies.
>
We can revert to the original approach (see
v1-0001-pg_rewind-wal-deletion.patch from the very first email) if you like.
> I think that this scenario deserves a test case. If one wants to
> emulate a delay in WAL archiving, it is possible to set
> archive_command to a command that we know will fail, for instance.
>
Yes, I totally agree, it is on our radar, but meanwhile please see the new
version, just to check if I correctly understood your idea.
Regards,
--
Alexander Kukushkin
Attachment | Content-Type | Size |
---|---|---|
v4-0001-pg_rewind-wal-deletion.patch | text/x-patch | 4.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Euler Taveira | 2023-08-23 19:26:19 | Re: BUG #18067: Droping function that was used to generate column drops the column, even after `DROP EXPRESSION` |
Previous Message | kaijian xu | 2023-08-23 11:40:32 | Re: BUG #18065: An error occurred when attempting to add a column of type "vector" to a table named "vector". |
From | Date | Subject | |
---|---|---|---|
Next Message | David Rowley | 2023-08-23 12:02:51 | Re: PostgreSQL 16 release announcement draft |
Previous Message | Etsuro Fujita | 2023-08-23 11:35:00 | Re: list of acknowledgments for PG16 |