From: | Shubham Khanna <khannashubham1197(at)gmail(dot)com> |
---|---|
To: | Peter Smith <smithpb2250(at)gmail(dot)com> |
Cc: | vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Log a warning in pg_createsubscriber for max_slot_wal_keep_size |
Date: | 2025-01-02 09:21:43 |
Message-ID: | CAHv8RjKTxgUU37Ou42FRio-28+27xAiY_p_9yA=Yvmi4VFD2dw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Jan 2, 2025 at 4:20 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Hi Shubham.
>
> Here are some review comments for the patch v4-0001.
>
> ======
> Commit message.
>
> 1.
> The 'pg_createsubscriber' utility is updated to fetch and validate the
> 'max_slot_wal_keep_size' setting from the publisher. A warning is raised during
> the '--dry-run' mode if the configuration is set to a non-default value.
>
>
> This says a "warning is raised", but patch v4 changed the warning to
> an error, so this description is incompatible with the code.
>
Since, this patch now raises a warning instead of an error so I think
this should be the same as before.
> ======
> doc/src/sgml/ref/pg_createsubscriber.sgml
>
> 2.
> + <para>
> + The 'max_slot_wal_keep_size' must be set to -1 to prevent the automatic
> + removal of WAL files needed by replication slots. Setting this parameter to
> + a specific size may lead to replication failures if required WAL files are
> + prematurely deleted.
> + </para>
>
> The 'max_slot_wal_keep_size' should not be single-quoted like that. It
> should use the same markup as the other nearby GUC names and also have
> a link like those others do.
>
Added the link for 'max_slot_wal_keep_size' and updated the
'pg_createsubscriber' documentation.
> ======
> src/bin/pg_basebackup/pg_createsubscriber.c
>
> 3.
> +/*
> + * Validate max_slot_wal_keep_size
> + * Logical replication requires max_slot_wal_keep_size to be set to -1 on the
> + * publisher to prevent the deletion of WAL files that are still needed by
> + * replication slots. If this parameter is set to a non-default value, it may
> + * cause replication failures due to required WAL files being prematurely
> + * removed.
> + */
>
> 3a.
> Not fixed? For patch v3 I had already posted [1-#4a] that this entire
> comment is wrongly indented.
>
Fixed.
>
> 3b.
> That first sentence looks like it is missing a period and a following
> blank line. OTOH, maybe the first sentence is not even necessary.
>
Fixed.
>
> 4.
> + if (dry_run && max_slot_wal_keep_size != -1)
> + {
> + pg_log_error("publisher requires max_slot_wal_keep_size to be -1,
> but only %d remain",
> + max_slot_wal_keep_size);
> + pg_log_error_hint("Change the configuration parameter \"%s\" on the
> publisher to %d.",
> + "max_slot_wal_keep_size", -1);
> + failed = true;
> + }
> +
>
> 4a.
> Not fixed? For patch v3 I had already posted [1-#4b] that it seemed
> strange you did not use the \"%s\" substitution for the GUC name of
> the first message (unlike the 2nd message).
>
> I also think it is strange that the 1st message uses a hardwired -1,
> but the 2nd message uses a parameter for the -1.
>
Updated the warning message and warning detail.
>
> 4b.
> I don't think "but only %d remain" is the appropriate wording for this
> GUC. It looks like a cut/paste mistake from a previous message.
> Anyway, maybe this message should be something quite different so it
> is not just duplicating the same information as the error_hint. e.g.
> see below for one idea. This could also resolve the previous review
> comment.
>
> SUGGESTION
> "publisher requires WAL size must not be restricted"
>
I have used your suggestion in the latest patch.
>
> 4c.
> I saw that this only emits the message in dry-mode, which is
> apparently based on the suggestion from Vignesh [2-#1]. Meanwhile, all
> the comments/docs say "it may cause replication failures", so I felt
> it might be better to always stop everything up-front if there is a
> wrong configuration, instead of waiting for potential failures to
> happen -- that's why I had suggested using error instead of a warning,
> but maybe I am in the minority.
>
> IIUC there are two ways to address this configuration problem:
>
> A. Give warnings, but only in dry-run mode. If the warnings are
> ignored (or if the dry-run was not even done), there may be
> replication failures later, but we HOPE it will not happen.
>
> B. Give errors in all modes. The early config error prevents any
> chance of later replication errors.
>
> So, I think the implementation needs to choose either A or B.
> Currently, it seems a mixture.
>
> ======
> [1] my v3 review -
> https://www.postgresql.org/message-id/CAHut%2BPsgEphCa-P-3Q7cORA%3D1TRv15UUsaxN9ZkX56M1-J_QRg%40mail.gmail.com
> [2] https://www.postgresql.org/message-id/CALDaNm09cRzke52UN5zx33PT390whU92oXY4gfOSZEo17CLPjw%40mail.gmail.com
If 'max_slot_wal_keep_size' is not set to -1, there is no surety that
'pg_createsubscriber' will function as expected and the removal of WAL
files will start to occur. Therefore, a warning is raised instead of
an error to alert users about the potential configuration issue.
If 'max_slot_wal_keep_size' is -1 (the default), replication slots may
retain an unlimited amount of WAL files.
The attached Patch contains the suggested changes.
Thanks and regards,
Shubham Khanna.
Attachment | Content-Type | Size |
---|---|---|
v5-0001-Validate-max_slot_wal_keep_size-in-pg_createsubsc.patch | application/octet-stream | 6.4 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | vignesh C | 2025-01-02 09:27:17 | Re: Conflict detection for update_deleted in logical replication |
Previous Message | Zhou, Zhiguo | 2025-01-02 09:14:44 | RFC: Lock-free XLog Reservation from WAL |