Re: Separate GUC for replication origins

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Euler Taveira <euler(at)eulerto(dot)com>
Cc: Peter Eisentraut <peter(at)eisentraut(dot)org>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Separate GUC for replication origins
Date: 2025-01-24 23:12:49
Message-ID: CAD21AoD-TqY1N8Vibnzr81ZVuQOjVDY0jF0zQi_e1_8oNLzieQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jan 8, 2025 at 7:39 AM Euler Taveira <euler(at)eulerto(dot)com> wrote:
>
> On Thu, Dec 19, 2024, at 10:31 AM, Peter Eisentraut wrote:
>
> On 10.12.24 19:41, Euler Taveira wrote:
> > I'm attaching a patch that adds max_replication_origins. It basically
> > replaces
> > all of the points that refers to max_replication_slots on the subscriber. It
> > uses the same default value as max_replication_slots (10). I did nothing to
> > keep the backward compatibility. I mean has a special value (like -1) that
> > means same value as max_replication_slots. Is it worth it? (If not, it
> > should
> > be mentioned in the commit message.)
>
> I think some backward compatibility tweak like that would be useful.
> Otherwise, the net effect of this is that most people will have to do
> one more thing than before to keep their systems working. Also, all
> deployment and automation scripts would have to be updated for this.
>
> This new patch includes the special value (-1) that uses max_replication_slots
> instead.

Thank you for working on this. The proposed idea makes sense to me.

Here are some comments on v2 patch:

---
/* Report this after the initial starting message for consistency. */
- if (max_replication_slots == 0)
+ if (max_replication_origins == 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIGURATION_LIMIT_EXCEEDED),
errmsg("cannot start logical
replication workers when \"max_replication_slots\"=0")));

Need to update the error message too.

---
+ {"max_replication_origins",
+ PGC_POSTMASTER,
+ REPLICATION_SUBSCRIBERS,
+ gettext_noop("Sets the maximum number of
simultaneously defined replication origins."),
+ NULL
+ },

I think the description is not accurate; this GUC controls the maximum
number of simultaneous replication origins that can be setup.

---
Given that max_replication_origins doesn't control the maximum number
of replication origins that can be defined, probably we need to find a
better name. As Kuroda-san already mentioned some proposed names,
max_tracked_replication_origins or max_replication_origin_states seem
reasonable to me.

---
+#include "utils/guc_hooks.h"

I think #include'ing guc.h would be more appropriate.

------
ereport(PANIC,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg("replication slot checkpoint has wrong
checksum %u, expected %u",
crc, file_crc)));

I don't understand why we use the term "replication slot checkpoint"
in an error message for the replication origin code. It could be fixed
in a separate patch for back-patching.

---
It's not related to the proposed patch but I found a suspicious
behavior; when the server startup we try to restore the
replorigin_checkpoint file only when max_replication_slots (or
max_replication_origins with the patch) > 0. Therefore, we don't get
the error message "could not find free replication state, increase
\"max_replication_slots\"" when it's set to 0, even if we have some
replication states in the replorigin_checkpoint file. Which seems
quite confusing.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2025-01-24 23:20:57 Re: Convert sepgsql tests to TAP
Previous Message Robins Tharakan 2025-01-24 23:11:14 Re: Convert sepgsql tests to TAP