Re: Synchronizing slots from primary to standby

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2023-07-10 03:36:22
Message-ID: CAA4eK1KAC1TPZStUXjMi0gd-OCRTY04JgY3cyKUXyq3Ndt32CQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Jul 9, 2023 at 1:01 PM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> On Fri, Jun 16, 2023 at 3:26 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> >
> > 1. Can you please try to explain the functionality of the overall
> > patch somewhere in the form of comments and or commit message?
>
> IIUC, there are 2 core ideas of the feature:
>
> 1) It will never let the logical replication subscribers go ahead of
> physical replication standbys specified in standby_slot_names. It
> implements this by delaying decoding of commit records on the
> walsenders corresponding to logical replication subscribers on the
> primary until all the specified standbys confirm receiving the commit
> LSN.
>
> 2) The physical replication standbys will synchronize data of the
> specified logical replication slots (in synchronize_slot_names) from
> the primary, creating the logical replication slots if necessary.
> Since the logical replication subscribers will never go out of
> physical replication standbys, the standbys can safely synchronize the
> slots and keep the data necessary for subscribers to connect to it and
> work seamlessly even after a failover.
>
> If my understanding is right,
>

This matches my understanding as well.

> I have few thoughts here:
>
> 1. All the logical walsenders are delayed on the primary - per
> wait_for_standby_confirmation() despite the user being interested in
> only a few of them via synchronize_slot_names. Shouldn't the delay be
> for just the slots specified in synchronize_slot_names?
> 2. I think we can split the patch like this - 0001 can be the logical
> walsenders delaying decoding on the primary unless standbys confirm,
> 0002 standby synchronizing the logical slots.
>

Agreed with the above two points.

> 3. I think we need to change the GUC standby_slot_names to better
> reflect what it is used for - wait_for_replication_slot_names or
> wait_for_
>

I feel at this stage we can focus on getting the design and
implementation correct. We can improve GUC names later once we are
confident that the functionality is correct.

> 4. It allows specifying logical slots in standby_slot_names, meaning,
> it can disallow logical slots getting ahead of other logical slots
> specified in standby_slot_names. Should we allow this case with the
> thinking that if there's anyone using logical replication for failover
> (well, will anybody do that in production?).
>

I think on the contrary we should prohibit this case. We can always
extend this functionality later.

> 5. Similar to above, it allows specifying physical slots in
> synchronize_slot_names. Should we disallow?
>

We should prohibit that as well.

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-07-10 04:10:45 Re: Add more sanity checks around callers of changeDependencyFor()
Previous Message Hayato Kuroda (Fujitsu) 2023-07-10 03:35:49 RE: [Patch] Use *other* indexes on the subscriber when REPLICA IDENTITY is FULL