Re: Synchronizing slots from primary to standby

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2023-07-09 07:30:00
Message-ID: CALj2ACW5z=QYNabSEW-iqJ0oZUyTNztPBXzCoAiDtWFWYMcuag@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jun 16, 2023 at 3:26 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Apr 17, 2023 at 7:37 PM Drouvot, Bertrand
> <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
> >
> > Please find attached V5 (a rebase of V4 posted up-thread).
> >
> > In addition to the "rebasing" work, the TAP test adds a test about conflict handling (logical slot invalidation)
> > relying on the work done in the "Minimal logical decoding on standby" patch series.
> >
> > I did not look more at the patch (than what's was needed for the rebase) but plan to do so.
> >
>
> Are you still planning to continue working on this? Some miscellaneous
> comments while going through this patch are as follows?
>
> 1. Can you please try to explain the functionality of the overall
> patch somewhere in the form of comments and or commit message?

IIUC, there are 2 core ideas of the feature:

1) It will never let the logical replication subscribers go ahead of
physical replication standbys specified in standby_slot_names. It
implements this by delaying decoding of commit records on the
walsenders corresponding to logical replication subscribers on the
primary until all the specified standbys confirm receiving the commit
LSN.

2) The physical replication standbys will synchronize data of the
specified logical replication slots (in synchronize_slot_names) from
the primary, creating the logical replication slots if necessary.
Since the logical replication subscribers will never go out of
physical replication standbys, the standbys can safely synchronize the
slots and keep the data necessary for subscribers to connect to it and
work seamlessly even after a failover.

If my understanding is right, I have few thoughts here:

1. All the logical walsenders are delayed on the primary - per
wait_for_standby_confirmation() despite the user being interested in
only a few of them via synchronize_slot_names. Shouldn't the delay be
for just the slots specified in synchronize_slot_names?
2. I think we can split the patch like this - 0001 can be the logical
walsenders delaying decoding on the primary unless standbys confirm,
0002 standby synchronizing the logical slots.
3. I think we need to change the GUC standby_slot_names to better
reflect what it is used for - wait_for_replication_slot_names or
wait_for_
4. It allows specifying logical slots in standby_slot_names, meaning,
it can disallow logical slots getting ahead of other logical slots
specified in standby_slot_names. Should we allow this case with the
thinking that if there's anyone using logical replication for failover
(well, will anybody do that in production?).
5. Similar to above, it allows specifying physical slots in
synchronize_slot_names. Should we disallow?

I'm attaching the v6 patch, a rebased version of v5.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v6-0001-Synchronize-logical-replication-slots-from-primar.patch application/octet-stream 62.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-07-09 07:36:11 Re: Autogenerate some wait events code and documentation
Previous Message Peter Eisentraut 2023-07-09 07:23:24 Re: Fix last unitialized memory warning