Re: Synchronizing slots from primary to standby

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2024-04-02 14:12:01
Message-ID: CALj2ACUADAcX3ptvBhbaS8bp=7ZZLvq27+v84w_RKCesc6X03w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Apr 2, 2024 at 7:25 PM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> > 1. Can we just remove pg_logical_replication_slot_advance and use
> > LogicalSlotAdvanceAndCheckSnapState instead? If worried about the
> > function naming, LogicalSlotAdvanceAndCheckSnapState can be renamed to
> > pg_logical_replication_slot_advance?
> >
> > + * Advance our logical replication slot forward. See
> > + * LogicalSlotAdvanceAndCheckSnapState for details.
> > */
> > static XLogRecPtr
> > pg_logical_replication_slot_advance(XLogRecPtr moveto)
> > {
>
> It was commented[1] that it's not appropriate for the
> pg_logical_replication_slot_advance to have an out parameter
> 'ready_for_decoding' which looks bit awkward as the functionality doesn't match
> the name, and is also not consistent with the style of
> pg_physical_replication_slot_advance(). So, we added a new function.

I disagree here. A new function just for a parameter is not that great
IMHO. I'd just rename LogicalSlotAdvanceAndCheckSnapState(XLogRecPtr
moveto, bool *found_consistent_snapshot) to
pg_logical_replication_slot_advance(XLogRecPtr moveto, bool
*found_consistent_snapshot) and use it. If others don't like this, I'd
at least turn pg_logical_replication_slot_advance(XLogRecPtr moveto) a
static inline function.

> > 5. As far as the test case for this issue is concerned, I'm fine with
> > adding one using an INJECTION point because we seem to be having no
> > consistent way to control postgres writing current snapshot to WAL.
>
> Since me and my colleagues can reproduce the issue consistently after applying
> 0002 and it could be rare for concurrent xl_running_xacts to happen, we discussed[2] to
> consider adding the INJECTION point after pushing the main fix.

Right.

> > 7.
> > + /*
> > + * We need to access the system tables during decoding to build the
> > + * logical changes unless we are in fast-forward mode where no changes
> > are
> > + * generated.
> > + */
> > + if (slot->data.database != MyDatabaseId && !fast_forward)
> >
> > May I know if we need this change for this fix?
>
> The slotsync worker needs to advance the slots from different databases in
> fast_forward. So, we need to skip this check in fast_forward mode. The analysis can
> be found in [3].
- if (slot->data.database != MyDatabaseId)
+ /*
+ * We need to access the system tables during decoding to build the
+ * logical changes unless we are in fast_forward mode where no changes are
+ * generated.
+ */
+ if (slot->data.database != MyDatabaseId && !fast_forward)
ereport(ERROR,

It's not clear from the comment that we need it for a slotsync worker
to advance the slots from different databases. Can this be put into
the comment? Also, specify in the comment, why this is safe?

Also, if this change is needed for only slotsync workers, why not
protect it with IsSyncingReplicationSlots()? Otherwise, it might
impact non-slotsync callers, no?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2024-04-02 14:32:54 Re: psql not responding to SIGINT upon db reconnection
Previous Message Greg Sabino Mullane 2024-04-02 14:03:57 Re: On disable_cost