Re: Synchronizing slots from primary to standby

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2023-11-18 10:45:57
Message-ID: CAA4eK1+kLSPrkUGL=hCgyyHNK4oNhnXKvN-dFLjcXMJGEayOUQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Nov 17, 2023 at 5:18 PM Drouvot, Bertrand
<bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>
> On 11/17/23 2:46 AM, Zhijie Hou (Fujitsu) wrote:
> > On Tuesday, November 14, 2023 10:27 PM Drouvot, Bertrand <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
> >
> > I feel the WaitForWALToBecomeAvailable may not be the best place to shutdown
> > slotsync worker and drop slots. There could be other reasons(other than
> > promotion) as mentioned in comments in case XLOG_FROM_STREAM to reach the code
> > there. I thought if the intention is to stop slotsync workers on promotion,
> > maybe FinishWalRecovery() is a better place to do it as it's indicating the end
> > of recovery and XLogShutdownWalRcv is also called in it.
>
> I can see that slotsync_drop_initiated_slots() has been moved in FinishWalRecovery()
> in v35. That looks ok.
> >

I was thinking what if we just ignore creating such slots (which
require init state) in the first place? I think that can be
time-consuming in some cases but it will reduce the complexity and we
can always improve such cases later if we really encounter them in the
real world. I am not very sure that added complexity is worth
addressing this particular case, so I would like to know your and
others' opinions.

More Review for v35-0002*
============================
1.
+ ereport(WARNING,
+ errmsg("skipping slots synchronization as primary_slot_name "
+ "is not set."));

There is no need to use a full stop at the end for WARNING messages
and as previously mentioned, let's not split message lines in such
cases. There are other messages in the patch with similar problems,
please fix those as well.

2.
+slotsync_checks()
{
...
...
+ /* The hot_standby_feedback must be ON for slot-sync to work */
+ if (!hot_standby_feedback)
+ {
+ ereport(WARNING,
+ errmsg("skipping slots synchronization as hot_standby_feedback "
+ "is off."));

This message has the same problem as mentioned in the previous
comment. Additionally, I think either atop slotsync_checks or along
with GUC check we should write comments as to why we expect these
values to be set for slot sync to work.

3.
+ /* The worker is running already */
+ if (SlotSyncWorker &&SlotSyncWorker->hdr.in_use
+ && SlotSyncWorker->hdr.proc)

The spacing for both the &&'s has problems. You need a space after the
first && and the second && should be in the prior line.

4.
+ LauncherRereadConfig(&recheck_slotsync);
+
}

An empty line after LauncherRereadConfig() is not required.

5.
+static void
+LauncherRereadConfig(bool *ss_recheck)
+{
+ char *conninfo = pstrdup(PrimaryConnInfo);
+ char *slotname = pstrdup(PrimarySlotName);
+ bool syncslot = enable_syncslot;
+ bool feedback = hot_standby_feedback;

Can we change the variable name 'feedback' to 'standbyfeedback' to
make it slightly more descriptive?

6. The logic to recheck the slot_sync related parameters in
LauncherMain() is not very clear. IIUC, if after reload config any
parameter is changed, we just seem to be checking the validity of the
changed parameter but not restarting the slot sync worker, is that
correct? If so, what if dbname is changed, don't we need to restart
the slot-sync worker and re-initialize the connection; similarly
slotname change also needs some thoughts. Also, if all the parameters
are valid we seem to be re-launching the slot-sync worker without
first stopping it which doesn't seem correct, am I missing something
in this logic?

7.
@@ -524,6 +525,25 @@ CreateDecodingContext(XLogRecPtr start_lsn,
errmsg("replication slot \"%s\" was not created in this database",
NameStr(slot->data.name))));

+ in_recovery = RecoveryInProgress();
+
+ /*
+ * Do not allow consumption of a "synchronized" slot until the standby
+ * gets promoted. Also do not allow consumption of slots with sync_state
+ * as SYNCSLOT_STATE_INITIATED as they are not synced completely to be
+ * used.
+ */
+ if ((in_recovery && (slot->data.sync_state != SYNCSLOT_STATE_NONE)) ||
+ slot->data.sync_state == SYNCSLOT_STATE_INITIATED)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot use replication slot \"%s\" for logical decoding",
+ NameStr(slot->data.name)),
+ in_recovery ?
+ errdetail("This slot is being synced from the primary server.") :
+ errdetail("This slot was not synced completely from the primary server."),
+ errhint("Specify another replication slot.")));
+

If we are planning to drop slots in state SYNCSLOT_STATE_INITIATED at
the time of promotion, don't we need to just have an assert or
elog(ERROR, .. for non-recovery cases as such cases won't be
reachable? If so, I think we can separate out that case here.

8.
wait_for_primary_slot_catchup()
{
...
+ /* Check if this standby is promoted while we are waiting */
+ if (!RecoveryInProgress())
+ {
+ /*
+ * The remote slot didn't pass the locally reserved position at
+ * the time of local promotion, so it's not safe to use.
+ */
+ ereport(
+ WARNING,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg(
+ "slot-sync wait for slot %s interrupted by promotion, "
+ "slot creation aborted", remote_slot->name)));
+ pfree(cmd.data);
+ return false;
+ }
...
}

Shouldn't this be an Assert because a slot-sync worker shouldn't exist
for non-standby servers?

9.
wait_for_primary_slot_catchup()
{
...
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ if (!tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ ereport(WARNING,
+ (errmsg("slot \"%s\" disappeared from the primary server,"
+ " slot creation aborted", remote_slot->name)));
+ pfree(cmd.data);
+ walrcv_clear_result(res);
+ return false;

If the slot on primary disappears, shouldn't this part of the code
somehow ensure to remove the slot on standby as well? If it is taken
at some other point in time then at least we should write a comment
here to state how it is taken care of. I think this comment also
applies to a few other checks following this check.

10.
+ /*
+ * It is possible to get null values for lsns and xmin if slot is
+ * invalidated on the primary server, so handle accordingly.
+ */
+ new_invalidated = DatumGetBool(slot_getattr(slot, 1, &isnull));

We can say LSN and Xmin in the above comment to make it easier to
read/understand.

11.
/*
+ * Once we got valid restart_lsn, then confirmed_lsn and catalog_xmin
+ * are expected to be valid/non-null, so assert if found null.
+ */

No need to explicitly say about assert, it is clear from the code. We
can slightly change this comment to: "Once we got valid restart_lsn,
then confirmed_lsn and catalog_xmin are expected to be
valid/non-null."

12.
+ if (remote_slot->restart_lsn < MyReplicationSlot->data.restart_lsn ||
+ TransactionIdPrecedes(remote_slot->catalog_xmin,
+ MyReplicationSlot->data.catalog_xmin))
+ {
+ if (!wait_for_primary_slot_catchup(wrconn, remote_slot))
+ {
+ /*
+ * The remote slot didn't catch up to locally reserved position.
+ * But still persist it and attempt the wait and sync in next
+ * sync-cycle.
+ */
+ if (MyReplicationSlot->data.persistency != RS_PERSISTENT)
+ {
+ ReplicationSlotPersist();
+ *slot_updated = true;
+ }

I think the reason to persist in this case is because next time local
restart_lsn can be ahead than the current location and it can take
more time to create such a slot. We can probably mention the same in
the comments.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2023-11-18 10:56:47 Re: long-standing data loss bug in initial sync of logical replication
Previous Message Tomas Vondra 2023-11-18 10:30:53 Re: long-standing data loss bug in initial sync of logical replication