RE: Synchronizing slots from primary to standby

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Ajin Cherian <itsajin(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: RE: Synchronizing slots from primary to standby
Date: 2023-11-29 09:17:04
Message-ID: OS0PR01MB57167608D8CB5152F25EA1DD9483A@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thursday, November 23, 2023 6:06 PM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
>
> On Tue, Nov 21, 2023 at 8:32 PM shveta malik <shveta(dot)malik(at)gmail(dot)com>
> wrote:
> >
> > v37 fails to apply to HEAD due to a recent commit e83aa9f92fdd,
> > rebased the patches. PFA v37_2 patches.
>
> Thanks for the patch. Some comments:

Thanks for the comments.

> subscriptioncmds.c:
> CreateSubscription()
> and tablesync.c:
> process_syncing_tables_for_apply()
> walrcv_create_slot(wrconn, opts.slot_name, false,
> twophase_enabled,
> - CRS_NOEXPORT_SNAPSHOT, NULL);
> -
> - if (twophase_enabled)
> - UpdateTwoPhaseState(subid,
> LOGICALREP_TWOPHASE_STATE_ENABLED);
> -
> + failover_enabled,
> CRS_NOEXPORT_SNAPSHOT, NULL);
>
> either here or in libpqrcv_create_slot(), shouldn't you check the remote server
> version if it supports the failover flag?

I think we expect the create slot to fail if the server doesn't support failover.
The same is true for two_phase option.

>
>
> +
> + /*
> + * If only the slot_name is specified, it is possible
> that the user intends to
> + * use an existing slot on the publisher, so here we
> enable failover for the
> + * slot if requested.
> + */
> + else if (opts.slot_name && failover_enabled)
> + {
> + walrcv_alter_slot(wrconn, opts.slot_name, opts.failover);
> + ereport(NOTICE,
> + (errmsg("enabled failover for replication
> slot \"%s\" on publisher",
> + opts.slot_name)));
> + }
>
> Here, the code only alters the slot if failover = true. You could use "else if
> (opts.slot_name && IsSet(opts.specified_opts, SUBOPT_FAILOVER)" to check if
> the failover flag is specified and alter for failover=false as well.

Adjusted.

> Also, shouldn't
> you check for the server version if the command ALTER_REPLICATION_SLOT is
> supported?

Similar to create_slot, we expect the command fail if the target server doesn't
support failover the user specified failover = true.

>
> slot.c:
> ReplicationSlotAlter()
>
> +void
> +ReplicationSlotAlter(const char *name, bool failover) {
> + Assert(MyReplicationSlot == NULL);
> +
> + ReplicationSlotAcquire(name, true);
> +
> + if (SlotIsPhysical(MyReplicationSlot))
> + ereport(ERROR,
> + errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> + errmsg("cannot use %s with a physical replication slot",
> + "ALTER_REPLICATION_SLOT"));
>
> shouldn't you release the slot by calling ReplicationSlotRelease before
> erroring out?

No, I think the release of the replication slot will be handled by either
WalSndErrorCleanup, ReplicationSlotShmemExit, or the ReplicationSlotRelease
call in PostgresMain().

>
> slot.c:
> +/*
> + * A helper function to validate slots specified in standby_slot_names GUCs.
> + */
> +static bool
> +validate_standby_slots(char **newval)
> +{
> + char *rawname;
> + List *elemlist;
> + ListCell *lc;
> +
> + /* Need a modifiable copy of string */
> + rawname = pstrdup(*newval);
>
> rawname is not always freed.

The code has been changed due to other comments.

>
> launcher.c:
>
> + SlotSyncWorker->hdr.proc = MyProc;
> +
> + before_shmem_exit(slotsync_worker_detach, (Datum) 0);
> +
> + LWLockRelease(SlotSyncWorkerLock);
> +}
>
> before_shmem_exit() can error out leaving the lock acquired. Maybe you
> should release the lock prior to calling before_shmem_exit() because you don't
> need the lock there.

This has been fixed.

Best Regards,
Hou zj

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hayato Kuroda (Fujitsu) 2023-11-29 09:26:26 RE: [PoC] pg_upgrade: allow to upgrade publisher node
Previous Message Zhijie Hou (Fujitsu) 2023-11-29 09:12:54 RE: Synchronizing slots from primary to standby