Re: promotion related handling in pg_sync_replication_slots()

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Subject: Re: promotion related handling in pg_sync_replication_slots()
Date: 2024-04-12 10:47:53
Message-ID: CAA4eK1LbZ4bgDFtG5HM=ye6mrHR4WTiAv1V7dY4PPod4F++5dA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Apr 12, 2024 at 7:57 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Sat, Apr 6, 2024 at 12:25 PM Bharath Rupireddy
> <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> >
> > On Fri, Apr 5, 2024 at 10:31 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> > >
> > > Please find v2. Changes are:
> >
> > Thanks for the patch. Here are some comments.
>
> Thanks for reviewing.
> >
> > 1. Can we have a clear saying in the shmem variable who's syncing at
> > the moment? Is it a slot sync worker or a backend via SQL function?
> > Perhaps turn "bool syncing;" to "SlotSyncSource sync_source;"
> >
> > typedef enum SlotSyncSource
> > {
> > SLOT_SYNC_NONE,
> > SLOT_SYNC_WORKER,
> > SLOT_SYNC_BACKEND,
> > } SlotSyncSource;
> >
> > Then, the check in ShutDownSlotSync can be:
> >
> > + /*
> > + * Return if neither the slot sync worker is running nor the function
> > + * pg_sync_replication_slots() is executing.
> > + */
> > + if ((SlotSyncCtx->pid == InvalidPid) &&
> > SlotSyncCtx->sync_source != SLOT_SYNC_BACKEND)
> > {
> >

I don't know if this will be help, especially after fixing the race
condition I mentioned. But otherwise, also, at this stage it doesn't
seem helpful to add the source of sync explicitly.

> > 2.
> > SyncReplicationSlots(WalReceiverConn *wrconn)
> > {
> > + /*
> > + * Startup process signaled the slot sync to stop, so if meanwhile user
> > + * has invoked slot sync SQL function, simply return.
> > + */
> > + SpinLockAcquire(&SlotSyncCtx->mutex);
> > + if (SlotSyncCtx->stopSignaled)
> > + {
> > + ereport(LOG,
> > + errmsg("skipping slot synchronization as slot sync
> > shutdown is signaled during promotion"));
> > +
> >
> > Unless I'm missing something, I think this can't detect if the backend
> > via SQL function is already half-way through syncing in
> > synchronize_one_slot. So, better move this check to (or also have it
> > there) slot sync loop that calls synchronize_one_slot. To avoid
> > spinlock acquisitions, we can perhaps do this check in when we acquire
> > the spinlock for synced flag.
>
> If the sync via SQL function is already half-way, then promotion
> should wait for it to finish. I don't think it is a good idea to move
> the check to synchronize_one_slot(). The sync-call should either not
> start (if it noticed the promotion) or finish the sync and then let
> promotion proceed. But I would like to know others' opinion on this.
>

Agreed.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Etsuro Fujita 2024-04-12 11:21:24 Re: Another WaitEventSet resource leakage in back branches
Previous Message Amit Kapila 2024-04-12 10:37:06 Re: promotion related handling in pg_sync_replication_slots()