From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
Cc: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
Subject: | Re: promotion related handling in pg_sync_replication_slots() |
Date: | 2024-04-12 10:47:53 |
Message-ID: | CAA4eK1LbZ4bgDFtG5HM=ye6mrHR4WTiAv1V7dY4PPod4F++5dA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Apr 12, 2024 at 7:57 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Sat, Apr 6, 2024 at 12:25 PM Bharath Rupireddy
> <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> >
> > On Fri, Apr 5, 2024 at 10:31 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> > >
> > > Please find v2. Changes are:
> >
> > Thanks for the patch. Here are some comments.
>
> Thanks for reviewing.
> >
> > 1. Can we have a clear saying in the shmem variable who's syncing at
> > the moment? Is it a slot sync worker or a backend via SQL function?
> > Perhaps turn "bool syncing;" to "SlotSyncSource sync_source;"
> >
> > typedef enum SlotSyncSource
> > {
> > SLOT_SYNC_NONE,
> > SLOT_SYNC_WORKER,
> > SLOT_SYNC_BACKEND,
> > } SlotSyncSource;
> >
> > Then, the check in ShutDownSlotSync can be:
> >
> > + /*
> > + * Return if neither the slot sync worker is running nor the function
> > + * pg_sync_replication_slots() is executing.
> > + */
> > + if ((SlotSyncCtx->pid == InvalidPid) &&
> > SlotSyncCtx->sync_source != SLOT_SYNC_BACKEND)
> > {
> >
I don't know if this will be help, especially after fixing the race
condition I mentioned. But otherwise, also, at this stage it doesn't
seem helpful to add the source of sync explicitly.
> > 2.
> > SyncReplicationSlots(WalReceiverConn *wrconn)
> > {
> > + /*
> > + * Startup process signaled the slot sync to stop, so if meanwhile user
> > + * has invoked slot sync SQL function, simply return.
> > + */
> > + SpinLockAcquire(&SlotSyncCtx->mutex);
> > + if (SlotSyncCtx->stopSignaled)
> > + {
> > + ereport(LOG,
> > + errmsg("skipping slot synchronization as slot sync
> > shutdown is signaled during promotion"));
> > +
> >
> > Unless I'm missing something, I think this can't detect if the backend
> > via SQL function is already half-way through syncing in
> > synchronize_one_slot. So, better move this check to (or also have it
> > there) slot sync loop that calls synchronize_one_slot. To avoid
> > spinlock acquisitions, we can perhaps do this check in when we acquire
> > the spinlock for synced flag.
>
> If the sync via SQL function is already half-way, then promotion
> should wait for it to finish. I don't think it is a good idea to move
> the check to synchronize_one_slot(). The sync-call should either not
> start (if it noticed the promotion) or finish the sync and then let
> promotion proceed. But I would like to know others' opinion on this.
>
Agreed.
--
With Regards,
Amit Kapila.
From | Date | Subject | |
---|---|---|---|
Next Message | Etsuro Fujita | 2024-04-12 11:21:24 | Re: Another WaitEventSet resource leakage in back branches |
Previous Message | Amit Kapila | 2024-04-12 10:37:06 | Re: promotion related handling in pg_sync_replication_slots() |