From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
Subject: | Re: promotion related handling in pg_sync_replication_slots() |
Date: | 2024-04-06 06:19:19 |
Message-ID: | CAA4eK1LiZdF9UAL0KTOCKt3rw1sU2cRrV+nwfjipbS2WZQwOCw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Apr 5, 2024 at 10:31 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> Please find v2. Changes are:
> 1) Rebased the patch as there was conflict due to recent commit 6f132ed.
> 2) Added an Assert in update_synced_slots_inactive_since() to ensure
> that the slot does not have active_pid.
> 3) Improved commit msg and comments.
>
Few comments:
==============
1.
void
SyncReplicationSlots(WalReceiverConn *wrconn)
{
+ /*
+ * Startup process signaled the slot sync to stop, so if meanwhile user
+ * has invoked slot sync SQL function, simply return.
+ */
+ SpinLockAcquire(&SlotSyncCtx->mutex);
+ if (SlotSyncCtx->stopSignaled)
+ {
+ ereport(LOG,
+ errmsg("skipping slot synchronization as slot sync shutdown is
signaled during promotion"));
+
+ SpinLockRelease(&SlotSyncCtx->mutex);
+ return;
+ }
+ SpinLockRelease(&SlotSyncCtx->mutex);
There is a race condition with this code. Say during promotion
ShutDownSlotSync() is just before setting this flag and the user has
invoked pg_sync_replication_slots() and passed this check but still
didn't set the SlotSyncCtx->syncing flag. So, now, the promotion would
recognize that there is slot sync going on in parallel, and slot sync
wouldn't know that the promotion is in progress.
The current coding for slot sync worker ensures there is no such race
by checking the stopSignaled and setting the pid together under
spinlock. I think we need to move the setting of the syncing flag
similarly. Once we do that probably checking SlotSyncCtx->syncing
should be sufficient in ShutDownSlotSync(). If we change the location
of setting the 'syncing' flag then please ensure its cleanup as we
currently do in slotsync_failure_callback().
2.
@@ -1395,6 +1395,7 @@ update_synced_slots_inactive_since(void)
if (s->in_use && s->data.synced)
{
Assert(SlotIsLogical(s));
+ Assert(s->active_pid == 0);
We can add a comment like: "The slot must not be acquired by any process"
--
With Regards,
Amit Kapila.
From | Date | Subject | |
---|---|---|---|
Next Message | Bharath Rupireddy | 2024-04-06 06:25:38 | Re: Introduce XID age and inactive timeout based replication slot invalidation |
Previous Message | Amit Langote | 2024-04-06 06:03:13 | Re: remaining sql/json patches |