Re: Synchronizing slots from primary to standby

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2023-11-08 03:50:49
Message-ID: CAA4eK1LHb9Bb1O0GFft4H8ddtqZeLPCG2hKuSXYvj=du9MjsLw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Nov 7, 2023 at 7:58 PM Drouvot, Bertrand
<bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>
> On 11/7/23 11:55 AM, Amit Kapila wrote:
> >>>
> >>> This is not a foolproof solution but an optimization over the first
> >>> one. Now, in any sync cycle, we take two attempts at slot creation
> >>> (if any slots are available to be created). In the first attempt, we
> >>> do not wait indefinitely on inactive slots; we wait only for a fixed
> >>> amount of time, and if the remote slot is still behind, we add it to
> >>> the pending list and move on to the next slot. Once the first attempt
> >>> is done, in the second attempt we go for the pending ones and wait on
> >>> each of them until the primary catches up.
> >>
> >> Aren't we "just" postponing the "issue"? I mean if there is really no activity
> >> on, say, the first created slot, then once we move to the second attempt then any newly
> >> created slot from that time would wait to be synced forever, no?
> >>
> >
> > We have to wait at some point in time for such inactive slots and the
> > same is true even for manually created slots on standby. Do you have
> > any better ideas to deal with it?
> >
>
> What about:
>
> - get rid of the second attempt and the pending_slot_list
> - keep the wait_count and PrimaryCatchupWaitAttempt logic
>
> so basically, get rid of:
>
> /*
>  * Now sync the pending slots which failed to be created in the first
>  * attempt.
>  */
> foreach(cell, pending_slot_list)
> {
>     RemoteSlot *remote_slot = (RemoteSlot *) lfirst(cell);
>
>     /* Wait until the primary server catches up */
>     PrimaryCatchupWaitAttempt = 0;
>
>     synchronize_one_slot(wrconn, remote_slot, NULL);
> }
>
> and the pending_slot_list list.
>
> That way, for each slot that has not been created and synced yet:
>
> - it will be created on the standby
> - we will wait up to PrimaryCatchupWaitAttempt attempts
> - the slot will be synced or removed on/from the standby
>
> That way an inactive slot on the primary would not "block"
> any other slots on the standby.
>
> By "created" here I mean calling ReplicationSlotCreate() (not to be confused
> with emitting "ereport(LOG, errmsg("created slot \"%s\" locally", remote_slot->name)); "
> which is confusing as mentioned up-thread).
>
> The problem I can see with this proposal is that the "sync" window waiting
> for slot activity on the primary is "only" during the PrimaryCatchupWaitAttempt
> attempts (as the slot will be dropped/recreated).
>
> If we think this window is too short we could:
>
> - increase it
> or
> - don't drop the slot once created (even if there is no activity
> on the primary during PrimaryCatchupWaitAttempt attempts) so that
> the next loop of attempts will compare with "older" LSN/xmin (as compared to
> dropping and re-creating the slot). That way the window would extend back to
> the initial slot creation.
>
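The proposal above (create each slot once, wait only a bounded number of attempts for the primary to catch up, and sync or drop it without any pending list) could be sketched roughly as follows. This is a hypothetical, simplified model for illustration only, not the actual patch code: the structures, the catch-up simulation, and the constant are invented, and the real code would call ReplicationSlotCreate(), sleep between attempts, and compare actual LSN/xmin values.

```c
#include <assert.h>

/*
 * Simplified model of the proposed single-pass sync loop: the slot is
 * created locally, we wait at most PRIMARY_CATCHUP_WAIT_ATTEMPTS rounds
 * for the primary to catch up, and the slot is either synced or dropped.
 */
#define PRIMARY_CATCHUP_WAIT_ATTEMPTS 3

typedef struct RemoteSlot
{
	const char *name;
	int			rounds_until_caught_up; /* stand-in for primary activity */
} RemoteSlot;

typedef enum SlotSyncResult
{
	SLOT_SYNCED,
	SLOT_DROPPED				/* primary never caught up within the window */
} SlotSyncResult;

SlotSyncResult
synchronize_one_slot(RemoteSlot *remote_slot)
{
	/* ReplicationSlotCreate() would happen here in the real code. */
	for (int attempt = 0; attempt < PRIMARY_CATCHUP_WAIT_ATTEMPTS; attempt++)
	{
		if (remote_slot->rounds_until_caught_up <= attempt)
			return SLOT_SYNCED; /* remote slot reached our position */
		/* the real code would sleep here before rechecking */
	}
	/* Give up: drop the local slot instead of blocking the other slots. */
	return SLOT_DROPPED;
}
```

With this shape, an inactive slot costs at most PRIMARY_CATCHUP_WAIT_ATTEMPTS rounds per cycle and never delays the slots that follow it.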

Yeah, this sounds reasonable, but then we can't mark such slots as
synced/available for use after failover. I think if we want to follow
this approach, we also need to monitor these slots for any change in
consecutive cycles, and if we are able to sync them, then accordingly
enable them for use after failover.

Another somewhat related point is that right now we just wait for a
change on the first slot (the patch refers to it as the monitoring
slot) when computing the nap_time before which we recheck all the
slots. I think we can improve that as well, such that if any slot's
information has changed, we don't consider increasing the naptime.
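A naptime policy along those lines might look like the following. This is purely illustrative: the constants, signature, and use of plain integers for slot positions are all invented here, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical naptime policy: instead of watching only the first
 * ("monitoring") slot, keep the short naptime as long as ANY slot's
 * position advanced since the last cycle; back off only when nothing
 * changed at all.
 */
#define NAPTIME_MIN_MS	200
#define NAPTIME_MAX_MS	10000

int
compute_naptime(const long *prev_lsn, const long *cur_lsn, size_t nslots)
{
	for (size_t i = 0; i < nslots; i++)
	{
		if (cur_lsn[i] != prev_lsn[i])
			return NAPTIME_MIN_MS;	/* some slot moved: stay responsive */
	}
	return NAPTIME_MAX_MS;		/* everything idle: nap longer */
}
```

The design point is simply that one busy slot among many idle ones keeps the sync worker responsive, whereas keying off a single monitoring slot can leave other active slots lagging behind.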

--
With Regards,
Amit Kapila.
