From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com> |
Cc: | shveta malik <shveta(dot)malik(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Subject: | Re: Synchronizing slots from primary to standby |
Date: | 2023-11-07 10:55:11 |
Message-ID: | CAA4eK1JawxrJ_pWSPPMJb-TG8h0y9Tt7kx9Ei4yBYbRR37a5=w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Nov 7, 2023 at 3:51 PM Drouvot, Bertrand
<bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>
> On 10/31/23 10:37 AM, shveta malik wrote:
> > On Fri, Oct 27, 2023 at 8:43 PM Drouvot, Bertrand
> >>
> >> Agree with your test case, but in my case I was not using pub/sub.
> >>
> >> I was not clear, so when I said:
> >>
> >>>> - create logical_slot1 on the primary (and don't start using it)
> >>
> >> I meant don't start decoding from it (like using pg_recvlogical() or
> >> pg_logical_slot_get_changes()).
> >>
> >> By using pub/sub the "don't start using it" is not satisfied.
> >>
> >> My test case is:
> >>
> >> "
> >> SELECT * FROM pg_create_logical_replication_slot('logical_slot1', 'test_decoding', false, true, true);
> >> SELECT * FROM pg_create_logical_replication_slot('logical_slot2', 'test_decoding', false, true, true);
> >> pg_recvlogical -d postgres -S logical_slot2 --no-loop --start -f -
> >> "
> >>
> >
> > Okay, I am able to reproduce it now. Thanks for clarification. I have
> > tried to change the algorithm as per suggestion by Amit in [1]
> >
> > [1]: https://www.postgresql.org/message-id/CAA4eK1KBL0110gamQfc62X%3D5JV8-Qjd0dw0Mq0o07cq6kE%2Bq%3Dg%40mail.gmail.com
>
> Thanks!
>
> >
> > This is not full proof solution but optimization over first one. Now
> > in any sync-cycle, we take 2 attempts for slots-creation (if any slots
> > are available to be created). In first attempt, we do not wait
> > indefinitely on inactive slots, we wait only for a fixed amount of
> > time and if remote-slot is still behind, then we add that to the
> > pending list and move to the next slot. Once we are done with first
> > attempt, in second attempt, we go for the pending ones and now we wait
> > on each of them until the primary catches up.
>
> Aren't we "just" postponing the "issue"? I mean if there is really no activity
> on, say, the first created slot, then once we move to the second attempt then any newly
> created slot from that time would wait to be synced forever, no?
>
We have to wait at some point in time for such inactive slots and the
same is true even for manually created slots on standby. Do you have
any better ideas to deal with it?
> Looking at V30:
>
> + /* Update lsns of slot to remote slot's current position */
> + local_slot_update(remote_slot);
> + ReplicationSlotPersist();
> +
> + ereport(LOG, errmsg("created slot \"%s\" locally", remote_slot->name));
>
> I think this message is confusing as the slot has been created before it, here:
>
> + else
> + {
> + TransactionId xmin_horizon = InvalidTransactionId;
> + ReplicationSlot *slot;
> +
> + ReplicationSlotCreate(remote_slot->name, true, RS_EPHEMERAL,
> + remote_slot->two_phase, false);
>
> So that it shows up in pg_replication_slots before this message is emitted (and that
> specially true/worst for non active slots).
>
> Maybe something like "newly locally created slot XXX has been synced..."?
>
> While at it, would that make sense to move
>
> + slot->data.failover = true;
>
> once we stop waiting for this slot? I think that would avoid confusion if one
> query pg_replication_slots while we are still waiting for this slot to be synced,
> thoughts? (currently we can see pg_replication_slots.synced_slot set to true
> while we are still waiting).
>
The failover property of the slot is different from whether the slot
has been synced yet, so we can't change the location of marking it but
we can try to improve when to show that slot has been synced.
--
With Regards,
Amit Kapila.
From | Date | Subject | |
---|---|---|---|
Next Message | Alexander Lakhin | 2023-11-07 11:00:00 | Re: collect_corrupt_items_vacuum.patch |
Previous Message | Amit Kapila | 2023-11-07 10:46:21 | Re: A recent message added to pg_upgade |