Re: Synchronizing slots from primary to standby

From: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2023-11-17 11:48:49
Message-ID: d05a8944-a287-485c-85bd-5a694420d870@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 11/17/23 2:46 AM, Zhijie Hou (Fujitsu) wrote:
> On Tuesday, November 14, 2023 10:27 PM Drouvot, Bertrand <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>> On 11/13/23 2:57 PM, Zhijie Hou (Fujitsu) wrote:
>>> On Friday, November 10, 2023 4:16 PM Drouvot, Bertrand
>> <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>>>> Yeah good point, agree to just error out in all the case then (if we
>>>> discard the sync_ reserved wording proposal, which seems to be the
>>>> case as probably not worth the extra work).
>>>
>>> Thanks for the discussion!
>>>
>>> Here is the V33 patch set which includes the following changes:
>>
>> Thanks for working on it!
>>
>>>
>>> 1) Drop slots with state 'i' in promotion flow after we shut down WalReceiver.
>>
>> @@ -3557,10 +3558,15 @@ WaitForWALToBecomeAvailable(XLogRecPtr
>> RecPtr, bool randAccess,
>> * this only after failure, so when you promote, we still
>> * finish replaying as much as we can from archive and
>> * pg_wal before failover.
>> + *
>> + * Drop the slots for which sync is initiated but not yet
>> + * completed i.e. they are still waiting for the primary
>> + * server to catch up.
>> */
>> if (StandbyMode && CheckForStandbyTrigger())
>> {
>> XLogShutdownWalRcv();
>> + slotsync_drop_initiated_slots();
>> return XLREAD_FAIL;
>> }
>>
>> I had a closer look and it seems this is not located at the right place.
>>
>> Indeed, it's added here:
>>
>> switch (currentSource)
>> {
>> case XLOG_FROM_ARCHIVE:
>> case XLOG_FROM_PG_WAL:
>>
>> While in our case we are in
>>
>> case XLOG_FROM_STREAM:
>>
>> So I think we should move slotsync_drop_initiated_slots() in the
>> XLOG_FROM_STREAM case. Maybe before shutting down the sync slot worker?
>> (the TODO item number 2 you mentioned up-thread)
>
> Thanks for the comment.
>
> I feel the WaitForWALToBecomeAvailable may not be the best place to shutdown
> slotsync worker and drop slots. There could be other reasons(other than
> promotion) as mentioned in comments in case XLOG_FROM_STREAM to reach the code
> there. I thought if the intention is to stop slotsync workers on promotion,
> maybe FinishWalRecovery() is a better place to do it as it's indicating the end
> of recovery and XLogShutdownWalRcv is also called in it.

I can see that slotsync_drop_initiated_slots() has been moved in FinishWalRecovery()
in v35. That looks ok.
>
> And I feel we'd better drop the slots after shutting down the slotsync workers,
> because otherwise the slotsync workers could create the dropped slot again in
> rare cases.

Yeah, agree and I can see that it's done that way in v35.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amul Sul 2023-11-17 12:25:15 Re: ALTER COLUMN ... SET EXPRESSION to alter stored generated column's expression
Previous Message Amit Kapila 2023-11-17 11:38:36 Re: Synchronizing slots from primary to standby