RE: Synchronizing slots from primary to standby

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: RE: Synchronizing slots from primary to standby
Date: 2023-10-31 11:20:56
Message-ID: OS0PR01MB57167A14540096B5D173E73E94A0A@OS0PR01MB5716.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Tuesday, October 31, 2023 6:45 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > b) Currently I have restricted 'alter subscription.. refresh
> > publication with copy=true' when failover=true (on a similar line of
> > two-phase). The reason being, refresh with copy=true will go for
> > table-sync again and since failover was set in main-slot after
> > table-sync was done, it will need going through the same transition of
> > 'p' to 'e' for main slot making it unsyncable for that time. Should it
> > be allowed?
> >
>
> Yeah, I also think we can't allow refresh with copy=true when 'failover' is
> enabled.
>
> I think the current implementation of this flag seems a bit clumsy because
> 'failover' is a slot property and we are trying to map it to plugin_options. It has
> to be considered similar to the opt_temporary option while creating the slot.
>
> We have create_replication_slot and drop_replication_slot in repl_gram.y. How
> about if introduce alter_replication_slot and handle the 'failover' flag with that?
> The idea is we will either enable 'failover' at the time create_replication_slot by
> providing an optional failover option or execute a separate command
> alter_replication_slot. I think we probably need to perform this command
> before the start of streaming.
>
> I think we will have the following options to allow alter of the 'failover'
> property: (a) we can allow altering 'failover' only for the 'disabled' subscription;
> to achieve that, we need to open a connection during alter subscription and
> change this property of slot; (b) apply worker detects the change in 'failover'
> option; run the alter_replication_slot command; this needs more analysis as
> apply_worker is already doing streaming and changing slot property in
> between could be tricky.

I think one challenge with approach (b) is handling the error case. For
example, if the apply worker errors out while executing the
alter_replication_slot command, it may not be able to retry after restarting,
because it won't know whether the value was changed earlier. (Otherwise, we
would have to always execute alter_replication_slot at the start of the apply
worker, which doesn't seem great.)
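
To illustrate the retry problem, here is a minimal standalone sketch in C. It is
not PostgreSQL code: RemoteSlot, AlterOutcome, alter_slot() and worker_start()
are hypothetical names invented for this example, and the failure modes are
simulated. It assumes the alter command is idempotent, so reissuing it at every
worker start converges on the desired value whether or not the earlier attempt
reached the primary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the slot state kept on the primary. */
typedef struct
{
    bool failover;
} RemoteSlot;

/* Simulated results of one alter attempt (assumption, for illustration). */
typedef enum
{
    ALTER_OK,                /* command applied, worker saw success */
    ALTER_FAIL_BEFORE_APPLY, /* worker errored out, slot unchanged */
    ALTER_FAIL_AFTER_APPLY   /* slot changed, but worker errored out */
} AlterOutcome;

/*
 * Simulates sending an alter command for the slot. Returns true only if
 * the worker observed success. In both failure cases the worker sees the
 * same thing (an error), so it cannot tell whether the slot was changed.
 */
static bool
alter_slot(RemoteSlot *slot, bool value, AlterOutcome outcome)
{
    assert(slot != NULL);

    if (outcome != ALTER_FAIL_BEFORE_APPLY)
        slot->failover = value;

    return outcome == ALTER_OK;
}

/*
 * Startup path of the simulated worker: it keeps no record of earlier
 * attempts, so it reissues the alter unconditionally with the desired
 * value. This is the "always execute at the beginning" fallback.
 */
static bool
worker_start(RemoteSlot *slot, bool desired_failover, AlterOutcome outcome)
{
    return alter_slot(slot, desired_failover, outcome);
}
```

In this sketch the unconditional reissue is what makes the restart safe. The
cost is one extra command on every worker start, even when nothing changed,
which is the part that seems not great.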

Best Regards,
Hou zj
