From: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Subject: | RE: Synchronizing slots from primary to standby |
Date: | 2024-01-12 14:21:14 |
Message-ID: | OS0PR01MB571641EF50C4B99A5DB3C601946F2@OS0PR01MB5716.jpnprd01.prod.outlook.com |
Lists: | pgsql-hackers |
On Friday, January 12, 2024 8:00 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
Hi,
>
> On Thu, Jan 11, 2024 at 7:53 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Tue, Jan 9, 2024 at 6:39 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > +static bool
> > > +synchronize_one_slot(WalReceiverConn *wrconn, RemoteSlot *remote_slot)
> > > {
> > > ...
> > > + /* Slot ready for sync, so sync it. */
> > > + else
> > > + {
> > > + /*
> > > + * Sanity check: With hot_standby_feedback enabled and
> > > + * invalidations handled appropriately as above, this should never
> > > + * happen.
> > > + */
> > > + if (remote_slot->restart_lsn < slot->data.restart_lsn)
> > > + elog(ERROR,
> > > + "cannot synchronize local slot \"%s\" LSN(%X/%X)"
> > > + " to remote slot's LSN(%X/%X) as synchronization"
> > > + " would move it backwards", remote_slot->name,
> > > + LSN_FORMAT_ARGS(slot->data.restart_lsn),
> > > + LSN_FORMAT_ARGS(remote_slot->restart_lsn));
> > > ...
> > > }
> > >
> > > I was thinking about the above code in the patch and, as far as I can
> > > tell, this can only occur if a slot with the same name is re-created
> > > with an earlier restart_lsn after the existing slot is dropped.
> > > Normally, the newly created slot (with the same name) will have a
> > > higher restart_lsn, but one can mimic it by copying an older slot
> > > using pg_copy_logical_replication_slot().
> > >
> > > Contrary to what the comment says, I don't think the above can happen
> > > even if hot_standby_feedback is temporarily set to off. That can only
> > > lead to invalidated slots on the standby.
> > >
> > > To close the above race, I could think of the following ways:
> > > 1. Drop and re-create the slot.
> > > 2. Emit LOG/WARNING in this case and once remote_slot's LSN moves
> > > ahead of local_slot's LSN then we can update it; but as mentioned in
> > > your previous comment, we need to update all other fields as well.
> > > If we follow this then we probably need to have a check for
> > > catalog_xmin as well.
> > >
> >
> > The second point as mentioned is slightly misleading, so let me try to
> > rephrase it once again: Emit LOG/WARNING in this case and once
> > remote_slot's LSN moves ahead of local_slot's LSN then we can update
> > it; additionally, we need to update all other fields like two_phase as
> > well. If we follow this then we probably need to have a check for
> > catalog_xmin as well, along with remote_slot's restart_lsn.
> >
> > > Now, related to this the other case which needs some handling is
> > > what if the remote_slot's restart_lsn is greater than local_slot's
> > > restart_lsn but it is a re-created slot with the same name. In that
> > > case, I think the other properties like 'two_phase', 'plugin' could
> > > be different. So, is simply copying those sufficient or do we need
> > > to do something else as well?
> > >
> >
> > Bertrand, Dilip, Sawada-San, and others, please share your opinion on
> > this problem as I think it is important to handle this race condition.
>
> Is there any good use case of copying a failover slot in the first place? If it's not
> a normal use case and we can probably live without it, why not always disable
> failover during the copy? FYI we always disable two_phase on copied slots. It
> seems to me that copying a failover slot could lead to problems, as long as we
> synchronize slots based on their names. IIUC without the copy, this code path
> should never be reached.
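To make the suggestion concrete, here is a minimal sketch of "always disable failover during the copy", mirroring how two_phase is already disabled on copied slots. DemoSlot and demo_copy_slot are hypothetical names used only for illustration; this is not the code in copy_replication_slot.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a slot's persistent data. */
typedef struct DemoSlot
{
	uint64_t	restart_lsn;	/* oldest WAL position the slot still needs */
	bool		two_phase;
	bool		failover;
} DemoSlot;

/*
 * Copy a slot: keep its position, but always clear two_phase and failover
 * on the copy. The source slot is left untouched.
 */
static DemoSlot
demo_copy_slot(const DemoSlot *src)
{
	DemoSlot	dst = *src;

	dst.two_phase = false;
	dst.failover = false;
	return dst;
}
```

With this rule a copied slot is never failover-enabled, so copying an older slot cannot by itself produce a same-named synced slot whose restart_lsn is behind the local one.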
Thanks for the suggestion. I also don't have a use case for this.
Attached is the V61 patch set, which addresses this suggestion. Here is a
summary of the changes made in each patch.
V61-0001
1. Reverts the changes in copy_replication_slot.
V61-0002
1. Adds documentation for the steps that a user needs to follow to ensure the
standby is ready for failover.
2. Directly updates the fields restart_lsn/confirmed_flush/catalog_xmin instead
of using APIs like LogicalConfirmReceivedLocation.
3. Updates all the fields (two_phase, failover, plugin) when syncing the slots.
4. Fixes CFbot failures.
5. Some code style adjustments (pending comments from the last version).
6. Removes some unnecessary Asserts and variable assignments (off-list comments from Peter).
Thanks Shveta for working on 4 and 5.
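As a rough illustration of items 2 and 3 above, the following sketch overwrites the local slot's fields directly from the remote slot and refuses to move restart_lsn backwards. DemoSlotState and demo_sync_slot are hypothetical names, and returning false to the caller in the backwards case is an assumption of this sketch (the code quoted earlier raises an ERROR there); it is not the code in the patch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_NAMELEN 64

/* Hypothetical, simplified view of the slot fields that get synced. */
typedef struct DemoSlotState
{
	uint64_t	restart_lsn;
	uint64_t	confirmed_flush;
	uint32_t	catalog_xmin;
	bool		two_phase;
	bool		failover;
	char		plugin[DEMO_NAMELEN];
} DemoSlotState;

/*
 * Sync the local slot from the remote one. Returns false and leaves the
 * local slot untouched if that would move restart_lsn backwards; otherwise
 * copies every synced field directly and returns true.
 */
static bool
demo_sync_slot(DemoSlotState *local, const DemoSlotState *remote)
{
	if (remote->restart_lsn < local->restart_lsn)
		return false;

	/* Position fields are assigned directly, without helper APIs. */
	local->restart_lsn = remote->restart_lsn;
	local->confirmed_flush = remote->confirmed_flush;
	local->catalog_xmin = remote->catalog_xmin;

	/* The remaining properties are refreshed as well. */
	local->two_phase = remote->two_phase;
	local->failover = remote->failover;
	memcpy(local->plugin, remote->plugin, sizeof(local->plugin));

	return true;
}
```

Copying every property on each sync means that a re-created remote slot with a different plugin or two_phase setting is reflected on the standby, not only its LSNs.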
V61-0003
1. Some documentation updates related to standby_slot_names and the steps for
failover.
V61-0004
- No change.
Best Regards,
Hou zj
Attachment | Content-Type | Size |
---|---|---|
v61-0002-Add-logical-slot-sync-capability-to-the-physical.patch | application/octet-stream | 86.8 KB |
v61-0003-Allow-logical-walsenders-to-wait-for-the-physica.patch | application/octet-stream | 41.1 KB |
v61-0004-Non-replication-connection-and-app_name-change.patch | application/octet-stream | 9.7 KB |
v61-0001-Enable-setting-failover-property-for-a-slot-thro.patch | application/octet-stream | 100.3 KB |