From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
Cc: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Subject: | Re: Synchronizing slots from primary to standby |
Date: | 2024-04-11 04:11:04 |
Message-ID: | CAA4eK1Ktspr+nCieMOHDLQL0nQR4PMRVv_F+rqWkLY3Q-32wPg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Apr 10, 2024 at 5:28 PM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Thursday, April 4, 2024 5:37 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > BTW, while thinking on this one, I
> > noticed that in the function LogicalConfirmReceivedLocation(), we first update
> > the disk copy, see comment [1] and then in-memory whereas the same is not
> > true in
> > update_local_synced_slot() for the case when snapshot exists. Now, do we have
> > the same risk here in case of standby? Because I think we will use these xmins
> > while sending the feedback message (in XLogWalRcvSendHSFeedback()).
> >
> > * We have to write the changed xmin to disk *before* we change
> > * the in-memory value, otherwise after a crash we wouldn't know
> > * that some catalog tuples might have been removed already.
>
> Yes, I think we have the risk on the standby, I can reproduce the case that if
> the server crashes after updating the in-memory value and before saving them to
> disk, the synced slot could be invalidated after restarting from crash, because
> the necessary rows have been removed on the primary. The steps can be found in
> [1].
>
> I think we'd better fix the order in update_local_synced_slot() as well. I
> tried to make the fix in 0002, 0001 is Shveta's patch to fix another issue in this thread. Since
> they are touching the same function, so attach them together for review.
>
Few comments:
===============
1.
+
+ /* Sanity check */
+ if (slot->data.confirmed_flush != remote_slot->confirmed_lsn)
+ ereport(LOG,
+ errmsg("synchronized confirmed_flush for slot \"%s\" differs from
remote slot",
+ remote_slot->name),
Is there a reason to use elevel as LOG instead of ERROR? I think it
should be elog(ERROR, .. as this is an unexpected case.
2.
- if (remote_slot->restart_lsn < slot->data.restart_lsn)
+ if (remote_slot->confirmed_lsn < slot->data.confirmed_flush)
elog(ERROR,
"cannot synchronize local slot \"%s\" LSN(%X/%X)"
Can we be more specific in this message? How about splitting it into
error_message as "cannot synchronize local slot \"%s\"" and then
errdetail as "Local slot's start streaming location LSN(%X/%X) is
ahead of remote slot's LSN(%X/%X)"?
--
With Regards,
Amit Kapila.
From | Date | Subject | |
---|---|---|---|
Next Message | Pavel Stehule | 2024-04-11 05:34:08 | Re: Schema variables - new implementation for Postgres 15 |
Previous Message | Thomas Munro | 2024-04-11 04:07:15 | Re: DROP DATABASE is interruptible |