From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Fix assert failure when decoding XLOG_PARAMETER_CHANGE on primary |
Date: | 2025-02-12 11:52:47 |
Message-ID: | CAA4eK1+4AQRZK5VcmkVWSqquBFB6U1uy1=qtqw9YJ85U3Gi-Ag@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Feb 12, 2025 at 1:16 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Mon, Feb 10, 2025 at 4:11 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Fri, Feb 7, 2025 at 11:30 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > On Thu, Feb 6, 2025 at 9:30 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > >
> > > > True but it sounds like there is more harm than benefit. It seems
> > > > reasonable to do this on HEAD. Shall we think of doing it differently
> > > > in HEAD and back-branches or let's restrict as your v2 patch is doing
> > > > and if by any chance users complain about it we can try to fix it in
> > > > another way?
> > >
> > > While the latter would be good for maintainability it would not be
> > > advisable to change the behavior back and forth in back branches. I'd
> > > like to make it clear what point of v2 patch (restring server startup
> > > for pre-existing logical slots) is preferable over the first patch
> > > (removing InHotStandby condition)?
> > >
> >
> > Sorry, I didn't understand your question.
>
> I wanted to mean that there is another direction to fix this issue by
> applying the v1 patch to both HEAD and back-branches. It might be
> worth considering this option too if the v1 patch's approach is
> reasonable.
>
> >
> > > If the approach of the first patch
> > > is reasonably good, we could take it both in HEAD and backbranches.
> > >
> > > >
> > > > >
> > > > > But, similarly, is
> > > > > there any concerns of my proposed fix?
> > > > >
> > > >
> > > > - if (InRecovery && InHotStandby &&
> > > > + if (InRecovery &&
> > > > xlrec.wal_level < WAL_LEVEL_LOGICAL &&
> > > >
> > > > Won't the InRecovery be true even for crash-recovery as well? I think
> > > > we need to check standby mode here.
> > >
> > > I think if we check standby mode here (i.e., adding StandbyMode), we
> > > would not be able to invalidate logical slots during archive recovery.
> > > That is, in the following scenario, we would hit the same assertion
> > > failure:
> > >
> > > 1. setup the primary and the standby servers (with hot_standby=on).
> > > 2. create a logical slot on the standby.
> > > 3. on standby, set archive recovery settings (setting restore_command,
> > > removing standby.signal, and creating recovery.signal etc.).
> > > 4. on primary, lower wal_level to replica and restart.
> > > 5. start standby (in archive recovery mode).
> > > 6. execute the logical decoding after the archive recovery completes.
> > >
> > > And, you're right that InRecovery is true even for crash-recovery. But
> > > is there any case where we invalidate logical slots due to
> > > XLOG_PARAMETER_CHANGE during a crash recovery?
> > >
> >
> > I am not sure but what about the cases where the server crashes
> > immediately after logging this WAL record and the user reduced the
> > wal_lvel before the next restart? I think in such a case we may ERROR
> > out while restoring slots from the disk which will be before we will
> > try to replay the WAL.
>
> I think so too.
>
> > However, I am not sure if it is a good idea to
> > rely on that assumption.
>
> Agreed, I'm fine with leaving InRecovery in this condition. I think
> the point is whether we should add StandbyMode to the condition or
> not. I think if we do that, we would end up with the same error in the
> above scenario I described. So does the following condition make
> sense?
>
> if (InRecovery &&
> xlrec.wal_level < WAL_LEVEL_LOGICAL &&
> wal_level >= WAL_LEVEL_LOGICAL)
> InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_LEVEL,
> 0, InvalidOid,
> InvalidTransactionId);
>
This will still be true for crash-recovery as the InRecovery flag will
be true for that case as well. I think we should go with your v2 patch
approach for HEAD and back-branches.
--
With Regards,
Amit Kapila.
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Langote | 2025-02-12 11:53:50 | Re: generic plans and "initial" pruning |
Previous Message | Fujii Masao | 2025-02-12 11:32:11 | Re: Add “FOR UPDATE NOWAIT” lock details to the log. |