Quick Links

Re: using an end-of-recovery record in all cases

From:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: using an end-of-recovery record in all cases
Date:	2021-09-04 09:51:14
Message-ID:	CAA4eK1LJ_0YGp8EPqXRL41cVQFC=DgGrx_7=_4v1hiDkAbMy7Q@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, Aug 10, 2021 at 12:31 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Thu, Jul 29, 2021 at 11:49 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > On Wed, Jul 28, 2021 at 3:28 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > > Imagine if instead of
> > > > all the hairy logic we have now we just replaced this whole if
> > > > (IsInRecovery) stanza with this:
> > > >
> > > > if (InRecovery)
> > > > CreateEndOfRecoveryRecord();
> > > >
> > > > That would be WAY easier to reason about than the rat's nest we have
> > > > here today. Now, I am not sure what it would take to get there, but I
> > > > think that is the direction we ought to be heading.
> > >
> > > What are we going to do in the single user ([1]) case in this awesome future?
> > > I guess we could just not create a checkpoint until single user mode is shut
> > > down / creates a checkpoint for other reasons?
> >
> > It probably depends on who writes this thus-far-hypothetical patch,
> > but my thought is that we'd unconditionally request a checkpoint after
> > writing the end-of-recovery record, same as we do now if (promoted).
> > If we happened to be in single-user mode, then that checkpoint request
> > would turn into performing a checkpoint on the spot in the one and
> > only process we've got, but StartupXLOG() wouldn't really need to care
> > what happens under the hood after it called RequestCheckpoint().
>
> I decided to try writing a patch to use an end-of-recovery record
> rather than a checkpoint record in all cases. The patch itself was
> pretty simple but didn't work. There are two different reasons why it
> didn't work, which I'll explain in a minute. I'm not sure whether
> there are any other problems; these are the only two things that cause
> problems with 'make check-world', but that's hardly a guarantee of
> anything. Anyway, I thought it would be useful to report these issues
> first and hopefully get some feedback.
>
> The first problem I hit was that GetRunningTransactionData() does
> Assert(TransactionIdIsNormal(CurrentRunningXacts->latestCompletedXid)).
> While that does seem like a superficially reasonable thing to assert,
> StartupXLOG() initializes latestCompletedXid by calling
> TransactionIdRetreat on the value extracted from checkPoint.nextXid,
> and the first-ever checkpoint that is written at the beginning of WAL
> has a nextXid of 3, so when starting up from that checkpoint nextXid
> is 2, which is not a normal XID. When we try to create the next
> checkpoint, CreateCheckPoint() does LogStandbySnapshot() which calls
> GetRunningTransactionData() and the assertion fails. In the current
> code, we avoid this more or less accidentally because
> LogStandbySnapshot() is skipped when starting from a shutdown
> checkpoint. If we write an end-of-recovery record and then trigger a
> checkpoint afterwards, then we don't avoid the problem. Although I'm
> impishly tempted to suggest that we #define SecondNormalTransactionId
> 4 and then use that to create the first checkpoint instead of
> FirstNormalTransactionId -- naturally with no comments explaining why
> we're doing it -- I think the real problem is that the assertion is
> just wrong. CurrentRunningXacts->latestCompletedXid shouldn't be
> InvalidTransactionId or BootstrapTransactionId, but
> FrozenTransactionId is a legal, if corner-case, value, at least as the
> code stands today.
>
> Unfortunately we can't just relax the assertion, because the
> XLOG_RUNNING_XACTS record will eventually be handed to
> ProcArrayApplyRecoveryInfo() for processing ... and that function
> contains a matching assertion which would in turn fail. It in turn
> passes the value to MaintainLatestCompletedXidRecovery() which
> contains yet another matching assertion, so the restriction to normal
> XIDs here looks pretty deliberate. There are no comments, though, so
> the reader is left to guess why. I see one problem:
> MaintainLatestCompletedXidRecovery uses FullXidRelativeTo, which
> expects a normal XID. Perhaps it's best to just dodge the entire issue
> by skipping LogStandbySnapshot() if latestCompletedXid happens to be
> 2,
>

By reading above explanation, it seems like it is better to skip
LogStandbySnapshot() for this proposal. Can't we consider skipping
LogStandbySnapshot() by passing some explicit flag-like
'recovery_checkpoint' or something like that?

I think there is some prior discussion in another thread related to
this work but it would be easier to follow if you can summarize the
same.

With Regards,
Amit Kapila.

In response to

using an end-of-recovery record in all cases at 2021-08-09 19:00:40 from Robert Haas

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Gilles Darold	2021-09-04 11:00:58	Re: [PATCH] Hooks at XactCommand level
Previous Message	Fabien COELHO	2021-09-04 06:27:00	Re: Avoid stuck of pbgench due to skipped transactions