From: | Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com> |
---|---|
To: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
Cc: | michael(at)paquier(dot)xyz, masao(dot)fujii(at)oss(dot)nttdata(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: [BUG] non archived WAL removed during production crash recovery |
Date: | 2020-04-21 22:00:22 |
Message-ID: | 20200422000022.7aa6a62b@firost |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs pgsql-hackers |
On Tue, 21 Apr 2020 15:08:17 +0900 (JST)
Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> At Tue, 21 Apr 2020 13:57:39 +0900, Michael Paquier <michael(at)paquier(dot)xyz>
> wrote in
> > On Tue, Apr 21, 2020 at 12:09:25PM +0900, Kyotaro Horiguchi wrote:
> > > + if (!XLogArchivingAlways() &&
> > > + GetRecoveryState() == RECOVERY_STATE_ARCHIVE)
> > >
> > > Is rewritten as
> > >
> > > + if (!XLogArchivingAlways() &&
> > > + GetDBState() > DB_IN_CRASH_RECOVERY)
> > >
> > > FWIW, what annoyed me is there are three variables that are quite
> > > similar but has different domains, ControlFile->state,
> > > XLogCtl->SharedRecoveryState, and LocalRecoveryInProgress. I didn't
> > > mind there were two, but three seems a bit too many to me.
In fact, my original goal [1] was to avoid adding another shared boolean related
to the same topic. So I understand your feeling.
I played with this idea again, based on your argument that there's no problem
if the update to DB_IN_ARCHIVE_RECOVERY reaches checkpointer with some delay.
The other point I feel bad with is to open and check the controlfile again and
again for each segment to archive, even on running production instance.
It's not that it would be heavy, but it feels overkill to fetch this information
that should be available more easily.
That leads me an idea where we would keep the ControlFile data up-to-date in
shared memory. There's a few duplicates between ControlFile and XLogCtl, so
maybe it could make the code a little simpler at some other places than just
fixing $SUBJECT using DBState? I'm not sure of the implications and impacts
though. This seems way bigger than the current fix and with many traps on the
way. Maybe we could discuss this in another thread if you think it deserves it?
Regards,
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2020-04-21 22:31:19 | Re: Bug with memory leak on cert validation in libpq |
Previous Message | Euler Taveira | 2020-04-21 19:47:44 | Re: BUG #16354: No geos 3.8.1 package for RHEL 8 |
From | Date | Subject | |
---|---|---|---|
Next Message | Bruce Momjian | 2020-04-21 22:10:54 | Re: Do we need to handle orphaned prepared transactions in the server? |
Previous Message | Alvaro Herrera | 2020-04-21 21:57:26 | Re: Remove non-fast promotion Re: Should we remove a fallback promotion? take 2 |