From: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
---|---|
To: | osumi(dot)takamichi(at)fujitsu(dot)com |
Cc: | masao(dot)fujii(at)oss(dot)nttdata(dot)com, david(at)pgmasters(dot)net, pgsql-hackers(at)lists(dot)postgresql(dot)org, laurenz(dot)albe(at)cybertec(dot)at |
Subject: | Re: Stronger safeguard for archive recovery not to miss data |
Date: | 2021-04-06 06:24:21 |
Message-ID: | 20210406.152421.159594903939861443.horikyota.ntt@gmail.com |
Lists: | pgsql-hackers |
At Tue, 6 Apr 2021 04:11:35 +0000, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com> wrote in
> On Tuesday, April 6, 2021 9:41 AM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
> > On 2021/04/05 23:54, osumi(dot)takamichi(at)fujitsu(dot)com wrote:
> > >> This makes me think that we should document this risk.... Thought?
> > > +1. We should notify the risk when user changes
> > > the wal_level higher than minimal to minimal to invoke a carefulness
> > > of user for such kind of operation.
> >
> > I removed the HINT message "or recover to the point in ..." and added the
> > following note into the docs.
> >
> > Note that changing <varname>wal_level</varname> to
> > <literal>minimal</literal> makes any base backups taken before
> > unavailable for archive recovery and standby server, which may
> > lead to database loss.
> Thank you for updating the patch. Let's make the sentence more strict.
>
> My suggestion for this explanation is
> "In order to prevent database corruption, changing
> wal_level to minimal from higher level in the middle of
> WAL archiving requires careful attention. It makes any base backups
> taken before the operation unavailable for archive recovery
> and standby server. Also, it may lead to whole database loss when
> archive recovery fails with an error for that change.
> Take a new base backup immediately after making wal_level back to higher level."
The first sentence looks somewhat nanny-ish. The database is not
corrupt at the time of this error. We just lose updates after the last
read segment at this point. As Fujii-san said, we can continue
recovering using crash recovery, and we will end up with a corrupt
database after that.
About the last sentence, I prefer flatter wording, such as "You need
to take a new base backup..."
> Then, we can be consistent with our new hint message,
> "Use a backup taken after setting wal_level to higher than minimal.".
>
> Is it better to add something similar to "Take an offline backup when you stop the server
> and change the wal_level" around the end of this part as another option for safeguard, also?
Backup policy is entirely a matter for DBAs. If flipping wal_level
alone is highly likely to cause unstartable corruption, I think it is a bug.
> For the performance technique part, what we need to explain is same.
Might be good, but in simpler wording.
> Another minor thing I felt we need to do might be to add double quotes to wrap minimal in errhint.
Since the error about hot_standby is gone, either will do for me.
> Other errhints do so when we use it in a sentence.
>
> There is no more additional comment from me !
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center