From: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
---|---|
To: | masao(dot)fujii(at)oss(dot)nttdata(dot)com |
Cc: | osumi(dot)takamichi(at)fujitsu(dot)com, david(at)pgmasters(dot)net, pgsql-hackers(at)lists(dot)postgresql(dot)org, laurenz(dot)albe(at)cybertec(dot)at |
Subject: | Re: Stronger safeguard for archive recovery not to miss data |
Date: | 2021-04-05 07:13:27 |
Message-ID: | 20210405.161327.326176026982929551.horikyota.ntt@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
At Mon, 5 Apr 2021 12:34:53 +0900, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote in
>
>
> On 2021/04/04 11:58, osumi(dot)takamichi(at)fujitsu(dot)com wrote:
> >> IMO it's better to comment why this server restart is necessary.
> >> As far as I understand correctly, this is necessary to ensure the WAL
> >> file
> >> containing the record about the change of wal_level (to minimal) is
> >> archived,
> >> so that the subsequent archive recovery will be able to replay it.
> > OK, added some comments. Further, I felt the way I wrote this part was
> > not good at all and self-evident
> > and developers who read this test would feel uneasy about that point.
> > So, a little bit fixed that test so that we can get clearer conviction
> > for wal archive.
>
> LGTM. Thanks for updating the patch!
>
> Attached is the updated version of the patch. I applied the following
> changes.
+ errhint("Use a backup taken after setting wal_level to higher than minimal "
+ "or recover to the point in time before wal_level was changed to minimal even though it may cause data loss.")));
Looking the HINT message, I thought that it's hard to find where up to
I should recover. The caller knows the LSN of last PARAMETER_CHANGE
record so we can show that as the recovery-end LSN. On the other
hand, I'm not sure it's worth considering tough, we can reach here
when starting archive recovery on a server with wal_level=minimal. We
can pass 0 as LSN to notify that case.
If we do that, we can emit more clear message like the following.
WAL-while-minimal case
FATAL: WAL generated with wal_level=minimal at 0/40000A0, data may be missing
HINT: Use a backup later than, or recover up to before that LSN.
Mis-setting case
FATAL: archive recovery is not available on server with wal_level=minimal
HINT: Start this server with setting wal_level=replica or higher
The value "replica" is not double-quoted there (since before this
patch), but double-quoted in another error message about hot standby
just below. Maybe we should let them in the same style.
> Could you review this version? Barring any objection, I'm thinking to
> commit this.
>
> +sub check_wal_level
> +{
> + my ($target_wal_level, $explanation) = @_;
> +
> + is( $node->safe_psql(
> + 'postgres', q{show wal_level}),
> + $target_wal_level,
> + $explanation);
>
> Do we really need this test? This test doesn't seem to be directly
> related
> to the test purpose. It seems to be testing the behavior of the
> PostgresNode
> functions. So I removed this from the patch.
+1.
> +# This protection should apply to recovery mode
> +my $another_node = get_new_node('another_node');
>
> The same test is performed twice with different settings. So isn't it
> better
> to merge the code for both tests into one common function, to simplify
> the code? I did that.
Sounds reasonable for that size.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
From | Date | Subject | |
---|---|---|---|
Next Message | Bharath Rupireddy | 2021-04-05 07:14:06 | Re: Replication slot stats misgivings |
Previous Message | Amit Kapila | 2021-04-05 07:06:07 | Re: Any objections to implementing LogicalDecodeMessageCB for pgoutput? |