From: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
---|---|
To: | bossartn(at)amazon(dot)com |
Cc: | david(at)pgmasters(dot)net, peter(dot)eisentraut(at)2ndquadrant(dot)com, andres(at)anarazel(dot)de, michael(at)paquier(dot)xyz, pgsql-hackers(at)lists(dot)postgresql(dot)org, jtc331(at)gmail(dot)com, robertmhaas(at)gmail(dot)com |
Subject: | Re: Make mesage at end-of-recovery less scary. |
Date: | 2021-11-08 05:59:46 |
Message-ID: | 20211108.145946.1513355777186578917.horikyota.ntt@gmail.com |
Lists: | pgsql-hackers |
At Fri, 22 Oct 2021 17:54:40 +0000, "Bossart, Nathan" <bossartn(at)amazon(dot)com> wrote in
> On 3/4/21, 10:50 PM, "Kyotaro Horiguchi" <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > As the result, the following messages are emitted with the attached.
>
> I'd like to voice my support for this effort, and I intend to help
> review the patch. It looks like the latest patch no longer applies,
> so I've marked the commitfest entry [0] as waiting-on-author.
>
> Nathan
>
> [0] https://commitfest.postgresql.org/35/2490/
Sorry for the late reply. I rebased this onto the current master. Changes in v5:
- rebased
- use LSN_FORMAT_ARGS instead of bare shift and mask.
- v4 made walreceiver exit immediately on disconnection. Maybe I wanted
to avoid seeing a FATAL message on the standby after the primary dies.
However, that is a separate issue and that change was plain wrong. v5
just removes the "end-of-WAL" part from the message, which duplicates
what the startup process emits.
- add a new error message "missing contrecord at %X/%X". Maybe this
should be regarded as a leftover of the contrecord patch. In the
attached patch the "%X/%X" is the LSN of the current record. The log
messages look like this (from the 026_overwrite_contrecord test):
LOG: redo starts at 0/1486CB8
WARNING: missing contrecord at 0/1FFC2E0
LOG: consistent recovery state reached at 0/1FFC2E0
LOG: started streaming WAL from primary at 0/2000000 on timeline 1
LOG: successfully skipped missing contrecord at 0/1FFC2E0, overwritten at 2021-11-08 14:50:11.969952+09
CONTEXT: WAL redo at 0/2000028 for XLOG/OVERWRITE_CONTRECORD: lsn 0/1FFC2E0; time 2021-11-08 14:50:11.969952+09
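For readers unfamiliar with the "%X/%X" notation used in the messages above, here is a minimal sketch of how a 64-bit LSN is split into two 32-bit halves for printing. This is not code from the attached patch: the `XLogRecPtr` typedef and the `LSN_FORMAT_ARGS` macro are simplified stand-ins for the PostgreSQL definitions, and `format_lsn` is a hypothetical helper added only for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for PostgreSQL's XLogRecPtr (a 64-bit WAL position). */
typedef uint64_t XLogRecPtr;

/*
 * Simplified stand-in for LSN_FORMAT_ARGS: expands to the high and low
 * 32-bit halves of the LSN, matching a "%X/%X" format string.
 */
#define LSN_FORMAT_ARGS(lsn) \
	((unsigned int) ((lsn) >> 32)), ((unsigned int) ((lsn) & 0xFFFFFFFFu))

/*
 * Hypothetical helper: format an LSN as "%X/%X" into buf.
 * Returns what snprintf returns (number of characters needed).
 */
int
format_lsn(char *buf, size_t buflen, XLogRecPtr lsn)
{
	return snprintf(buf, buflen, "%X/%X", LSN_FORMAT_ARGS(lsn));
}
```

Using one macro at every call site keeps the shift and mask in a single place instead of repeating them in each message.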
While checking the behavior in the missing-contrecord case, I
noticed that emode_for_corrupt_record() doesn't work as expected, since
readSource is reset to XLOG_FROM_ANY after a read failure. We could
remember the last failed source, but pg_wal must already have been
visited if a page read error happened, so I changed the function to
treat XLOG_FROM_ANY the same way as XLOG_FROM_PG_WAL.
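The idea can be sketched roughly as follows. This is a self-contained model, not the patch itself: the types, the enum values, and the `readSource` global are simplified stand-ins for what lives in xlog.c, and the body is an assumption about the shape of the change based on the description above.

```c
#include <stdint.h>

/* Simplified stand-ins for the PostgreSQL definitions. */
typedef uint64_t XLogRecPtr;

typedef enum
{
	XLOG_FROM_ANY = 0,			/* no specific source; also set after a failure */
	XLOG_FROM_ARCHIVE,
	XLOG_FROM_PG_WAL,
	XLOG_FROM_STREAM
} XLogSource;

/* Log levels; only their relative identity matters in this sketch. */
enum { DEBUG1 = 14, LOG = 15 };

/* Models the file-level variable that records the last read source. */
XLogSource	readSource = XLOG_FROM_ANY;

/*
 * Lower the level of a repeated complaint about the same record.
 * XLOG_FROM_ANY is handled like XLOG_FROM_PG_WAL, on the assumption that
 * pg_wal has already been tried by the time readSource was reset.
 */
int
emode_for_corrupt_record(int emode, XLogRecPtr RecPtr)
{
	static XLogRecPtr lastComplaint = 0;

	if ((readSource == XLOG_FROM_PG_WAL || readSource == XLOG_FROM_ANY) &&
		emode == LOG)
	{
		if (RecPtr == lastComplaint)
			emode = DEBUG1;		/* already reported at this position */
		else
			lastComplaint = RecPtr;
	}
	return emode;
}
```

In this model, a second complaint at the same position is demoted to DEBUG1 even when readSource has already been reset to XLOG_FROM_ANY.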
(Otherwise we see "LOG: reached end-of-WAL at .." message after
"WARNING: missing contrecord at.." message.)
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachment | Content-Type | Size |
---|---|---|
v5-0001-Make-End-Of-Recovery-error-less-scary.patch | text/x-patch | 9.5 KB |