From: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
---|---|
To: | masao(dot)fujii(at)oss(dot)nttdata(dot)com |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Crash by targetted recovery |
Date: | 2020-02-27 08:05:30 |
Message-ID: | 20200227.170530.2264412619413068664.horikyota.ntt@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Thank you for the comment.
At Thu, 27 Feb 2020 16:23:44 +0900, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote in
> On 2020/02/27 15:23, Kyotaro Horiguchi wrote:
> >> I failed to understand why random access while reading from
> >> stream is bad idea. Could you elaborate why?
> > It seems to me the word "streaming" suggests that WAL record should be
> > read sequentially. Random access, which means reading from arbitrary
> > location, breaks a stream. (But the patch doesn't try to stop wal
> > sender if randAccess.)
> >
> >> Isn't it sufficient to set currentSource to 0 when disabling
> >> StandbyMode?
> > I thought that and it should work, but I hesitated to manipulate on
> > currentSource in StartupXLOG. currentSource is basically a private
> > state of WaitForWALToBecomeAvailable. ReadRecord modifies it but I
> > think it's not good to modify it out of the the logic in
> > WaitForWALToBecomeAvailable.
>
> If so, what about adding the following at the top of
> WaitForWALToBecomeAvailable()?
>
> if (!StandbyMode && currentSource == XLOG_FROM_STREAM)
> currentSource = 0;
It works virtually the same way. I'm happy to do that if you don't
agree to using randAccess. But I'd rather do that in the 'if
(!InArchiveRecovery)' section.
> > Come to think of that I got to think the
> > following part in ReadRecord should use randAccess instead..
> > xlog.c:4384
> >> /*
> > - * Before we retry, reset lastSourceFailed and currentSource
> > - * so that we will check the archive next.
> > + * Streaming has broken, we retry from the same LSN.
> >> */
> >> lastSourceFailed = false;
> > - currentSource = 0;
> > + private->randAccess = true;
>
> Sorry, I failed to understand why this change is necessary...
It's not necessary, just for being tidy about the responsibility on
currentSource.
> At least the comment that you added seems incorrect
> because WAL streaming should not have started yet when
> we reach the above point.
Oops, right.
- * Streaming has broken, we retry from the same LSN.
+ * Restart recovery from the current LSN.
For clarity, I don't insist on the change at all. If it were
necessary, it's another topic, anyway. Please forget it.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachment | Content-Type | Size |
---|---|---|
v2-0001-TAP-test-for-a-crash-bug.patch | text/x-patch | 2.1 KB |
v2-0002-Fix-a-crash-bug-of-targetted-promotion.patch | text/x-patch | 1.9 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Julien Rouhaud | 2020-02-27 08:07:35 | Re: reindex concurrently and two toast indexes |
Previous Message | Julien Rouhaud | 2020-02-27 07:45:35 | Re: Collation versioning |