Re: Crash by targetted recovery

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Crash by targetted recovery
Date: 2020-02-27 11:04:41
Message-ID: a74bfa5d-3a86-90b8-286e-9ff8822bd8e1@oss.nttdata.com
Lists: pgsql-hackers

On 2020/02/27 17:05, Kyotaro Horiguchi wrote:
> Thank you for the comment.
> At Thu, 27 Feb 2020 16:23:44 +0900, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote in
>> On 2020/02/27 15:23, Kyotaro Horiguchi wrote:
>>>> I failed to understand why random access while reading from
>>>> stream is a bad idea. Could you elaborate why?
>>> It seems to me the word "streaming" suggests that WAL record should be
>>> read sequentially. Random access, which means reading from arbitrary
>>> location, breaks a stream. (But the patch doesn't try to stop wal
>>> sender if randAccess.)
>>>
>>>> Isn't it sufficient to set currentSource to 0 when disabling
>>>> StandbyMode?
>>> I thought of that and it should work, but I hesitated to manipulate
>>> currentSource in StartupXLOG. currentSource is basically a private
>>> state of WaitForWALToBecomeAvailable. ReadRecord modifies it but I
>>> think it's not good to modify it outside of the logic in
>>> WaitForWALToBecomeAvailable.
>>
>> If so, what about adding the following at the top of
>> WaitForWALToBecomeAvailable()?
>>
>>     if (!StandbyMode && currentSource == XLOG_FROM_STREAM)
>>         currentSource = 0;
>
> It works virtually the same way. I'm happy to do that if you don't
> agree to using randAccess. But I'd rather do that in the 'if
> (!InArchiveRecovery)' section.

The approach using randAccess seems unsafe. Consider the case
where currentSource is changed to XLOG_FROM_ARCHIVE because
randAccess is true while walreceiver is still running. For
example, this can happen when the record at the REDO starting
point is fetched with randAccess = true after walreceiver has
been started to fetch the last checkpoint record. The state
"currentSource != XLOG_FROM_STREAM while walreceiver is
running" seems invalid. No?

So I think the approach I proposed is better.
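
To make the difference concrete, here is a minimal standalone sketch.
It is not the actual xlog.c code. The RecoveryState struct, the
walreceiver_running flag, and all function names are hypothetical
stand-ins for illustration, and the randAccess function only models
the behavior described above as I understand it.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the WAL source enum; 0 means "no source chosen". */
typedef enum
{
    XLOG_FROM_ANY = 0,
    XLOG_FROM_ARCHIVE,
    XLOG_FROM_PG_WAL,
    XLOG_FROM_STREAM
} XLogSource;

/* Hypothetical bundle of the state discussed above (globals in xlog.c). */
typedef struct
{
    bool        StandbyMode;
    XLogSource  currentSource;
    bool        walreceiver_running;    /* illustrative flag only */
} RecoveryState;

/* Proposed check: leave the stream source only once standby mode is off. */
static void
reset_source_if_not_standby(RecoveryState *st)
{
    if (!st->StandbyMode && st->currentSource == XLOG_FROM_STREAM)
        st->currentSource = XLOG_FROM_ANY;
}

/* Model of the randAccess approach: a random-access read switches source. */
static void
reset_source_if_rand_access(RecoveryState *st, bool randAccess)
{
    if (randAccess && st->currentSource == XLOG_FROM_STREAM)
        st->currentSource = XLOG_FROM_ARCHIVE;
}

/* The state called invalid above: walreceiver running, source not STREAM. */
static bool
state_is_valid(const RecoveryState *st)
{
    return !st->walreceiver_running ||
           st->currentSource == XLOG_FROM_STREAM;
}

/* Walk through both approaches from the same starting state. */
void
demo_scenarios(void)
{
    /* Standby mode on, streaming, walreceiver still running. */
    RecoveryState a = {.StandbyMode = true,
                       .currentSource = XLOG_FROM_STREAM,
                       .walreceiver_running = true};
    RecoveryState b = a;

    /* randAccess model: source changes under a running walreceiver. */
    reset_source_if_rand_access(&a, true);
    assert(a.currentSource == XLOG_FROM_ARCHIVE);
    assert(!state_is_valid(&a));

    /* Proposed check: nothing changes while StandbyMode is still on. */
    reset_source_if_not_standby(&b);
    assert(b.currentSource == XLOG_FROM_STREAM);
    assert(state_is_valid(&b));
}
```

In this model the proposed check touches currentSource only after
StandbyMode has been turned off, so it cannot produce the state
described above while standby mode is still active.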

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
