From: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
---|---|
To: | michael(at)paquier(dot)xyz |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org, ethmertz(at)amazon(dot)com, nathandbossart(at)gmail(dot)com, pgsql(at)j-davis(dot)com, sawada(dot)mshk(at)gmail(dot)com |
Subject: | Re: Incorrect handling of OOM in WAL replay leading to data loss |
Date: | 2023-08-02 04:16:02 |
Message-ID: | 20230802.131602.744103748768663215.horikyota.ntt@gmail.com |
Lists: | pgsql-hackers |
At Tue, 01 Aug 2023 15:28:54 +0900 (JST), Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote in
> I thought that the failure on a standby results in continuing to retry
> reading the next record. However, I found that there's a case where
> the startup process stops in response to OOM [1].
I've examined the calls to
MemoryContextAllocExtended(..,MCXT_ALLOC_NO_OOM). In the server recovery
path, XLogDecodeNextRecord is the only function that uses it.
So, there doesn't seem to be a problem here. I proceeded to test the
idea of only verifying headers after an allocation failure, and I've
attached a PoC.
- allocate_recordbuf() ensures a minimum of SizeOfXLogRecord bytes
even when it returns false, indicating an allocation failure.
- If allocate_recordbuf() returns false, XLogDecodeNextRecord()
continues to read pages and perform header checks until
total_len is reached, without copying data (except for the logical
record header, when the first page didn't store the entire header).
- If all relevant WAL pages are consistent, ReadRecord concludes with
an 'out of memory' ERROR, which then escalates to FATAL.
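The steps above can be illustrated with a small standalone sketch. This is not the attached PoC and not PostgreSQL code: the toy page header, the names toy_read_record and TOY_*, the 100-byte page payload, and the alloc_limit parameter used to simulate an allocation failure are all hypothetical. It only shows the shape of the idea: after a failed allocation, keep checking each page header up to total_len without copying data, report "invalid record" if any header is inconsistent, and report OOM only if every page checks out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* All names below are hypothetical; this is a toy model, not PostgreSQL code. */
#define TOY_PAGE_PAYLOAD 100u    /* record bytes carried by one page */
#define TOY_PAGE_MAGIC   0xD113u /* made-up page magic number */

typedef struct ToyPageHeader
{
    uint16_t magic;   /* must equal TOY_PAGE_MAGIC */
    uint32_t rem_len; /* record bytes still remaining at the start of a
                       * continuation page; ignored on the first page */
} ToyPageHeader;

typedef enum ToyReadResult
{
    TOY_READ_OK,      /* record fully read */
    TOY_READ_INVALID, /* header check failed: treat as end of WAL */
    TOY_READ_OOM      /* all headers consistent, but no memory for the record */
} ToyReadResult;

static ToyReadResult
toy_read_record(const ToyPageHeader *pages, size_t npages,
                uint32_t total_len, size_t alloc_limit)
{
    char     *buf = NULL;
    bool      header_only;
    uint32_t  consumed = 0;
    size_t    pageno = 0;

    assert(pages != NULL);

    /* Simulated allocation: requests above alloc_limit "fail". */
    if (total_len <= alloc_limit)
        buf = malloc(total_len > 0 ? total_len : 1);
    header_only = (buf == NULL);

    while (consumed < total_len)
    {
        uint32_t chunk = total_len - consumed;

        /* Header checks run whether or not we have a buffer. */
        if (pageno >= npages ||
            pages[pageno].magic != TOY_PAGE_MAGIC ||
            (pageno > 0 && pages[pageno].rem_len != chunk))
        {
            free(buf);
            return TOY_READ_INVALID;
        }

        if (chunk > TOY_PAGE_PAYLOAD)
            chunk = TOY_PAGE_PAYLOAD;

        /* A real reader would copy "chunk" bytes into buf here,
         * and skip the copy when header_only is true. */
        consumed += chunk;
        pageno++;
    }

    free(buf);
    return header_only ? TOY_READ_OOM : TOY_READ_OK;
}
```

In this toy model, a garbage total_len fails the continuation-length check on the very next page, so it is reported as an invalid record rather than as OOM.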
I believe this approach is sufficient to determine whether the error
is OOM or not. If total_len is corrupted and has an excessively large
value, it's highly unlikely that all subsequent pages for that length
will be consistent.
Do you have any thoughts on this?
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachment | Content-Type | Size |
---|---|---|
PoC_continue_record_verification_after_OOM.txt | text/plain | 5.3 KB |