Re: The XLogFindNextRecord() routine find incorrect record start point after a long continuation record

From: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: The XLogFindNextRecord() routine find incorrect record start point after a long continuation record
Date: 2019-11-06 05:06:55
Message-ID: c57afeaf-4cf5-84a9-d19b-738f3f801e78@postgrespro.ru
Lists: pgsql-bugs

On 06/11/2019 09:41, Michael Paquier wrote:
> On Wed, Nov 06, 2019 at 07:40:48AM +0500, Andrey Lepikhov wrote:
>> I found this in our multimaster project on PostgreSQL 11.5. It is difficult
>> to reproduce this error, but I will try to do it if necessary.
>>
>> The rest of a continuation WAL-record can exactly match the block size. In
>> this case, we need to switch targetPagePtr to the next block before
>> calculating the starting point of the next WAL-record.
>> See the patch in attachment for the bug fix.
>
> What's the error you actually saw after reading the record in
> xlogreader.c? If you have past WAL archives, perhaps you are able to
> reproduce the problem with a given WAL segment and pg_waldump?

I saw the message:
pg_waldump: xlogreader.c:264: XLogReadRecord: <text in Russian>
"((RecPtr) % 8192 >= (((uintptr_t) ((sizeof(XLogPageHeaderData))) + ((8) - 1)) & ~((uintptr_t) ((8) - 1))))" <text in Russian>

Yes, I reproduced the error with pg_waldump too. The patch from my
previous message fixes the problem.
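
For readers who want to check an LSN against the failed assertion by hand, here is a minimal sketch of the predicate quoted in the message. The block size (8192) and the alignment (8) are taken from the assertion text itself; the header size of 24 bytes is an assumption about sizeof(XLogPageHeaderData) on a typical 64-bit build, and the function names are hypothetical.

```python
XLOG_BLCKSZ = 8192             # from "% 8192" in the assertion text
MAXIMUM_ALIGNOF = 8            # from the "(8) - 1" terms in the assertion text
SIZEOF_XLOG_PAGE_HEADER = 24   # assumption: sizeof(XLogPageHeaderData), 64-bit build


def maxalign(length: int) -> int:
    """Round length up to the next multiple of MAXIMUM_ALIGNOF."""
    return (length + MAXIMUM_ALIGNOF - 1) & ~(MAXIMUM_ALIGNOF - 1)


def is_valid_record_start(rec_ptr: int) -> bool:
    """Mirror the asserted condition: a record may not start inside a page header."""
    return rec_ptr % XLOG_BLCKSZ >= maxalign(SIZEOF_XLOG_PAGE_HEADER)


if __name__ == "__main__":
    # A pointer that sits exactly on a page boundary fails the check.
    for lsn in (0x11FFFF98, 0x12002000):
        print(hex(lsn), is_valid_record_start(lsn))
```

A start pointer whose offset within its page is 0 falls inside the page header, which is the situation the assertion rejects.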

Some details.
I have the record:
rmgr: Transaction len (rec/tot): 86/ 86, tx: 0, lsn: 0/11FFFF98, prev
0/11FFFDA8, desc: COMMIT_PREPARED

The next record occupies the remainder of segment No. 11 and 8151 bytes
of the first block of segment No. 12, i.e. its total size is 8167 bytes.

The problematic record (obtained with pg_waldump after applying the patch) is:
rmgr: Heap len (rec/tot): 71/ 8167, tx: 1249835485258, lsn: 0/11FFFFF0,
prev 0/11FFFF98, desc: LOCK off 8: xid 1249835485258: flags 0 LOCK_ONLY
EXCL_LOCK KEYS_UPDATED , blkref #0: rel 1663/13121/16474 blk 10880 FPW
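
The sizes above can be checked with a short sketch. It assumes the default 16 MB WAL segment size and a 40-byte long page header on the first page of a segment (neither value is stated in this message), and it only handles the case seen here, where the record starts in the last page of a segment and its remainder fits in one page. The function name is hypothetical.

```python
XLOG_BLCKSZ = 8192
WAL_SEGMENT_SIZE = 16 * 1024 * 1024   # assumption: default segment size
LONG_PAGE_HEADER_SIZE = 40            # assumption: long header on a segment's first page
MAXIMUM_ALIGNOF = 8


def maxalign(length: int) -> int:
    """Round length up to the next multiple of MAXIMUM_ALIGNOF."""
    return (length + MAXIMUM_ALIGNOF - 1) & ~(MAXIMUM_ALIGNOF - 1)


def continuation_layout(record_lsn: int, total_len: int):
    """Split a record that crosses a segment boundary.

    Returns (bytes before the boundary, bytes carried over to the next
    segment, position obtained by adding the page header and the aligned
    carried-over length to the start of the next segment).
    """
    segment_end = (record_lsn // WAL_SEGMENT_SIZE + 1) * WAL_SEGMENT_SIZE
    bytes_before = segment_end - record_lsn
    bytes_after = total_len - bytes_before
    next_start = segment_end + LONG_PAGE_HEADER_SIZE + maxalign(bytes_after)
    return bytes_before, bytes_after, next_start


if __name__ == "__main__":
    before, after, next_start = continuation_layout(0x11FFFFF0, 8167)
    print(before, after, hex(next_start), next_start % XLOG_BLCKSZ)
```

Under these assumptions the header plus the aligned remainder fill the first block of the next segment exactly, so the computed position lands on the following page boundary (offset 0 within its page) instead of after that page's header.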

--
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company
