From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | "R, Siva" <sivasubr(at)amazon(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Bug in ginRedoRecompress that causes opaque data on page to be overrun |
Date: | 2018-09-05 17:23:06 |
Message-ID: | CAD21AoAXiVCV1McUAoakk+ruaJ-uqd377hsVp4UD4rka-TRnqQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, Sep 5, 2018 at 4:59 AM, R, Siva <sivasubr(at)amazon(dot)com> wrote:
> Hi,
>
> We recently encountered an issue where the opaque data flags on a gin data
> leaf page were corrupted while replaying a gin insert WAL record. Upon
> further examination of the redo code, we found a bug in ginRedoRecompress,
> which extracts the WAL information and updates the page.
>
> Specifically, when a new segment is inserted in the middle of a page, a
> memmove operation is performed [1] at the current point in the page to make
> room for the new segment. If this segment insertion is followed by delete
> segment actions that are yet to be processed, and the total data size is very
> close to GinDataPageMaxDataSize, then the memmove may shift the data portion
> past the end of the data area, corrupting the opaque data.
>
> One way of solving this problem is to perform the replay work in a scratch
> space and run a sanity check on the total size of the data portion before
> copying it back to the actual page. While this involves additional memory
> allocation and memcpy operations, it is safer and similar to the 'do' code
> path, where we make sure to copy all segments past the first modified
> segment before placing them back on the page [2].
>
Hmm, could you share the sequence of WAL records that were applied to
the broken page? I suspect the segment list contains
GIN_SEGMENT_REPLACE before GIN_SEGMENT_INSERT.
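
For reference, the scratch-space idea quoted above can be sketched roughly
as follows. This is only a toy model, not PostgreSQL code: ToyPage,
ToyAction, toy_replay and the TOY_* sizes are hypothetical names made up for
illustration, and the action format is much simpler than the real
ginRedoRecompress WAL format. It only shows the general shape: apply all
insert/delete actions in a larger scratch buffer, check the final size, and
only then copy back to the page, so a temporary overshoot never reaches the
opaque area.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy sizes; a real page uses BLCKSZ and GinDataPageMaxDataSize. */
#define TOY_DATA_MAX   16	/* bytes available for segment data */
#define TOY_OPAQUE_LEN 4	/* trailing bytes standing in for the opaque area */

typedef struct ToyPage
{
	size_t		used;		/* bytes of segment data currently in use */
	unsigned char data[TOY_DATA_MAX];
	unsigned char opaque[TOY_OPAQUE_LEN];
} ToyPage;

typedef enum { TOY_INSERT, TOY_DELETE } ToyActionKind;

typedef struct ToyAction
{
	ToyActionKind kind;
	size_t		offset;		/* position in the data the action applies to */
	size_t		len;		/* number of bytes inserted or deleted */
	unsigned char fill;		/* byte value written for inserted data */
} ToyAction;

/*
 * Apply the actions in order to a scratch copy of the data area.  The page
 * is modified only if every action is valid and the final size fits.
 */
bool
toy_replay(ToyPage *page, const ToyAction *actions, size_t nactions)
{
	unsigned char scratch[2 * TOY_DATA_MAX];
	size_t		used = page->used;

	assert(used <= TOY_DATA_MAX);
	memcpy(scratch, page->data, used);

	for (size_t i = 0; i < nactions; i++)
	{
		const ToyAction *a = &actions[i];

		if (a->offset > used)
			return false;

		if (a->kind == TOY_INSERT)
		{
			/* The scratch area may grow past TOY_DATA_MAX temporarily. */
			if (a->len > sizeof(scratch) - used)
				return false;
			memmove(scratch + a->offset + a->len, scratch + a->offset,
					used - a->offset);
			memset(scratch + a->offset, a->fill, a->len);
			used += a->len;
		}
		else
		{
			if (a->len > used - a->offset)
				return false;
			memmove(scratch + a->offset, scratch + a->offset + a->len,
					used - a->offset - a->len);
			used -= a->len;
		}
	}

	/* Sanity check on the final size before touching the real page. */
	if (used > TOY_DATA_MAX)
		return false;

	memcpy(page->data, scratch, used);
	page->used = used;
	return true;
}
```

With a page holding 14 of 16 data bytes, an insert of 4 bytes followed by a
delete of 6 bytes passes through 18 bytes in the scratch buffer and ends at
12, so toy_replay() succeeds without ever writing past data[]. The same
insert on its own ends at 18 bytes and is rejected, leaving the page as it
was.
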
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center