Re: Tracking down log segment corruption

From: Gordon Shannon <gordo169(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Tracking down log segment corruption
Date: 2010-05-02 21:43:39
Message-ID: x2ub2dd93301005021443t3c13fb57m7fc5b53e7f0f466b@mail.gmail.com
Lists: pgsql-general

Sounds like you're on it. Just wanted to share one additional piece, in
case it helps.

Just before the ALTER INDEX SET TABLESPACE was issued, there were some
writes to the table in question inside a serializable transaction. The
transaction committed at 11:11:58 EDT, and consisted of, among a couple
thousand writes to sibling tables, 4 writes (unknown combination of inserts
and updates) to cts_20100501, which definitely affected the index in
question.

In any case, I will cease and desist from ALTER SET TABLESPACE for a while!

Thanks!
Gordon

Between 11:11:56 and 11:11:58 EDT (11 sec before the crash), there were

On Sun, May 2, 2010 at 3:16 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Found it, I think. ATExecSetTableSpace transfers the copied data to the
> slave by means of XLOG_HEAP_NEWPAGE WAL records. The replay function
> for this (heap_xlog_newpage) is failing to pay any attention to the
> forkNum field of the WAL record. This means it will happily write FSM
> and visibility-map pages into the main fork of the relation. So if the
> index had any such pages on the master, it would immediately become
> corrupted on the slave. Now indexes don't have a visibility-map fork,
> but they could have FSM pages. And an FSM page would have the right
> header information to look like an empty index page. So dropping an
> index FSM page into the main fork of the index would produce the
> observed symptom.
>
> I'm not 100% sure that this is what bit you, but it's clearly a bug and
> AFAICS it could produce the observed symptoms.
>
> This is a seriously, seriously nasty data corruption bug. The only bit
> of good news is that ALTER SET TABLESPACE seems to be the only operation
> that can emit XLOG_HEAP_NEWPAGE records with forkNum different from
> MAIN_FORKNUM, so that's the only operation that's at risk. But if you
> do do that, not only are standby slaves going to get clobbered, but the
> master could get corrupted too if you were unlucky enough to have a
> crash and replay from WAL shortly after completing the ALTER. And it's
> not only indexes that are at risk --- tables could get clobbered the
> same way.
>
> My crystal ball says there will be update releases in the very near
> future.
>
> regards, tom lane
>
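
For readers following the thread, the failure mode described in the quoted
message can be illustrated with a toy model. This is only a sketch, not
PostgreSQL source: the record type, the dict-based "storage", and both replay
functions are hypothetical stand-ins for the XLOG_HEAP_NEWPAGE record and
heap_xlog_newpage. It shows how a replay routine that ignores the fork number
drops an FSM page on top of a main-fork page.

```python
# Toy model only: NOT PostgreSQL code. All names here are hypothetical
# stand-ins used to illustrate replay that ignores a record's fork number.
from dataclasses import dataclass

# Fork identifiers, named after the forks mentioned in the thread
# (numeric values are an assumption of this sketch).
MAIN_FORKNUM = 0
FSM_FORKNUM = 1


@dataclass(frozen=True)
class NewPageRecord:
    """Simplified stand-in for a 'new page' WAL record."""
    relation: str
    fork_num: int
    block_no: int
    page: bytes


def replay_ignoring_fork(storage, rec):
    # Buggy behaviour: the fork number in the record is never consulted,
    # so every page is written into the main fork.
    storage[(rec.relation, MAIN_FORKNUM, rec.block_no)] = rec.page


def replay_honoring_fork(storage, rec):
    # Intended behaviour: the page goes to the fork named in the record.
    storage[(rec.relation, rec.fork_num, rec.block_no)] = rec.page


def replay(records, apply):
    """Apply every record in order to an empty standby 'storage'."""
    storage = {}
    for rec in records:
        apply(storage, rec)
    return storage


# Copying an index emits one record per page of each fork it has.
records = [
    NewPageRecord("idx", MAIN_FORKNUM, 0, b"index-page-0"),
    NewPageRecord("idx", FSM_FORKNUM, 0, b"fsm-page-0"),
]

buggy = replay(records, replay_ignoring_fork)
fixed = replay(records, replay_honoring_fork)

# With the buggy replay, block 0 of the main fork now holds the FSM page.
print(buggy[("idx", MAIN_FORKNUM, 0)])
print(fixed[("idx", MAIN_FORKNUM, 0)])
```

In this model the buggy replay leaves the standby with no FSM fork at all and
with main-fork block 0 overwritten, which matches the "FSM page in the main
fork of the index" symptom described above.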
