Re: Fwd: index corruption in PG 8.3.13

From:	Nikhil Sontakke <nikhil(dot)sontakke(at)enterprisedb(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Greg Stark <gsstark(at)mit(dot)edu>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Fwd: index corruption in PG 8.3.13
Date:	2011-03-11 14:28:05
Message-ID:	AANLkTin8jifMsE-jE8P9x-+BbpVsQD4uoxWkBbO3xQ3h@mail.gmail.com
Lists:	pgsql-hackers

>>> VACUUM FULL - immediate shutdown - problem with recovery?
>
> An immediate shutdown == an intentional crash. OK, so you have the
> VACUUM FULL and the immediate shutdown just afterward. So we just
> need to figure out what happened during recovery.
>

Right.

>> But WAL replay should still have handled this. I would presume even an
>> immediate shutdown ensures that WAL is flushed to disk properly?
>
> I'm not sure, but I doubt it. If the VACUUM FULL committed, then the
> WAL records should be on disk, but if the immediate shutdown happened
> while it was still running, then the WAL records might still be in
> wal_buffers, in which case I don't think they'll get written out and
> thus zero pages in the index are to be expected. Now that doesn't
> explain any other corruption in the file, but I believe all-zeroes
> pages in a relation are an expected consequence of an unclean
> shutdown. But assuming the VF actually committed before the immediate
> shutdown, there must be something else going on, since by that point
> XLOG should have been flushed.
>

Oh yeah, so if the VF committed, the xlog should have been OK too, but
we can't say the same about the shared buffers.
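
To make the two cases concrete, here is a minimal toy model in Python. It is
not PostgreSQL code: the class, the function names and the record names are
all hypothetical, and it assumes WAL only reaches disk at commit. The real
server can also write WAL out earlier, which this sketch ignores. It only
illustrates that records still sitting in the in-memory buffer at the moment
of an immediate shutdown are not available for replay, while records flushed
by a commit are.

```python
class ToyWal:
    """Toy stand-in for the WAL: an in-memory buffer plus a durable log."""

    def __init__(self):
        self.buffer = []  # stands in for wal_buffers (lost on crash)
        self.disk = []    # stands in for WAL already written to disk

    def insert(self, record):
        # New records land in memory first.
        self.buffer.append(record)

    def flush(self):
        # Move everything buffered so far to durable storage.
        self.disk.extend(self.buffer)
        self.buffer.clear()

    def commit(self, xid):
        # Assumption of this sketch: commit appends a record and flushes.
        self.insert(("COMMIT", xid))
        self.flush()

    def crash(self):
        # Immediate shutdown: whatever was only in memory is gone.
        self.buffer.clear()


def simulate(commit_before_crash):
    """Return the records that recovery could replay after the crash."""
    wal = ToyWal()
    wal.insert(("INDEX_PAGE_CHANGE", 1))  # hypothetical record names
    wal.insert(("INDEX_PAGE_CHANGE", 2))
    if commit_before_crash:
        wal.commit(xid=100)
    wal.crash()
    return list(wal.disk)
```

With this model, simulate(True) leaves both page changes and the commit
record on disk for replay, while simulate(False) leaves nothing, which is
the situation where missing index changes would not be surprising.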

>> So that means that either there is a corner case bug in VF which adds
>> incorrect WAL logging in some specific btree layout scenarios or there
>> was indeed some bit flipping in the WAL, which caused the recovery to
>> prematurely end during WAL replay. What are the scenarios that you
>> would think can cause WAL bit flipping?
>
> Some kind of fluke hard drive malfunction, maybe? I know that the
> incidence of a hard drive being told to write A and actually writing B
> is very low, but it's probably not exactly zero. Do you have the logs
> from the recovery following the immediate shutdown? Anything
> interesting there?
>

Unfortunately, we do not have the recovery logs. I would have loved to
see some sign of trouble during WAL replay to confirm the theory
about bit flipping.

> Or, as you say, there could be a corner-case VF bug.
>

Yeah, that is much harder to find by just eyeballing the code, I guess :)

Regards,
Nikhils

In response to

Re: Fwd: index corruption in PG 8.3.13 at 2011-03-11 13:14:22 from Robert Haas

Responses

Re: Fwd: index corruption in PG 8.3.13 at 2011-03-11 18:21:20 from Greg Stark
