Re: corrupt pages detected by enabling checksums

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jim Nasby <jim(at)nasby(dot)net>, Jeff Davis <pgsql(at)j-davis(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: corrupt pages detected by enabling checksums
Date: 2013-05-11 03:33:15
Message-ID: 6C0B27F7206C9E4CA54AE035729E9C38421A8644@szxeml558-mbs.china.huawei.com
Lists: pgsql-hackers

On Friday, May 10, 2013 10:24 PM Greg Stark wrote:
> On Fri, May 10, 2013 at 5:31 PM, Amit Kapila <amit(dot)kapila(at)huawei(dot)com> wrote:
>> In the case where one block is missing, how can it even reach the next
>> record to check the "prev" pointer?
>> I think it is possible when one of the records is corrupt and the following
>> ones are okay, which I think is the case in which it can proceed with a
>> warning, as suggested by Simon.

>A single WAL record can be over 24kB. The checksum covers the entire
>WAL record and if it reports corruption it can be because a chunk in
>the middle wasn't flushed to disk before the system crashed. The
>beginning of the WAL record with the length and checksum and the
>entire following record with its prev pointer might have been flushed
>but the missing block in the middle of this record means it can't be
>replayed. This would be a normal situation in case of a system crash.
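
[Illustration of the point quoted above] A minimal sketch of how a record-wide checksum can fail because of one unflushed block in the middle, while the record's own header and the following record are intact. The record layout (4-byte length, 4-byte CRC, payload), the 8 kB block size, and the use of zlib.crc32 are assumptions made for this sketch; they are not PostgreSQL's actual WAL format or CRC implementation.

```python
import struct
import zlib

BLOCK_SIZE = 8192  # assumed block size for this sketch


def make_record(payload: bytes) -> bytes:
    """Hypothetical layout: 4-byte length, 4-byte CRC of the payload, payload."""
    return struct.pack("<II", len(payload), zlib.crc32(payload)) + payload


def record_is_valid(buf: bytes, offset: int = 0) -> bool:
    """Re-compute the CRC over the whole payload and compare with the header."""
    length, crc = struct.unpack_from("<II", buf, offset)
    payload = buf[offset + 8 : offset + 8 + length]
    return len(payload) == length and zlib.crc32(payload) == crc


# One record with a 24 kB payload, followed by a small record.
big = make_record(bytes(range(256)) * 96)
nxt = make_record(b"following record")
wal = bytearray(big + nxt)

# Simulate a crash in which the second 8 kB block never reached disk.
wal[BLOCK_SIZE : 2 * BLOCK_SIZE] = bytes(BLOCK_SIZE)

big_ok = record_is_valid(bytes(wal), 0)          # header intact, middle lost
next_ok = record_is_valid(bytes(wal), len(big))  # following record untouched
```

In this sketch the large record fails validation even though its length and checksum fields were written, and the record after it still validates on its own.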

My only point was that there can be only one such record, whether its
length is large or small.

>If you replayed the following record but not this record you would
>have an inconsistent database. The following record could be an insert
>for a child table with a foreign key reference to a tuple that was
>inserted by the skipped record for example. Resulting in a database
>that is logically inconsistent.
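
[Illustration of the point quoted above] A toy sketch of why skipping a corrupt record and replaying the ones after it can leave the data logically inconsistent. The record dictionaries, the operation names, and the replay() function are hypothetical and exist only for this example; real recovery works on WAL records, not on structures like these.

```python
def replay(records, skip_corrupt=False):
    """Apply records in order; stop at the first corrupt one unless told to skip."""
    parents, children = set(), []
    for rec in records:
        if not rec["valid"]:
            if skip_corrupt:
                continue  # ignore the corrupt record and keep replaying
            break         # stop replay at the first corrupt record
        if rec["op"] == "insert_parent":
            parents.add(rec["id"])
        elif rec["op"] == "insert_child":
            children.append((rec["id"], rec["parent_id"]))
    # A child whose parent was never inserted violates the foreign key.
    orphans = [c for c in children if c[1] not in parents]
    return parents, children, orphans


wal = [
    {"valid": True, "op": "insert_parent", "id": 1},
    {"valid": False, "op": "insert_parent", "id": 2},  # the corrupt record
    {"valid": True, "op": "insert_child", "id": 10, "parent_id": 2},
]

stopped = replay(wal)                     # stops before the corrupt record
skipped = replay(wal, skip_corrupt=True)  # replays the child without its parent
```

Stopping at the corrupt record loses the later insert but leaves no orphan; skipping it keeps the child row while its parent is missing.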

The corrupt record may be one whose loss leads to inconsistency in the database,
or it could be the commit record of a transaction that performed only select operations.
It will be difficult, or perhaps impossible, to determine which during recovery,
but informing the DBA/user at that point can be useful, along with an optional way
to continue (although I am not sure how the DBA would decide in such a case; some
other information may be needed as well).
The reason it can be useful to allow DBA/user intervention in such cases is that,
if the data after the corrupt record is ignored, it can still take time for the
DBA/user to find out which operations need to be performed again.

With Regards,
Amit Kapila.
