Re: Behavior for crash recovery when it detects a corrupt WAL record

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Behavior for crash recovery when it detects a corrupt WAL record
Date: 2012-10-09 14:07:53
Message-ID: 50742FB9.7040906@vmware.com
Lists: pgsql-hackers

On 09.10.2012 16:42, Amit Kapila wrote:
> I have observed that currently during recovery, while it applies the WAL
> records, even if it detects by CRC validation that there is a corrupt
> record, it proceeds.
>
> Basically ReadRecord() returns NULL in such cases, which makes the behavior
> the same as if it had reached the end of WAL.
>
> After that the server gets started and the user can perform operations
> normally.

Yeah. We rely on the CRC to detect end of WAL during recovery. If the
system crashes while the WAL is being flushed to disk, it's normal that
there's a corrupt (i.e. partially written) record at the end of the WAL.
This is a common technique used by pretty much every system with a
transaction log / journal.
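
To illustrate the technique, here is a minimal sketch of a journal reader
that treats the first record failing its CRC check as the end of the log.
The record layout (4-byte length, 4-byte CRC-32, payload) and the names
append_record and replay are assumptions made up for this example; this is
not PostgreSQL's WAL format and not the actual ReadRecord() logic.

```python
import struct
import zlib

# Hypothetical record layout: payload length, then CRC-32 of the payload.
HEADER = struct.Struct("<II")


def append_record(log: bytearray, payload: bytes) -> None:
    """Append one record (header + payload) to the in-memory log."""
    log += HEADER.pack(len(payload), zlib.crc32(payload))
    log += payload


def replay(log: bytes) -> list:
    """Return payloads of all valid records, stopping at the first bad one.

    A truncated or CRC-mismatched record is treated as end of log, which is
    how a torn write at the tail is indistinguishable from a clean end.
    """
    records = []
    pos = 0
    while pos + HEADER.size <= len(log):
        length, crc = HEADER.unpack_from(log, pos)
        start = pos + HEADER.size
        payload = bytes(log[start:start + length])
        if len(payload) < length or zlib.crc32(payload) != crc:
            break  # corrupt or partially written record: stop replaying
        records.append(payload)
        pos = start + length
    return records


log = bytearray()
append_record(log, b"insert 1")
append_record(log, b"insert 2")
append_record(log, b"commit")

# Simulate a crash mid-flush: the last record loses its final 3 bytes.
torn = bytes(log[:-3])
```

Replaying the torn log yields only the first two records, and the reader
has no way to tell a torn tail from corruption of a record that was once
fully written, which is the behavior described in the quoted question.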

The other option would be to perform two fsyncs for every commit: one to
flush the WAL to disk, and another to update some global pointer to
point to the end of valid WAL (e.g. in pg_control).
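
A rough sketch of that two-fsync alternative, for comparison. The file
names, the length-prefixed record layout, and the functions commit and
valid_end are hypothetical, chosen only for this example; nothing here
reflects how pg_control is actually written.

```python
import os
import struct
import tempfile


def commit(wal_fd: int, ptr_fd: int, payload: bytes) -> int:
    """Append a record, then durably advance the end-of-valid-log pointer."""
    os.lseek(wal_fd, 0, os.SEEK_END)
    os.write(wal_fd, struct.pack("<I", len(payload)) + payload)
    os.fsync(wal_fd)  # fsync 1: the record itself is on disk
    end = os.lseek(wal_fd, 0, os.SEEK_CUR)
    os.lseek(ptr_fd, 0, os.SEEK_SET)
    os.write(ptr_fd, struct.pack("<Q", end))
    os.fsync(ptr_fd)  # fsync 2: the pointer now covers the new record
    return end


def valid_end(ptr_fd: int) -> int:
    """Read the offset up to which the log is known to be valid."""
    os.lseek(ptr_fd, 0, os.SEEK_SET)
    data = os.read(ptr_fd, 8)
    return struct.unpack("<Q", data)[0] if len(data) == 8 else 0


with tempfile.TemporaryDirectory() as d:
    wal_fd = os.open(os.path.join(d, "wal"), os.O_RDWR | os.O_CREAT)
    ptr_fd = os.open(os.path.join(d, "pointer"), os.O_RDWR | os.O_CREAT)

    end_after_commit = commit(wal_fd, ptr_fd, b"commit 1")

    # Simulate a crash mid-write: bytes reach the log file, but the
    # pointer is never advanced past them.
    os.lseek(wal_fd, 0, os.SEEK_END)
    os.write(wal_fd, b"\x08\x00\x00\x00com")

    recovered_end = valid_end(ptr_fd)
    wal_size = os.fstat(wal_fd).st_size
    os.close(wal_fd)
    os.close(ptr_fd)
```

Recovery would replay only up to recovered_end and ignore the torn tail
without needing a CRC to find the end, at the cost of a second fsync on
every commit.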

- Heikki
