Re: corrupt pages detected by enabling checksums

From:	Jeff Davis <pgsql(at)j-davis(dot)com>
To:	Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: corrupt pages detected by enabling checksums
Date:	2013-04-05 00:31:16
Message-ID:	1365121876.14231.69.camel@jdavis
Lists:	pgsql-hackers

On Thu, 2013-04-04 at 14:21 -0700, Jeff Janes wrote:

> This brings up a pretty frightening possibility to me, unrelated to
> data checksums. If a bit gets twiddled in the WAL file due to a
> hardware issue or a "cosmic ray", and then a crash happens, automatic
> recovery will stop early with the failed WAL checksum with
> an innocuous looking message. The system will start up but will be
> invisibly inconsistent, and will proceed to overwrite that portion of
> the WAL file which contains the old data (real data that would have
> been necessary to reconstruct, once the corruption is finally realized)
> with an end-of-recovery checkpoint record and continue to chew up
> real data from there.

I've been worried about that for a while, and I may have even seen
something like this happen before. We could perhaps do some checks, but
in general it seems hard to solve without flushing some data to
two different places. For example, you could flush WAL, and then update
an LSN stored somewhere else indicating how far the WAL has been
written. Recovery could complain if it gets an error in the WAL before
that point.
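
To make the idea concrete, here is a rough sketch of the check. It is only an illustration under stated assumptions, not PostgreSQL's WAL format or recovery code: the record framing (length, CRC32, payload) and the names make_record, recover and durable_lsn are all made up for the example. Recovery replays until the first invalid record, as it does today, but treats stopping short of the separately stored LSN as an error rather than as the normal end of WAL.

```python
import struct
import zlib


def make_record(payload: bytes) -> bytes:
    """Frame a record as: length (4 bytes) | CRC32 of payload (4 bytes) | payload."""
    return struct.pack("<II", len(payload), zlib.crc32(payload)) + payload


def recover(wal: bytes, durable_lsn: int):
    """Replay records until the first invalid one.

    durable_lsn is the byte offset stored in the second location after the
    last successful WAL flush (hypothetical). Returns (payloads, end_offset).
    Raises RuntimeError if replay stops before durable_lsn, because every
    byte before that offset was known to be on disk and valid.
    """
    pos = 0
    payloads = []
    while pos + 8 <= len(wal):
        length, crc = struct.unpack_from("<II", wal, pos)
        payload = wal[pos + 8 : pos + 8 + length]
        # A zero length, short read, or CRC mismatch all look like end of WAL.
        if length == 0 or len(payload) != length or zlib.crc32(payload) != crc:
            break
        payloads.append(payload)
        pos += 8 + length
    if pos < durable_lsn:
        raise RuntimeError(
            f"invalid WAL record at offset {pos}, before durable LSN {durable_lsn}"
        )
    return payloads, pos
```

Note that a bad record past the stored LSN is still indistinguishable from an ordinary torn tail, so the check only covers WAL that was flushed before the last marker update.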

But obviously, that makes WAL flushes expensive (in many cases, about
twice as expensive).

Maybe it's not out of the question to offer that as an option if nobody
has a better idea. Performance-conscious users could place the extra LSN
on an SSD or NVRAM or something; or maybe use commit_delay or async
commits. It would only need to store a few bytes.
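
As a rough illustration of how small that extra write could be, the sketch below packs a 64-bit LSN and a CRC32 of it into 12 bytes and fsyncs them to a separate file. The layout, the file path and the function names (encode_marker, decode_marker, write_marker) are assumptions made for the example, not anything that exists in PostgreSQL.

```python
import os
import struct
import zlib

# Hypothetical marker layout: 8-byte LSN + 4-byte CRC32 of the LSN = 12 bytes.
MARKER = struct.Struct("<QI")


def encode_marker(lsn: int) -> bytes:
    """Pack the LSN together with a CRC so a damaged marker can be detected."""
    return MARKER.pack(lsn, zlib.crc32(struct.pack("<Q", lsn)))


def decode_marker(raw: bytes):
    """Return the stored LSN, or None if the marker is short or fails its CRC."""
    if len(raw) != MARKER.size:
        return None
    lsn, crc = MARKER.unpack(raw)
    if zlib.crc32(struct.pack("<Q", lsn)) != crc:
        return None
    return lsn


def write_marker(path: str, lsn: int) -> None:
    """Overwrite the marker file in place and force it to stable storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        os.write(fd, encode_marker(lsn))  # file offset is 0 right after open
        os.fsync(fd)                      # this is the second flush per WAL flush
    finally:
        os.close(fd)
```

The os.fsync call here is the extra cost discussed above: it would have to complete after each WAL flush before the marker can be trusted.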

Streaming replication mitigates the problem somewhat, by being a second
place to write WAL.

Regards,
Jeff Davis

In response to

Re: corrupt pages detected by enabling checksums at 2013-04-04 21:21:19 from Jeff Janes

Responses

Re: corrupt pages detected by enabling checksums at 2013-04-05 01:06:15 from Tom Lane
