Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Mon, Dec 19, 2011 at 2:16 PM, Kevin Grittner
> <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
>> It seems to me that on a typical production system you would
>> probably have zero or one such page per OS crash, with zero being
>> far more likely than one. If we can get that one fixed (if it
>> exists) before enough time has elapsed for everyone to forget the
>> OS crash, the idea that we would be scaring the users and
>> negatively affecting the perception of reliability seems
>> far-fetched.
>
> The problem is that you can't "fix" them. If you come to a page
> with a bad CRC, you only have two choices: take it seriously, or
> don't. If you take it seriously, then you're complaining about
> something that may be completely benign. If you don't take it
> seriously, then you're ignoring something that may be a sign of
> data corruption.

I was thinking that we would warn when such was found, set hint bits
as needed, and rewrite with the new CRC. In the unlikely event that
it was a torn hint-bit-only page update, it would be a warning about
something which is a benign side-effect of the OS or hardware crash.
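
To make the warn-and-rewrite idea concrete, here is a rough Python
sketch. It is not PostgreSQL code: the page layout (a 4-byte CRC at
the start of the page), the use of zlib.crc32, and every name in it
are assumptions made up for illustration only.

```python
import struct
import zlib

PAGE_SIZE = 8192   # assumed page size for this sketch
CRC_SIZE = 4       # hypothetical layout: CRC stored in the first 4 bytes


def page_crc(page: bytes) -> int:
    """CRC-32 over everything after the stored CRC field."""
    return zlib.crc32(page[CRC_SIZE:]) & 0xFFFFFFFF


def stamp(page: bytes) -> bytes:
    """Return a copy of the page with a freshly computed CRC stored."""
    return struct.pack("<I", page_crc(page)) + page[CRC_SIZE:]


def read_page(page: bytes, warnings: list) -> bytes:
    """Verify the stored CRC; on mismatch, warn and rewrite with a new CRC."""
    (stored,) = struct.unpack("<I", page[:CRC_SIZE])
    if stored != page_crc(page):
        warnings.append(
            "CRC mismatch: may be benign after a crash, or may be corruption"
        )
        # Hint bits would be set here (not modeled in this sketch),
        # then the page is re-stamped so the warning is not repeated.
        return stamp(page)
    return page
```

The point of re-stamping is that the warning is raised once, close to
the crash, rather than every time the page is read afterwards.
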
The argument was that it could happen months later, and people
might not remember the crash. My response to that is: don't let it
wait that long. By forcing a vacuum of all possibly-affected tables
(or all tables if there's no way to rule any of them out), you
keep it within recent memory.

Of course, it would also make sense to document that such an error
after an OS or hardware crash might be benign or may indicate data
corruption or data loss, and give advice on what to do. There is
obviously no way for PostgreSQL to automagically "fix" real
corruption flagged by a CRC failure, under any circumstances.

There's also *always* a possibility that a CRC error is a false
positive -- if only the bytes in the CRC were damaged. We're
talking quantitative changes here, not qualitative.
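
As a toy illustration of that false-positive case, the following
Python sketch flips a single bit in the stored CRC and leaves the data
alone. The layout (4-byte CRC followed by the payload) and the helper
names are hypothetical, chosen only for this example.

```python
import struct
import zlib


def make_page(payload: bytes) -> bytes:
    """Hypothetical page: 4-byte little-endian CRC-32, then the payload."""
    return struct.pack("<I", zlib.crc32(payload) & 0xFFFFFFFF) + payload


def crc_ok(page: bytes) -> bool:
    """True if the stored CRC matches the CRC of the payload bytes."""
    (stored,) = struct.unpack("<I", page[:4])
    return stored == (zlib.crc32(page[4:]) & 0xFFFFFFFF)


payload = b"tuple data" * 10
page = make_page(payload)

# Damage only the CRC field: flip one bit in its first byte.
damaged = bytes([page[0] ^ 0x01]) + page[1:]

data_intact = damaged[4:] == payload   # the payload is untouched
check_fails = not crc_ok(damaged)      # yet the check reports a failure
```

Here the check fails although every data byte is intact, so no
checksum scheme can drive the false-positive rate all the way to zero.
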
I'm arguing that the extreme measures suggested to achieve the
slight quantitative improvements are likely to cause more problems
than they solve. A better use of resources to improve the false
positive numbers would be to be more aggressive about setting hint
bits -- perhaps when a page is written with any tuples with
transaction IDs before the global xmin, the hint bits should be set
and the CRC calculated before write, for example. (But that would
be a different patch.)

-Kevin