Re: CRC was: Re: beta testing version

From: ncm(at)zembu(dot)com (Nathan Myers)
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: CRC was: Re: beta testing version
Date: 2000-12-08 00:01:23
Message-ID: 20001207160123.C30335@store.zembu.com
Lists: pgsql-hackers

On Thu, Dec 07, 2000 at 04:35:00PM -0500, Tom Lane wrote:
> Remember that we are already sitting atop hardware that's really
> pretty reliable, despite the carping that's been going on in this
> thread. All that we have to do is detect the infrequent case where a
> block of data didn't get written due to system failure. It's wildly
> pessimistic to think that we might get called on to do so as much as
> once a day (if you are trying to run a reliable database, and are
> suffering power failures once a day, and haven't bought a UPS, you're
> a lost cause). A 32-bit CRC will fail to detect such an error with a
> probability of about 1 in 2^32. So, a 32-bit CRC will have an MTBF of
> 2^32 days, or 11 million years, on the wildly pessimistic side ---
> real installations probably 100 times better. That's plenty for me,
> and improving the odds to 2^64 or 2^128 is not worth any slowdown
> IMHO.

1. Computing a CRC-64 takes only about twice as long as a CRC-32, for
2^32 times the confidence. That's pretty cheap confidence.

2. I disagree with the way the above statistics were computed. That
eleven-million-year figure gets whittled down pretty quickly when you
factor in all the sources of corruption, even without crashes.
(Power failures are only one of many sources of corruption.) Those
sources grow with the size and activity of the database, and databases
are getting very large and busy indeed.

3. Many users clearly hope to be able to pull the plug on their hardware
and get back up confidently. While we can't promise they won't have
to go to their backups, we should at least be equipped to promise,
with confidence, that they will know whether they need to.

4. For a way to mark the "current final" log entry, you want a lot more
confidence, because you read a lot more of them, and reading beyond
the end may cause you to corrupt a currently-valid database, which
seems a lot worse than just using a corrupted database.
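To put rough numbers on points 1, 2 and 4, here is a minimal sketch of the arithmetic. It assumes each corruption event (or each check of a bad record) is an independent trial that an n-bit CRC misses with probability 2^-n; the function name and the event rates are illustrative, not measurements.

```python
# Assumption: each corruption event is an independent trial that an
# n-bit check fails to flag with probability 2**-bits.
# The event rates below are illustrative only.

DAYS_PER_YEAR = 365.25


def years_to_first_miss(bits: int, events_per_day: float) -> float:
    """Expected years until an n-bit CRC first misses a corruption."""
    # Mean number of trials before the first miss (geometric, p = 2**-bits).
    expected_events = 2 ** bits
    return expected_events / events_per_day / DAYS_PER_YEAR


if __name__ == "__main__":
    for events_per_day in (1, 100, 10_000, 1_000_000):
        y32 = years_to_first_miss(32, events_per_day)
        y64 = years_to_first_miss(64, events_per_day)
        print(f"{events_per_day:>9} events/day: "
              f"CRC-32 {y32:.3g} years, CRC-64 {y64:.3g} years")
```

At one event per day the CRC-32 figure comes out near 11.8 million years, matching the quoted estimate. At a million checks per day it drops to roughly twelve years, while the CRC-64 figure stays 2^32 times larger at every rate.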

Still, I agree that a 32-bit CRC is better than none at all.
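For point 4, here is a sketch of what using a CRC to find the "current final" log entry can look like. The record layout (length, CRC-32, payload) and both function names are hypothetical and chosen for illustration; this is not the PostgreSQL WAL format. It uses zlib.crc32 from the Python standard library.

```python
import struct
import zlib

# Hypothetical record header: payload length, then CRC-32 of the payload.
HEADER = struct.Struct("<II")


def append_record(log: bytearray, payload: bytes) -> None:
    """Append one length-prefixed, checksummed record to the log."""
    log.extend(HEADER.pack(len(payload), zlib.crc32(payload)))
    log.extend(payload)


def scan_valid_records(log: bytes) -> list[bytes]:
    """Return payloads up to the first record that fails its check."""
    records = []
    offset = 0
    while offset + HEADER.size <= len(log):
        length, stored_crc = HEADER.unpack_from(log, offset)
        if length == 0:
            # A zeroed header would otherwise pass (CRC-32 of b"" is 0),
            # so treat an empty record as the end of the log.
            break
        start = offset + HEADER.size
        payload = log[start:start + length]
        if len(payload) < length or zlib.crc32(payload) != stored_crc:
            # Torn or garbage record: everything from here on is untrusted.
            break
        records.append(payload)
        offset = start + length
    return records
```

Every scan past the true end of the log is one more chance for stale bytes to match their checksum by accident, which is why the number of reads matters as much as the number of crashes.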

Nathan Myers
ncm(at)zembu(dot)com
