Re: Re: TODO list

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Philip Warner <pjw(at)rhyme(dot)com(dot)au>
Cc: "Mikheev, Vadim" <vmikheev(at)SECTORBASE(dot)COM>, "'pgsql-hackers(at)postgresql(dot)org'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: TODO list
Date: 2001-04-06 02:52:08
Message-ID: 22440.986525528@sss.pgh.pa.us
Lists: pgsql-hackers

Philip Warner <pjw(at)rhyme(dot)com(dot)au> writes:
>> So the only real benefit of a block-level CRC would be to guard against
>> bits dropped in transit from the disk surface to someplace else

> What about guarding against file system problems, like blocks of one
> (non-PG) file erroneously writing to blocks of another (PG table) file?

Well, what about it? Can you offer numbers demonstrating that this risk
is probable enough to justify the effort and runtime cost of a block
CRC?

If we're in the business of expending cycles to guard against
nil-probability risks, let's checksum our executables every time we
start up, to make sure they're not overwritten. Actually, we'd better
re-checksum program text memory every few seconds, in case RAM dropped
a bit since we looked last. And let's follow every memcpy by a memcmp
to make sure that didn't drop a bit. Heck, let's keep a CRC on every
palloc'd memory block. And so on and so forth. Sooner or later you've
got to draw the line at diminishing returns, both for runtime costs
and for the programming effort you spend on this stuff (instead of on
finding/fixing bugs that might bite you with far greater frequency than
anything a CRC might catch for you).

To be perfectly clear: I have actually seen bug reports trace to
problems that I think a block-level CRC might have detected (not
corrected, of course, but at least the user might have realized he had
flaky hardware a little sooner). So I do not say that the upside to
a block CRC is nil. But I am unconvinced that it exceeds the downside,
in development effort, runtime, false failure reports (is that CRC error
really due to hardware trouble, or a software bug that failed to update
the CRC? and how do you get around the CRC error to get at your data??)
etc etc.

regards, tom lane
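
For illustration only, the trade-off described in the message can be sketched in a few lines of Python. This is not PostgreSQL code: `BLOCK_SIZE`, `write_block`, and `verify_block` are hypothetical names, the 8192-byte block size is an assumption, and CRC-32 from the standard `zlib` module stands in for whatever checksum a real implementation might choose. The sketch shows that a block CRC detects a flipped bit but cannot repair it, and that a software path that changes a block without refreshing the CRC produces exactly the same failure signal.

```python
import zlib

BLOCK_SIZE = 8192  # assumed block size, for illustration only


def write_block(data: bytes):
    """Return the block together with a CRC-32 computed over its contents."""
    return data, zlib.crc32(data)


def verify_block(data: bytes, stored_crc: int) -> bool:
    """Recompute the CRC on read and compare it with the stored value."""
    return zlib.crc32(data) == stored_crc


# A freshly written block verifies cleanly.
block, crc = write_block(bytes(BLOCK_SIZE))
clean_ok = verify_block(block, crc)

# Case 1: hardware trouble. One bit changes somewhere between disk and memory.
damaged = bytearray(block)
damaged[100] ^= 0x01
hardware_fault_detected = not verify_block(bytes(damaged), crc)

# Case 2: software bug. A legitimate 4-byte update is applied to the block,
# but the code path forgets to store a new CRC alongside it.
updated = bytearray(block)
updated[0:4] = b"data"
software_bug_detected = not verify_block(bytes(updated), crc)

# Both cases fail verification, and the check alone cannot say which one
# occurred or recover the original contents.
print(clean_ok, hardware_fault_detected, software_bug_detected)
```

The per-block cost is one extra pass over the data on every write and every read, which is the runtime overhead weighed in the message against the chance of catching flaky hardware sooner.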
