Re: CRC was: Re: beta testing version

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Guenter <bruceg(at)em(dot)ca>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: CRC was: Re: beta testing version
Date: 2000-12-08 18:58:12
Message-ID: 1394.976301892@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Bruce Guenter <bruceg(at)em(dot)ca> writes:
> ... Taking an
> arbitrary 32 bits of a MD5 would likely be less collision prone than
> using a 32-bit CRC, and it appears faster as well.

... but that would be an algorithm that you know NOTHING about the
properties of. What is your basis for asserting it's better than CRC?

CRC is pretty well studied and its error-detection behavior is known
(and good). MD5 has been studied less thoroughly AFAIK, and in any
case what's known about its behavior is that the *entire* MD5 output
provides a good signature for a datastream. If you pick some ad-hoc
method like taking a randomly chosen subset of MD5's output bits,
you really don't know anything at all about what the error-detection
properties of the method are.

I am reminded of Knuth's famous advice about random number generators:
"Random numbers should not be generated with a method chosen at random.
Some theory should be used." Error-detection codes, like random-number
generators, have decades of theory behind them. Seat-of-the-pants
tinkering, even if it starts with a known-good method, is not likely to
produce an improvement.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Guenter 2000-12-08 19:01:27 Re: CRC was: Re: beta testing version
Previous Message Ian Lance Taylor 2000-12-08 18:36:39 Re: CRC was: Re: beta testing version