Re: Enabling Checksums

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Ants Aasma <ants(at)cybertec(dot)at>, Greg Smith <greg(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Enabling Checksums
Date: 2013-04-13 14:58:53
Message-ID: 2149.1365865133@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On 2013-04-13 09:14:26 -0400, Bruce Momjian wrote:
>> As I understand it, SIMD is just a CPU-optimized method for producing a
>> CRC checksum. Is that right? Does it produce the same result as a
>> non-CPU-optimized CRC calculation?

> No we are talking about a different algorithm that results in different
> results, thats why its important to choose now since we can't change it
> later without breaking pg_upgrade in further releases.
> http://en.wikipedia.org/wiki/SIMD_%28hash_function%29

[ squint... ] We're talking about a *cryptographic* hash function?
Why in the world was this considered a good idea for page checksums?

In the first place, it's probably not very fast compared to some
alternatives, and in the second place, the criteria by which people
would consider it a good crypto hash function have approximately nothing
to do with what we need for a checksum function. What we want for a
checksum function is high probability of detection of common hardware
failure modes, such as burst errors and all-zeroes. This is
particularly critical when we're going with only a 16-bit checksum ---
the probabilities need to be skewed in the right direction, or it's not
going to be all that terribly useful.

CRCs are known to be good for that sort of thing; it's what they were
designed for. I'd like to see some evidence that any substitute
algorithm has similar properties. Without that, I'm going to vote
against this idea.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2013-04-13 15:10:07 Re: Enabling Checksums
Previous Message Andres Freund 2013-04-13 13:29:30 Re: Enabling Checksums