Re: Substituting Checksum Algorithm (was: Enabling Checksums)

From: Florian Pflug <fgp(at)phlo(dot)org>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Ants Aasma <ants(at)cybertec(dot)at>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Subject: Re: Substituting Checksum Algorithm (was: Enabling Checksums)
Date: 2013-04-23 08:24:40
Message-ID: 85364E17-11DD-4ED8-8D9F-47D5CA9CA2EE@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Apr23, 2013, at 09:17 , Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> I'd lean toward simplicity and closer adherence to the published version
> of the algorithm rather than detecting a few more obscure error
> patterns. It looks like the modification slows down the algorithm, too.

The pattern that plain FNV1 misses are not that obscure, unfortunately.
With plain FNV1A, the n-th bit of an input word (i.e. 32-bit block) only
affects bits n through 31 of the checksum. In particular, the most
significant bit of every 32-bit block only affects the MSB of the checksum,
making the algorithm miss any even number of flipped MSBs. More generally,
any form of data corruption that affects only the top N bits are missed
at least once out of 2^N times, since changing only those bits cannot
yield more than 2^N different checksum values.

Such corruption pattern may not be especially likely, given that we're
mainly protecting against disk corruption, not memory corruption. But
quantifying how unlikely exactly seems hard, thus providing at least some
form of protection against such errors seems prudent.

In addition, even with the shift-induced slowdown, FNV1+SHIFT still
performs similarly to hardware-accelerated CRC, reaching about 6bytes/cycle
on modern Intel CPUS. This really is plenty fast - if I'm not mistaken, it
translates to well over 10 GB/s.

So overall -1 for removing the shift.

best regards,
Florian Pflug

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Ants Aasma 2013-04-23 08:44:24 Re: Substituting Checksum Algorithm (was: Enabling Checksums)
Previous Message Jeff Davis 2013-04-23 07:17:28 Substituting Checksum Algorithm (was: Enabling Checksums)