Re: Enabling Checksums

From: Ants Aasma <ants(at)cybertec(dot)at>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Enabling Checksums
Date: 2013-04-16 21:41:30
Message-ID: CA+CSw_tcQHE12Lnp9xCbgm1c26R3tAZoqnY5YN+_xEvPPwpzFA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Apr 16, 2013 at 11:20 PM, Florian Pflug <fgp(at)phlo(dot)org> wrote:
> On Apr13, 2013, at 17:14 , Ants Aasma <ants(at)cybertec(dot)at> wrote:
>> Based on current analysis, it is particularly good at detecting single
>> bit errors, as good at detecting burst errors as can be expected from
>> 16 bits and not horrible at detecting burst writes of zeroes. It is
>> quite bad at detecting multiple uncorrelated single bit errors and
>> extremely bad at detecting repeating patterns of errors in low order
>> bits.
>
> I've read the patch and tried to understand why it's that bad at
> detecting repeating patterns of errors in low order bits, and to see
> if there might be a way to fix that without too much of a performance
> impact.
>
> Here's what I gather the algorithm does:
>
> It treats the input data, a page of L bytes, as an Nx64 matrix V
> of 16-bit quantities (N = L/128, since each row holds 64 16-bit
> words; N = 64 for an 8K page).
> It then first computes (using two primes p (PRIME1) and q (PRIME2))
>
> S = V[1,1]*p^63*q^63 + V[1,2]*p^63*q^62 + … + V[1,64]*p^63*q^0
> + V[2,1]*p^62*q^63 + V[2,2]*p^62*q^62 + … + V[2,64]*p^62*q^0
> + …
> + V[N,1]*p^0 *q^63 + V[N,2]*p^0 *q^62 + … + V[N,64]*p^0 *q^0
> (mod 2^16)
> = sum V[i,j]*p^(64-i)*q^(64-j)
>
> Note that it does that by first computing the row-wise sums without
> the q^i coefficient, and then (in what the code calls the aggregation
> phase) combines those row-wise sums into a total, adding the q^i-
> coefficients along the way.
>
> The final hash value is then
>
> H = S * p + B * q mod 2^16
>
> where B is a salt value intended to detect swapped pages (currently
> B is simply the page index)

Great job analyzing the analytic form of the algorithm, and sorry you
had to reverse engineer it instead of finding it in the documentation.
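
For reference, here is a minimal C sketch of that analytic form. This
just evaluates the closed form via Horner's rule; it does not mirror the
patch's vectorized evaluation order, and the prime and salt arguments
are placeholders, not the actual PRIME1/PRIME2:

#include <stdint.h>

/* An 8K page holds 4096 16-bit words, viewed as a 64x64 matrix V. */
static uint16_t
checksum_analytic(uint16_t V[64][64], uint16_t p, uint16_t q, uint16_t salt)
{
	uint32_t	S = 0;

	for (int i = 0; i < 64; i++)
	{
		uint32_t	row = 0;

		/* Horner in q: row = sum_j V[i][j] * q^(63-j)  (0-based) */
		for (int j = 0; j < 64; j++)
			row = row * q + V[i][j];

		/* Horner in p: S = sum_i row_i * p^(63-i) */
		S = S * p + row;
	}

	/* H = S*p + B*q (mod 2^16), B being the page-index salt */
	return (uint16_t) (S * p + (uint32_t) salt * q);
}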

> This raises two question. First, why are there two primes? You could
> just as well using a single prime q and set p=q^64 mod 2^16. You then
> get
> S = sum V[i,j] * q^(64*(64-i) + (64-j))
> = sum V[i,j] * q^(4096 - 64*(i-1) - j)
> You get higher prime powers that way, but you can easily choose a prime
> that yields distinct values mod 2^16 for exponents up to 16383. Your
> PRIME2, for example, does. (It wraps around for 16384, i.e.
> PRIME2^16384 = 1 mod 2^16, but that's true for every possible prime since
> 16384 is the Carmichael function's value at 2^16)

The experimental detection rate is about the same if we use a single
prime. But I think you have the analytical form wrong here. Given
q = p, it should be:

S = sum V[i,j] * p^(64-i) * p^(64-j)
  = sum V[i,j] * p^(64 - i + 64 - j)
  = sum V[i,j] * p^(128 - i - j)

This makes the whole matrix symmetric. While I can't think of any real
world errors that would exhibit symmetry in this 64x64 matrix, it
seemed better to me to avoid the issue altogether and use different
primes. IIRC it helped a lot for the case of page[i] = i & 0xFF.

> Second, why does it use addition instead of XOR? It seems that FNV
> usually XORs the terms together instead of adding them?

Testing showed slightly better detection rate for adds. Intuitively I
think it's because the carry introduces some additional mixing.
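
To give a concrete (if simplistic) illustration of that intuition:
0x00FF + 0x0001 = 0x0100, so a difference in the low bits can carry all
the way up into bit 8, whereas 0x00FF XOR 0x0001 = 0x00FE can never
disturb any bit beyond those set in the operands.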

> Regarding the bad behaviour for multiple low-bit errors - can you
> explain why it behaves badly in that case? I currently fail to see
> why that would be. I *can* see that the lowest bit of the hash depends
> only on the lowest bit of the input words, but as long as the lowest
> bits of the input words also affect other bits of the hash, that shouldn't
> matter. Which I think they do, but I might be missing something...

Looks like you're right. I was somehow concentrating only on how the
lowest bits depend on the input.
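
For the record, the reason the lowest bit behaves that way: p and q are
odd, so every coefficient p^a*q^b is odd as well, and bit 0 of S is
simply the XOR of bit 0 of all the V[i,j]. But since the higher bits of
those odd coefficients vary, each input's low bit is also multiplied up
into bits 1..15, which is the dependence I was overlooking.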

> Here, btw, is a page on FNV hashing. It mentions a few rules for
> picking suitable primes
>
> http://www.isthe.com/chongo/tech/comp/fnv

Unfortunately the rules don't apply here because of the hash size: the
FNV primes on that page are defined for 32-bit and larger hashes, while
we only have 16 bits to work with.

Thanks for your analysis. I will do my best to get this all into a
document and will do some more analysis to see if I can come up with
some kind of proof for the error cases.

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
