From: | Ants Aasma <ants(dot)aasma(at)eesti(dot)ee> |
---|---|
To: | Jeff Davis <pgsql(at)j-davis(dot)com> |
Cc: | Florian Pflug <fgp(at)phlo(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
Subject: | Re: Substituting Checksum Algorithm (was: Enabling Checksums) |
Date: | 2013-04-23 08:44:24 |
Message-ID: | CA+CSw_temxBkeyQYykJfLea11LvTT0vYVjp4y6sVjnAraVWbRA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Apr 23, 2013 10:17 AM, "Jeff Davis" <pgsql(at)j-davis(dot)com> wrote:
> Attached is my reorganization of Ants's patch here:
>
> http://www.postgresql.org/message-id/CA
> +CSw_vinyf-w45i=M1m__MpJZY=e8S4Nt_KNnpEbtWjTOaSUA(at)mail(dot)gmail(dot)com
Thanks for your help. Some notes below.
> My changes:
>
> * wrest the core FNV algorithm from the specific concerns of a data page
> - PageCalcChecksum16 mixes the block number, reduces to 16 bits,
> and avoids the pd_checksum field
> - the checksum algorithm is just a pure block checksum with a 32-bit
> result
> * moved the FNV checksum into a separate file, checksum.c
I think the function should not be called checksum_fnv as it implies that
we use the well known straightforward implementation. Maybe checksum_block
or some other generic name.
> * added Ants's suggested compilation flags for better optimization
-msse4.1 is not safe to use in default builds. On the other hand it doesn't
hurt to just specify it in CFLAGS for the whole compile (possibly as
-march=native). We should just omit it and mention somewhere that SSE4.1
enabled builds will have better checksum performance.
> * slight update to the paragraph in the README that discusses concerns
> specific to a data page
>
> I do have a couple questions/concerns about Ants's patch though:
>
> * The README mentions a slight bias; does that come from the mod
> (2^16-1)? That's how I updated the README, so I wanted to make sure.
Yes.
> * How was the FNV_PRIME chosen?
I still haven't found the actual source for this value. It's specified as
the value to use for 32bit FNV-1a.
> * I can't match the individual algorithm step as described in the README
> to the actual code. And the comments in the README don't make it clear
> enough which one is right (or maybe they both are, and I'm just tired).
I will try to reword.
> The README says:
>
> hash = (hash ^ value) * ((hash ^ value) >> 3)
>
> (which is obviously missing the FNV_PRIME factor) and the code says:
>
> +#define CHECKSUM_COMP(checksum, value) do {\
> + uint32 __tmp = (checksum) ^ (value);\
> + (checksum) = __tmp * FNV_PRIME ^ (__tmp >> 3);\
> +} while (0)
>
> I'm somewhat on the fence about the "shift right". It was discussed in
> this part of the thread:
>
>
http://www.postgresql.org/message-id/99343716-5F5A-45C8-B2F6-74B9BA357396@phlo.org
>
> I think we should be able to state with a little more clarity in the
> README why there is a problem with plain FNV-1a, and why this
> modification is both effective and safe.
Florian already mentioned why it's effective. I have an intuition why it's
safe, will try to come up with a well reasoned argument.
Regards,
Antd Aasma
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2013-04-23 08:47:38 | Re: Substituting Checksum Algorithm (was: Enabling Checksums) |
Previous Message | Florian Pflug | 2013-04-23 08:24:40 | Re: Substituting Checksum Algorithm (was: Enabling Checksums) |