From: | "Dann Corbit" <DCorbit(at)connx(dot)com> |
---|---|
To: | "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us> |
Cc: | "Manfred Koizar" <mkoi-pg(at)aon(dot)at>, "Mark Cave-Ayland" <m(dot)cave-ayland(at)webbased(dot)co(dot)uk>, <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Cost of XLogInsert CRC calculations |
Date: | 2005-05-18 04:44:30 |
Message-ID: | D425483C2C5C9F49B5B7A41F89441547055BA9@postal.corporate.connx.com |
Lists: | pgsql-hackers |
I probably shouldn't jump in, because I do not know the nature of the
usage of the CRC values.
But if the birthday paradox can come into play, with a 32 bit CRC, you
should expect a false match (two different items with the same CRC)
once you have 78,643 items or so.
http://mathworld.wolfram.com/BirthdayProblem.html
Probably you already knew that, and probably the birthday paradox does
not apply.
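As a rough check on that figure, here is a minimal sketch of the standard birthday approximation, assuming the CRC values behave like uniformly random 32 bit numbers and that every item is compared against every other item. The function names are mine, not anything from the PostgreSQL source.

```python
import math


def collision_probability(n_items: int, bits: int = 32) -> float:
    """Approximate chance that at least two of n_items uniformly random
    `bits`-bit values are equal: 1 - exp(-n(n-1) / (2 * 2**bits))."""
    space = 2.0 ** bits
    return -math.expm1(-n_items * (n_items - 1) / (2.0 * space))


def items_for_probability(p: float, bits: int = 32) -> float:
    """Approximate number of items needed before the chance of at least
    one collision reaches p (inverse of the formula above)."""
    space = 2.0 ** bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p)))


if __name__ == "__main__":
    print(f"P(collision) at 78,643 items, 32 bit: {collision_probability(78643):.3f}")
    print(f"items for 50% chance, 32 bit: {items_for_probability(0.5, 32):,.0f}")
    print(f"items for 50% chance, 64 bit: {items_for_probability(0.5, 64):,.0f}")
```

Under those assumptions the 50% point for 32 bits lands a little above 77,000 items, so the number above is in the right neighborhood. None of this applies if each CRC is only ever compared against the single value stored with its own record.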
I generally use 64 bit CRCs (UMAC) for just about anything that needs a
CRC.
http://www.cs.ucdavis.edu/~rogaway/umac/
A plausible work-around for platforms with awful 64 bit math/emulation
is to compute two distinct 32 bit hash values (e.g. [SDBM hash and FNV
hash] or [Bob Jenkins hash and D. J. Bernstein hash]), both of which
must match.
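To make the idea concrete, here is a minimal sketch of the SDBM plus FNV pairing, using only 32 bit arithmetic. It assumes the FNV-1a variant and the usual SDBM recurrence; the helper names `paired_digest` and `matches` are hypothetical and only for illustration.

```python
MASK32 = 0xFFFFFFFF


def sdbm_32(data: bytes) -> int:
    """SDBM hash: h = byte + (h << 6) + (h << 16) - h, kept to 32 bits."""
    h = 0
    for b in data:
        h = (b + (h << 6) + (h << 16) - h) & MASK32
    return h


def fnv1a_32(data: bytes) -> int:
    """FNV-1a hash, 32 bit: XOR in each byte, then multiply by the FNV prime."""
    h = 0x811C9DC5  # FNV 32 bit offset basis
    for b in data:
        h ^= b
        h = (h * 0x01000193) & MASK32  # FNV 32 bit prime
    return h


def paired_digest(data: bytes) -> tuple:
    """Return both 32 bit hashes; together they act as one 64 bit check."""
    return (sdbm_32(data), fnv1a_32(data))


def matches(data: bytes, expected: tuple) -> bool:
    """Accept the data only if BOTH hash values match."""
    return paired_digest(data) == expected


if __name__ == "__main__":
    record = b"example WAL record payload"
    digest = paired_digest(record)
    print(matches(record, digest))            # same bytes: both hashes agree
    print(matches(record + b"x", digest))     # altered bytes
```

Note that these are general purpose hashes rather than CRCs, so they do not carry a CRC's guarantees about detecting short burst errors.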
> -----Original Message-----
> From: pgsql-hackers-owner(at)postgresql(dot)org [mailto:pgsql-hackers-
> owner(at)postgresql(dot)org] On Behalf Of Tom Lane
> Sent: Tuesday, May 17, 2005 9:26 PM
> To: Bruce Momjian
> Cc: Manfred Koizar; Mark Cave-Ayland; pgsql-hackers(at)postgresql(dot)org
> Subject: Re: [HACKERS] Cost of XLogInsert CRC calculations
>
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > Tom Lane wrote:
> >> Well, we need to understand exactly what is going on here. I'd not
> >> like to think that we dropped back from 64 to 32 bit because of one
> >> possibly-minor optimization bug in one compiler on one platform.
> >> Even if that compiler+platform is 90% of the market.
>
> > But isn't it obvious that almost any problem that CRC64 is going to
> > catch, CRC32 is going to catch, and we know CRC32 has to be faster
> > than CRC64?
>
> Do we know that? The results I showed put at least one fundamentally
> 32bit platform (the PowerBook I'm typing this on) at dead par for
> 32bit and 64bit CRCs. We have also established that 64bit CRC can be
> done much faster on 32bit Intel than it's currently done by the
> default PG-on-gcc build (hint: don't use -O2 or above). So while
> Mark's report that 64bit CRC is an issue on Intel is certainly true,
> it doesn't immediately follow that the only sane response is to give
> up 64bit CRC. We need to study it and see what alternatives we have.
>
> I do personally feel that 32bit is the way to go, but that doesn't
> mean I think it's a done deal. We owe it to ourselves to understand
> what we are buying and what we are paying for it.
>
> regards, tom lane
>