Re: Cost of XLogInsert CRC calculations

From: "Dann Corbit" <DCorbit(at)connx(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: "Manfred Koizar" <mkoi-pg(at)aon(dot)at>, "Mark Cave-Ayland" <m(dot)cave-ayland(at)webbased(dot)co(dot)uk>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Cost of XLogInsert CRC calculations
Date: 2005-05-18 04:44:30
Message-ID: D425483C2C5C9F49B5B7A41F89441547055BA9@postal.corporate.connx.com
Lists: pgsql-hackers

I probably shouldn't jump in, because I do not know the nature of the
usage of the CRC values.

But if the birthday paradox can come into play, with a 32 bit CRC you
reach even odds of at least one false match (two different items with
the same CRC) at roughly 77,000 items.
http://mathworld.wolfram.com/BirthdayProblem.html
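
For anyone who wants to check the arithmetic, here is a minimal sketch of
the usual birthday-bound approximation. It assumes the CRC values behave
like uniformly random 32 bit (or 64 bit) numbers, which is only an
approximation for a real CRC, and the function names are made up for this
illustration.

```python
import math


def items_for_collision_probability(bits: int, p: float = 0.5) -> float:
    """Approximate item count at which the chance of at least one
    collision among uniformly random `bits`-bit values reaches p.

    Uses the standard approximation n ~= sqrt(2 * N * ln(1 / (1 - p)))
    with N = 2**bits.
    """
    space = 2.0 ** bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p)))


def collision_probability(n: int, bits: int) -> float:
    """Approximate chance of at least one collision among n items:
    1 - exp(-n * (n - 1) / (2 * N)) with N = 2**bits."""
    space = 2.0 ** bits
    return -math.expm1(-n * (n - 1) / (2.0 * space))


if __name__ == "__main__":
    for bits in (32, 64):
        n = items_for_collision_probability(bits)
        print(f"{bits} bit: even odds of a collision near {n:,.0f} items")
```

Note that this only matters if values are compared against each other.
Checking one record against its own stored CRC is a different situation,
where the chance of a false match is about 1 in 2^32 per check.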

Probably you already knew that, and probably the birthday paradox does
not apply.

I generally use 64 bit checksums (UMAC, strictly a message
authentication code rather than a CRC) for just about anything that
needs a CRC.
http://www.cs.ucdavis.edu/~rogaway/umac/

A plausible work-around for platforms with awful 64 bit math/emulation
is to compute two distinct 32 bit hash values (e.g. [SDBM hash and FNV
hash] or [Bob Jenkins hash and D. J. Bernstein hash]), both of which
must match.
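
A rough sketch of the two-hash idea, using the SDBM and FNV pair. It
assumes the FNV-1a variant with the standard 32 bit offset basis and
prime, and the names dual_hash and matches are hypothetical. Python is
used only to show the logic; a real implementation would use native 32
bit arithmetic, which is the point of the work-around.

```python
MASK32 = 0xFFFFFFFF  # keep every intermediate result within 32 bits


def fnv1a_32(data: bytes) -> int:
    """32 bit FNV-1a: xor in each byte, then multiply by the FNV prime."""
    h = 0x811C9DC5  # standard 32 bit FNV offset basis
    for b in data:
        h ^= b
        h = (h * 0x01000193) & MASK32  # standard 32 bit FNV prime
    return h


def sdbm_32(data: bytes) -> int:
    """32 bit SDBM hash: h = b + (h << 6) + (h << 16) - h per byte."""
    h = 0
    for b in data:
        h = (b + (h << 6) + (h << 16) - h) & MASK32
    return h


def dual_hash(data: bytes) -> tuple[int, int]:
    """Return the pair of 32 bit hashes stored alongside the data."""
    return fnv1a_32(data), sdbm_32(data)


def matches(data: bytes, expected: tuple[int, int]) -> bool:
    """The data is accepted only if BOTH 32 bit values match."""
    return dual_hash(data) == expected


if __name__ == "__main__":
    record = b"example WAL record payload"
    stored = dual_hash(record)
    print(matches(record, stored))          # same bytes: both hashes agree
    print(matches(record + b"!", stored))   # altered bytes: compared again
```

How close the pair comes to a true 64 bit check depends on how
independent the two functions are, and these simple hashes do not carry
the burst-error guarantees of a CRC.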

> -----Original Message-----
> From: pgsql-hackers-owner(at)postgresql(dot)org
> [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Tom Lane
> Sent: Tuesday, May 17, 2005 9:26 PM
> To: Bruce Momjian
> Cc: Manfred Koizar; Mark Cave-Ayland; pgsql-hackers(at)postgresql(dot)org
> Subject: Re: [HACKERS] Cost of XLogInsert CRC calculations
>
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > Tom Lane wrote:
> >> Well, we need to understand exactly what is going on here. I'd not
> >> like to think that we dropped back from 64 to 32 bit because of one
> >> possibly-minor optimization bug in one compiler on one platform.
> >> Even if that compiler+platform is 90% of the market.
>
> > But isn't it obvious that almost any problem that CRC64 is going to
> > catch, CRC32 is going to catch, and we know CRC32 has to be faster
> > than CRC64?
>
> Do we know that? The results I showed put at least one fundamentally
> 32bit platform (the PowerBook I'm typing this on) at dead par for
> 32bit and 64bit CRCs. We have also established that 64bit CRC can be
> done much faster on 32bit Intel than it's currently done by the
> default PG-on-gcc build (hint: don't use -O2 or above). So while
> Mark's report that 64bit CRC is an issue on Intel is certainly true,
> it doesn't immediately follow that the only sane response is to give
> up 64bit CRC. We need to study it and see what alternatives we have.
>
> I do personally feel that 32bit is the way to go, but that doesn't
> mean I think it's a done deal. We owe it to ourselves to understand
> what we are buying and what we are paying for it.
>
> regards, tom lane
