Re: machine-dependent hash_any vs the regression tests

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Gregory Stark <stark(at)enterprisedb(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: machine-dependent hash_any vs the regression tests
Date:	2008-04-05 22:44:08
Message-ID:	11205.1207435448@sss.pgh.pa.us
Lists:	pgsql-hackers

Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> Why do we have this hash function anyways? Is hashany faster than a decent
> crc32 implementation?

Yes, significantly. Times to hash 32K bytes 100000 times on a Xeon EM64T:

hash_crc(32K): 11.388755 s
hash_any_old(32K): 4.401945 s
hash_any(32K): 3.862427 s

hash_crc is our src/include/utils/pg_crc.h code, hash_any_old is current
CVS HEAD, and hash_any is the word-wide version. For just 8 bytes (100M
repetitions):

hash_crc(8 bytes): 2.587647 s
hash_any_old(8 bytes): 1.581826 s
hash_any(8 bytes): 1.294480 s

so in both setup and per-byte terms CRC is more expensive. But the
bigger problem is that CRC isn't necessarily designed to have the
properties we need, in particular that all bits of the hash are about
equally random. It's designed to attack other problems.
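
For anyone who wants the timings above as per-unit costs, here is a rough sketch of the arithmetic. It assumes "32K" means 32768 bytes and "100M" means 100,000,000 calls; the helper names ns_per_unit and print_costs are made up for this illustration and are not part of the benchmark that produced the numbers.

```c
#include <stdio.h>

/* Nanoseconds per unit of work: total elapsed seconds spread over `units`. */
double ns_per_unit(double seconds, double units)
{
    return seconds * 1e9 / units;
}

/*
 * Print per-byte cost for the 32K run and per-call cost for the 8-byte run.
 * Assumptions (not stated explicitly in the message): 32K = 32768 bytes,
 * 100M = 100,000,000 calls.
 */
void print_costs(const char *name, double secs_32k, double secs_8b)
{
    const double bytes_32k = 32768.0 * 100000.0;  /* total bytes hashed */
    const double calls_8b = 100e6;                /* total 8-byte calls */

    printf("%-13s %.2f ns/byte at 32K, %.1f ns/call at 8 bytes\n",
           name,
           ns_per_unit(secs_32k, bytes_32k),
           ns_per_unit(secs_8b, calls_8b));
}
```

Calling print_costs("hash_crc", 11.388755, 2.587647) and likewise for the other two rows gives roughly 3.48, 1.34 and 1.18 ns/byte at 32K, and 25.9, 15.8 and 12.9 ns/call at 8 bytes. On these figures CRC is about 2.9 times slower than the word-wide hash_any on long inputs and about 2.0 times slower on 8-byte inputs.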

regards, tom lane

In response to

Re: machine-dependent hash_any vs the regression tests at 2008-04-05 22:28:07 from Gregory Stark