From: | Greg Stark <gsstark(at)mit(dot)edu> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Greg Stark <gsstark(at)mit(dot)edu>, Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>, Bruno Wolff III <bruno(at)wolff(dot)to>, "Jim C(dot) Nasby" <jim(at)nasby(dot)net>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Horribly slow hash join |
Date: | 2004-04-19 12:12:54 |
Message-ID: | 87vfjw5fp5.fsf@stark.xeocode.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Greg Stark <gsstark(at)mit(dot)edu> writes:
> > If the hash tables were made a power of two then it would be possible to mix
> > the bits of the 32 bit value and just mask off the unneeded bits. I've found
> > one page via google that mentions mixing bits in a hash function, but I would
> > look for a more serious treatment somewhere.
> Modding by a *non* power of 2 (esp. a prime) mixes the bits quite well,
> and is likely faster than any multiple-instruction way to do the same.
Well a) any number that has any factors of two fails to mix in some bits.
That's a lot more common than non powers of two. b) The postgres code makes no
attempt to make the number of buckets a prime and c) Even if the number of
buckets were prime then it seems it would still be too easy to find real-world
data where all the data have that prime as a factor. As it is they only need
to have common factors to lose.
> The quoted article seems to be by someone who has spent a lot of time
> counting assembly cycles and none at all reading the last thirty years
> worth of CS literature.
Yes, well I did note that.
--
greg
From | Date | Subject | |
---|---|---|---|
Next Message | Greg Stark | 2004-04-19 12:16:36 | Re: Horribly slow hash join |
Previous Message | Dave Cramer | 2004-04-19 12:07:44 | Re: Horribly slow hash join |