From: | Kenneth Marshall <ktm(at)rice(dot)edu> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Hash support for arrays |
Date: | 2010-11-02 20:46:18 |
Message-ID: | 20101102204618.GW27429@aart.is.rice.edu |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Nov 02, 2010 at 04:42:19PM -0400, Tom Lane wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > On Tue, Nov 2, 2010 at 11:21 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> Really? I think "I don't understand when this fails" isn't obviously
> >> better than being able to predict when it fails ...
>
> > Isn't that the whole point of hash functions? Collisions are
> > inevitable, but you want them to be unpredictable. If you want a hash
> > function with predictable collision behavior, just has the first
> > element. It'll be highly predictable AND wicked fast.
>
> That seems like a rather poor straw man, since it suffers from exactly
> the defect I'm complaining about, namely failing to consider all the
> input values equally.
>
> >> I'd be happier about this approach if there were some actual theory
> >> behind it ... maybe there's some analysis out there, but the one link
> >> that was given was just about entirely unconvincing.
>
> > I think it's from Knuth, though unfortunately I don't have a copy to
> > check. There are probably better algorithms out there, but this one's
> > pretty simple.
>
> I don't see anything in Knuth suggesting a multiplier of 31. His
> recommendation for a multiplier, if you're going to use multiplicative
> hashing, is wordsize/phi (phi being the golden ratio) ... and he also
> wants you to keep the high order not the low order bits of the product.
>
> However, this is largely beside the point, because that theory, as well
> as the Java code you're arguing from, has to do with the initial hashing
> of a raw sequence of input items. Not with combining some existing hash
> values. The rotate-and-xor method I suggested for that is borrowed
> exactly from section 6.4 of Knuth (page 512, in the first edition of
> volume 3).
>
> regards, tom lane
>
Given that our hash implimentation mixes the input data well (It does.
I tested it.) then a simple rotate-and-xor method is all that should
be needed to maintain all of the needed information. The original
hash function has done the heavy lifting in this case.
Regards,
Ken
From | Date | Subject | |
---|---|---|---|
Next Message | Sam Mason | 2010-11-02 21:26:22 | Re: Hash support for arrays |
Previous Message | Tom Lane | 2010-11-02 20:42:19 | Re: Hash support for arrays |