From: | hernan gonzalez <hgonzalez(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Hash support for arrays |
Date: | 2010-11-01 21:27:12 |
Message-ID: | AANLkTimurNWF2Wq_g0G2jiceZjdT_Y5-Qk=qZwCNfPDm@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
>Hmm. I am reminded of Knuth's famous dictum: "never generate random
numbers with a method chosen at random". Is there any actual theory
behind that algorithm, and if so what is it? The combination of
shifting with addition (not xor) seems more likely to lead to weird
cancellations than any improvement in the hash behavior.
For what's worth: the standard way in Java is multiplying by 31.
Which 31 amounts to a 5 bit shift and a substraction.
In general, a prime number is recommended though one would like that
Knuth made some analysis here... it all seems mostly empirical.
http://computinglife.wordpress.com/2008/11/20/why-do-hash-functions-use-prime-numbers/
Perhaps the 5 bit shift is related to the spread of ascii charpoints,
as that hashcode() implementation was mainly tested and implemented
for a String. But now it's also suggested for general objects, and
it's even specified for standard collections: for example
http://download.oracle.com/javase/6/docs/api/java/util/List.html#hashCode()
Hernán J. González
From | Date | Subject | |
---|---|---|---|
Next Message | Alex Hunsaker | 2010-11-01 22:30:18 | Re: why does plperl cache functions using just a bool for is_trigger |
Previous Message | Alex Hunsaker | 2010-11-01 21:24:03 | Re: why does plperl cache functions using just a bool for is_trigger |