From: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> |
---|---|
To: | Greg Stark <stark(at)mit(dot)edu> |
Cc: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: Combining hash values |
Date: | 2016-08-01 11:24:15 |
Message-ID: | CAEZATCV=xBugH1QPnG3sH06nR7yZ272n_C6jCBCLjXe2jypv9w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 1 August 2016 at 08:19, Greg Stark <stark(at)mit(dot)edu> wrote:
> Surely combining multiple hashes is the same problem as hashing a block of
> memory? Shouldn't we just use the same algorithm as hash_any()?
>
Yes, I imagine that should work well. I suspect that Thomas's proposal
would also probably work well, as would a number of other hashing
algorithms with reasonable pedigree, such as the one used for array
hashing. I don't have any particular preference, but I do know that
what usually turns out to be disastrous is an arbitrary made-up
formula like rotating and xor'ing. The last thing we should attempt to
do is invent our own hashing algorithms.
On that subject, while looking at hashfunc.c, I spotted that
hashint8() has a very obvious deficiency, which causes disastrous
performance with certain inputs:
create table foo (a bigint);
insert into foo select (x*2^32)+x from generate_series(1,1000000) g(x);
analyse foo;
select * from foo f1, foo f2 where f1.a=f2.a;
With random values that query (using a hash join) takes around 2
seconds on my machine, but with the values above I estimate it will
take 20 hours (although I didn't wait that long!). The reason is
pretty obvious -- all the bigint values above hash to the same value.
I'd suggest using hash_uint32() for values that fit in a 32-bit
integer and hash_any() otherwise.
Regards,
Dean
From | Date | Subject | |
---|---|---|---|
Next Message | Etsuro Fujita | 2016-08-01 11:25:59 | Re: Oddity in EXPLAIN for foreign/custom join pushdown plans |
Previous Message | Etsuro Fujita | 2016-08-01 11:15:03 | Re: Oddity in EXPLAIN for foreign/custom join pushdown plans |