From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
Cc: pgsql-performance(at)postgreSQL(dot)org
Subject: Re: Horribly slow hash join
Date: 2004-04-17 16:08:15
Message-ID: 10603.1082218095@sss.pgh.pa.us
Lists: pgsql-performance
[ resending because I fat-fingered the cc: to the list ]
I see the problem: all the entries in your work_units column have the
low 32 bits equal to zero.
regression=# select distinct work_units % (2^32)::bigint from Trank_work_overall;
 ?column?
----------
        0
(1 row)
The hash function for int8 only takes the low word into account, so all
of the entries end up on the same hash chain, resulting in worst-case
behavior. This applies to both your hash join and hash aggregate cases.
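To illustrate the effect, here is a minimal C sketch (illustrative only, not PostgreSQL's actual source; the function name and bucket count are made up) of a hash that keeps only the low 32 bits, so every value whose low word is zero lands on the same chain:

#include <stdint.h>
#include <stdio.h>

/* Hypothetical low-word-only hash, mirroring the behavior described
 * above (a sketch, not PostgreSQL source code). */
static uint32_t hash_int8_low_word(int64_t v)
{
    return (uint32_t) v;        /* the high 32 bits are discarded */
}

int main(void)
{
    /* work_units values with the low 32 bits zero, i.e. n * 2^32,
     * matching the % (2^32) test in the query above */
    for (int64_t n = 1; n <= 4; n++)
    {
        int64_t v = n << 32;
        printf("%lld -> bucket %u\n",
               (long long) v, hash_int8_low_word(v) % 1024);
    }
    return 0;   /* every value hashes to 0: one long hash chain */
}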
We could change the hash function, perhaps, but then we'd just have
different cases where there's a problem ... hashing will always fail on
*some* set of inputs. (Also, I have been harboring some notions of
supporting cross-type hash joins for integer types, which will not work
unless small int8 values hash the same as int4 etc.)
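A sketch of that constraint (again hypothetical code, not the actual implementation): for every value in int4's range, hashing it as int8 must give the same result as hashing it as int4, which a low-word hash satisfies by construction:

#include <assert.h>
#include <stdint.h>

/* Hypothetical per-type hash functions; the cross-type hash join
 * requirement is that they agree on every value int4 can represent. */
static uint32_t hash_int4(int32_t v) { return (uint32_t) v; }
static uint32_t hash_int8(int64_t v) { return (uint32_t) v; /* low word */ }

int main(void)
{
    int32_t samples[] = { 0, 1, -1, 42, 2147483647 };

    for (int i = 0; i < 5; i++)
        /* the same value as int4 and as int8 must hash identically,
         * or a cross-type hash join would miss matching rows */
        assert(hash_int4(samples[i]) == hash_int8((int64_t) samples[i]));

    return 0;
}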
I guess the real issue is why are you encoding work_units like that?
regards, tom lane