From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "Todd A(dot) Cook" <tcook(at)blackducksoftware(dot)com>, pgsql-bugs(at)postgresql(dot)org, Andres Freund <andres(at)anarazel(dot)de> |
Subject: | Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop |
Date: | 2018-01-26 23:45:18 |
Message-ID: | 43825a04-d36c-2632-91fd-332e0fa2bd72@2ndquadrant.com |
Lists: | pgsql-bugs |
On 01/27/2018 12:22 AM, Tom Lane wrote:
> Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> writes:
>> I suspect you're right the hash is biased to lohalf bits, as you wrote
>> in the 19/12 message.
>
> I don't see any bias in what it's doing, which is basically xoring the
> two halves and hashing the result. It's possible though that Todd's
> data set contains values in which corresponding bits of the high and
> low halves are correlated somehow, in which case the xor would produce
> a lot of cancellation and a relatively small number of distinct outputs.
>
Hmm, that makes more sense than what I wrote. Probably time to get some
sleep or drink more coffee, I guess.
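
To make the cancellation effect described above concrete, here is a minimal sketch of just the folding step (xor of the two 32-bit halves). It is a simplification: it only handles non-negative values and leaves out the final hashing of the folded result. `fold_int8` and the sample data are hypothetical and are not taken from the PostgreSQL source or from Todd's data set.

```python
def fold_int8(val: int) -> int:
    """XOR the high and low 32-bit halves of a non-negative 64-bit value."""
    lo = val & 0xFFFFFFFF
    hi = (val >> 32) & 0xFFFFFFFF
    return lo ^ hi


# Hypothetical data set in which the high half always equals the low half,
# i.e. the two halves are perfectly correlated.
correlated = [(k << 32) | k for k in range(1, 1001)]

# Every value folds to 0, so whatever hash is applied afterwards sees a
# single input and can only produce a single output.
folded = {fold_int8(v) for v in correlated}
print(len(correlated), "inputs ->", len(folded), "distinct folded value(s)")
```

Real data would rarely be this extreme, but any partial correlation between corresponding bits shrinks the set of folded values in the same way.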
BTW what do you think about the fact that we only really generate ~63%
of the possible hash values (see my message from 11/12)? That seems a
bit unfortunate, although not unexpected for a simple hash function.
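
One possible reading of the ~63% figure: a function that behaves like a random mapping from N inputs to N possible outputs is expected to hit about 1 - 1/e (roughly 63.2%) of the outputs. The referenced message is not reproduced here, so whether the number was obtained this way is an assumption. The sketch below uses a seeded pseudo-random mapping as a stand-in for a hash function; `coverage` is a hypothetical helper.

```python
import math
import random


def coverage(n: int, seed: int = 0) -> float:
    """Fraction of n possible outputs hit by n pseudo-random draws."""
    rng = random.Random(seed)
    seen = {rng.randrange(n) for _ in range(n)}
    return len(seen) / n


n = 1 << 16
print(f"observed: {coverage(n):.3f}, 1 - 1/e: {1 - 1 / math.e:.3f}")
```

A bijective mixing function would reach every output, which is why a ratio near 63% suggests the function behaves more like a random mapping than a permutation.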
> If we weren't bound by backwards compatibility, we could consider
> changing to logic more like "if the value is within the int4 range,
> apply int4hash, otherwise hash all 8 bytes normally". But I don't see
> how we can change that now that hash indexes are first-class
> citizens.
>
Yeah, I've been thinking about that too. But I think it's an issue only
for pg_upgraded clusters, which may have hash indexes (and also
hash-partitioned tables). So couldn't we use new hash functions in fresh
clusters and use the backwards-compatible ones in pg_upgraded ones?
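
For illustration only, the alternative logic quoted above ("if the value is within the int4 range, apply int4hash, otherwise hash all 8 bytes normally") could be sketched as follows. All names are hypothetical, and CRC-32 is merely a stand-in 32-bit hash, not the function PostgreSQL uses.

```python
import struct
import zlib

INT4_MIN, INT4_MAX = -2**31, 2**31 - 1


def hash_bytes(data: bytes) -> int:
    # Stand-in 32-bit hash (CRC-32); an assumption made for this sketch.
    return zlib.crc32(data) & 0xFFFFFFFF


def hash_int4(val: int) -> int:
    # Hash the 4-byte little-endian representation of a 32-bit integer.
    return hash_bytes(struct.pack("<i", val))


def hash_int8_proposed(val: int) -> int:
    # Values that fit in int4 reuse the int4 hash, so equal values hash
    # identically in both types; everything else hashes all 8 bytes.
    if INT4_MIN <= val <= INT4_MAX:
        return hash_int4(val)
    return hash_bytes(struct.pack("<q", val))
```

With this shape, values outside the int4 range no longer go through the xor fold, so correlated halves cannot cancel each other out.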
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services