Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Todd A(dot) Cook" <tcook(at)blackducksoftware(dot)com>, pgsql-bugs(at)postgresql(dot)org, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop
Date: 2018-01-26 23:45:18
Message-ID: 43825a04-d36c-2632-91fd-332e0fa2bd72@2ndquadrant.com
Lists: pgsql-bugs

On 01/27/2018 12:22 AM, Tom Lane wrote:
> Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> writes:
>> I suspect you're right the hash is biased to lohalf bits, as you wrote
>> in the 19/12 message.
>
> I don't see any bias in what it's doing, which is basically xoring the
> two halves and hashing the result. It's possible though that Todd's
> data set contains values in which corresponding bits of the high and
> low halves are correlated somehow, in which case the xor would produce
> a lot of cancellation and a relatively small number of distinct outputs.
>

Hmm, that makes more sense than what I wrote. Probably time to get some
sleep or drink more coffee, I guess.
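
To make the cancellation scenario concrete, here is a minimal Python sketch of the folding step only. It is not the PostgreSQL source: the names fold_int8 and MASK32 are made up, the final 32-bit hash is left out, and complementing the high half for negative inputs is an assumption about how int4 compatibility is kept. The "correlated" data set is invented; nothing here says Todd's data actually looks like this.

```python
MASK32 = 0xFFFFFFFF

def fold_int8(val: int) -> int:
    """Fold a signed 64-bit integer down to 32 bits by xoring its halves."""
    lo = val & MASK32
    hi = (val >> 32) & MASK32
    # Assumption: negative values use the complemented high half, so any
    # value in the int4 range folds to exactly its own low 32 bits.
    if val < 0:
        hi = ~hi & MASK32
    return lo ^ hi

# Invented worst case: the high half always equals the low half.
correlated = [(k << 32) | k for k in range(1, 100001)]
distinct_folds = {fold_int8(v) for v in correlated}

print(len(correlated), "inputs ->", len(distinct_folds), "distinct folded values")
```

Every input in that list folds to the same 32-bit word, so whatever hash is applied afterwards cannot separate them. Weaker correlation between the halves would give partial rather than total cancellation.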

BTW what do you think about the fact that we only really generate ~63%
of the possible hash values (see my message from 11/12)? That seems a
bit unfortunate, although not unexpected for a simple hash function.
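
For reference, ~63% is close to 1 - 1/e (about 63.2%), which is the share of slots an ideal random function fills when it maps N inputs into N slots. The sketch below checks that figure by simulation. It assumes the measurement used roughly as many inputs as there are possible outputs, which may not match the setup in the earlier message, and it uses Python's random module as a stand-in for a hash function.

```python
import math
import random

def occupied_fraction(n_slots: int, n_inputs: int, seed: int = 0) -> float:
    """Throw n_inputs uniformly at n_slots and report the share of slots hit."""
    rng = random.Random(seed)
    hit = {rng.randrange(n_slots) for _ in range(n_inputs)}
    return len(hit) / n_slots

n = 1 << 16
simulated = occupied_fraction(n, n)
expected = 1.0 - math.exp(-1.0)

print(f"simulated {simulated:.4f}, 1 - 1/e = {expected:.4f}")
```

If that assumption holds, covering ~63% of the output space is what a well-behaved hash would do too, so the number alone does not point at a defect.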

> If we weren't bound by backwards compatibility, we could consider
> changing to logic more like "if the value is within the int4 range,
> apply int4hash, otherwise hash all 8 bytes normally". But I don't see
> how we can change that now that hash indexes are first-class
> citizens.
>

Yeah, I've been thinking about that too. But I think it's an issue only
for pg_upgraded clusters, which may have hash indexes (and also
hash-partitioned tables). So couldn't we use new hash functions in fresh
clusters and use the backwards-compatible ones in pg_upgraded ones?
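
To illustrate the alternative logic quoted above ("if the value is within the int4 range, apply int4hash, otherwise hash all 8 bytes normally"), here is a rough Python sketch. All names are hypothetical, and zlib.crc32 is only a stand-in for the real hash function, so the actual hash values mean nothing. Only the branching is the point.

```python
import struct
import zlib

INT4_MIN, INT4_MAX = -2**31, 2**31 - 1

def standin_hash(data: bytes) -> int:
    # Stand-in only: NOT the hash PostgreSQL uses.
    return zlib.crc32(data)

def hash_int4(val: int) -> int:
    return standin_hash(struct.pack("<i", val))

def hash_int8_proposed(val: int) -> int:
    """Hypothetical variant: reuse the int4 hash when the value fits in int4."""
    if INT4_MIN <= val <= INT4_MAX:
        return hash_int4(val)
    # Otherwise hash all 8 bytes, with no xor folding of the halves.
    return standin_hash(struct.pack("<q", val))

# Values whose high half equals the low half no longer collapse together.
correlated = [(k << 32) | k for k in range(1, 1001)]
distinct_hashes = {hash_int8_proposed(v) for v in correlated}
print(len(distinct_hashes), "distinct hashes for", len(correlated), "inputs")
```

This keeps int4 and int8 hashes equal for values that fit in int4, while inputs with correlated halves stay distinguishable. It would still change the hash of every out-of-range int8 value, which is the on-disk compatibility problem discussed above.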

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
