From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | "Todd A(dot) Cook" <tcook(at)blackducksoftware(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org> |
Subject: | Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop |
Date: | 2017-12-06 22:55:19 |
Message-ID: | 16912.1512600919@sss.pgh.pa.us |
Lists: | pgsql-bugs |
Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2017-12-06 16:38:18 -0500, Todd A. Cook wrote:
>> I found this problem when I dropped 10.1 into a test environment to see
>> what would happen. There was no deliberate attempt to break anything.
> Read Tomas' message at: http://archives.postgresql.org/message-id/263b03b1-3e1c-49ca-165a-8ac6751419c4%402ndquadrant.com
I'm confused by Tomas' claim that
>> (essentially hashint8 only ever produces 60% of
>> values from [0,1000000], which likely increases collision rate).
This is directly contradicted by the simple experiments I've done, e.g.:

regression=# select count(distinct hashint8(v)) from generate_series(0,1000000::int8) v;
 count
--------
 999879
(1 row)

regression=# select count(distinct hashint8(v) & (1024*1024-1)) from generate_series(0,1000000::int8) v;
 count
--------
 644157
(1 row)

regression=# select count(distinct hashint8(v) & (1024*1024-1)) from generate_series(0,10000000::int8) v;
  count
---------
 1048514
(1 row)

It's certainly not perfect, but I'm not observing any major failure to
span the output space.
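
As a rough sanity check on the counts above, one can compare them with what an idealized hash would give: if n keys are mapped uniformly at random into m possible values, the expected number of distinct values hit is m * (1 - (1 - 1/m)^n). The sketch below is not part of the original experiments; the function name expected_distinct is hypothetical, and the uniform-random model is an assumption, not a description of how hashint8 works.

```python
import math


def expected_distinct(n_keys: int, n_buckets: int) -> float:
    """Expected number of distinct outputs when n_keys inputs are mapped
    uniformly at random into n_buckets values (idealized model, an
    assumption; this is not hashint8 itself)."""
    # m * (1 - (1 - 1/m)^n), written with log1p/expm1 so the result stays
    # accurate when 1/m is tiny (e.g. m = 2**32).
    return -n_buckets * math.expm1(n_keys * math.log1p(-1.0 / n_buckets))


# (description, number of keys, output space size, count observed above)
cases = [
    ("full 32-bit hash, keys 0..1000000", 1_000_001, 2**32, 999879),
    ("low 20 bits, keys 0..1000000", 1_000_001, 2**20, 644157),
    ("low 20 bits, keys 0..10000000", 10_000_001, 2**20, 1048514),
]

for label, n, m, observed in cases:
    model = expected_distinct(n, m)
    print(f"{label}: model ~{model:.0f}, observed {observed}")
```

Under that model the three cases come out at roughly 999,885, 644,500 and 1,048,500, all close to the observed counts. In particular, a million keys masked to 20 bits would be expected to hit only about 61% of the 1048576 possible values even with an ideal hash.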
regards, tom lane