Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "Todd A(dot) Cook" <tcook(at)blackducksoftware(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop
Date: 2018-01-29 01:02:13
Message-ID: 20180129010213.rt3nu73ilff756f6@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On 2018-01-26 18:48:35 -0500, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On 2017-12-10 23:09:42 +0100, Tomas Vondra wrote:
> >> FWIW I do agree the data sets shared in this thread are pretty extreme
> >> and it doesn't make much sense to slow the regular cases. I'll be
> >> perfectly happy if we stop the OOM, making those cases fast is a bonus.
>
> > Yea, agreed on that. I'm kinda inclined to go for stop-growing in 10,
> > and so something better in 11. And then later possibly backpatch if
> > we've grown some confidence?
>
> +1. I'd like to see some response to this included in 10.2, and time
> grows short for that.

Here are two patches that I think we want for 10.2, and the start of one
that I think we want for master. 0002 is needed because otherwise the
lack of extra growth leads to noticeably worse performance when filling
an underestimated a coordinator hash table from the workers - turns out
our hash combine (and most hash combines) let a lot of clustering
survive. By adding a final hashing round the bit perturbation is near
perfect. The commit messages need to be polished a bit, but other than
that I think these are reasonable fixes. Plan to push by Monday evening
at the latest.

The third patch is a version of the random IV discussed in this
thread. I do think we want to add usage of the extended hash functions,
as prototyped by Tomas, as that actually helps to fix issues with actual
hash conflicts. But we additionally need a fallback path for types
without extended hashtables, and the random IV is a good idea
nonetheless. There's no ABI difference in my patch, so I think this is
actually something we could backpatch. But I don't think it's urgent, so
I'm not planning to do that for 10.2. The one thing that could confuse
people is that it can lead to output order changes from run to run - I
think that's actually good, nobody should rely on hashagg etc output
being stable, but it might be a bit much in a stable release?

In my tests this solves the worst performance issues in Todd's case,
Tomas's, Thomas's and still does ok performancwith with a TPC-H Q18
(which showcases the underestimated worker hashtable into leader
hashtable issue).

Greetings,

Andres Freund

Attachment Content-Type Size
0001-Prevent-growth-of-simplehash-tables-when-they-re-too.patch text/x-diff 2.6 KB
0002-Improve-bit-perturbation-in-TupleHashTableHash.patch text/x-diff 4.4 KB
0003-Add-randomness-to-execGrouping.c-hashtable-IV.patch text/x-diff 7.1 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tomas Vondra 2018-01-29 13:41:49 Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop
Previous Message Tom Lane 2018-01-28 21:05:40 Re: BUG #15025: PSQL CLI - inconsistency when both -d and -U supplies a username