Re: Explanation for bug #13908: hash joins are badly broken

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Explanation for bug #13908: hash joins are badly broken
Date: 2016-02-06 21:17:52
Message-ID: 56B66300.7000405@2ndquadrant.com
Lists: pgsql-hackers

On 02/06/2016 09:55 PM, Tom Lane wrote:
> Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> writes:
>> On 02/06/2016 06:47 PM, Tom Lane wrote:
>>> I note also that while the idea of ExecHashRemoveNextSkewBucket is
>>> to reduce memory consumed by the skew table to make it available to
>>> the main hash table, in point of fact it's unlikely that the freed
>>> space will be of any use at all, since it will be in tuple-sized
>>> chunks far too small for dense_alloc's requests. So the spaceUsed
>>> bookkeeping being done there is an academic exercise unconnected to
>>> reality, and we need to rethink what the space management plan is.
>
>> I don't follow. Why would these three things (sizes of allocations in
>> skew buckets, chunks in dense allocator and accounting) be related?
>
> Well, what we're trying to do is ensure that the total amount of
> space used by the hashjoin table doesn't exceed spaceAllowed. My
> point is that it's kind of cheating to ignore space
> used-and-then-freed if your usage pattern is such that that space
> isn't likely to be reusable. A freed skew tuple represents space that
> would be reusable for another skew tuple, but is probably *not*
> reusable for the main hash table; so treating that space as
> interchangeable is wrong I think.

Ah, I see. And I agree that treating those areas as equal is wrong.

> I'm not entirely sure where to go with that thought, but maybe the
> answer is that we should just treat the skew table and main table
> storage pools as entirely separate with independent limits. That's
> not what's happening right now, though.

What about using dense allocation for the skew buckets too, but with a
separate context per bucket instead of one shared by all of them? Then
when we delete a bucket, we simply destroy its context and free the
chunks, just as we do with the current dense allocator.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
