From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: postpone building buckets to the end of Hash (in HashJoin)
Date: 2016-01-19 21:49:07
Message-ID: 569EAF53.30202@2ndquadrant.com
Lists: pgsql-hackers

On 01/19/2016 08:34 PM, Robert Haas wrote:
> On Mon, Jan 18, 2016 at 10:57 PM, Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>>>> If this doesn't regress performance in the case where the number of
>>>> buckets is estimated accurately to begin with, then I think this is
>>>> a great idea. Can you supply some performance tests results for that
>>>> case, and maybe some of the other cases also?
>>>
>>> I don't see how it could regress performance, and the benchmarks I've
>>> done confirm that. I'll do more thorough benchmarking and post the
>>> results here, but not now as this patch is in 2016-01 CF and I want to
>>> put all my time into reviewing patches from the open commitfest.
>>
>> I've finally got to do more thorough benchmarks, and surprisingly there
>> seems to be a minor regression.
> ...
>>
>> So even if we entirely skipped the bucket build, we would not recover the
>> difference. Not sure what's going on here.
>>
>> I've also done some profiling using perf, but I haven't really found
>> anything suspicious there.
>>
>> Any ideas what might be the cause of this?
>
> My guess is that memory locality is not as good with this patch. Consider this:
>
> copyTuple->next = hashtable->buckets[bucketno];
>
> We've only just populated copytuple via memcpy, so that cache line is
> presumably in whatever place cache lines go when they are really hot.
> We basically make one pass over the allocated space for the hash
> table, filling in the hash tuples and linking things into bucket
> chains. OTOH, with the patch, we don't do that on the first pass,
> just writing the tuples. Then when we come back through we still have
> to do that part:
>
> hashTuple->next = hashtable->buckets[bucketno];
>
> By the time we're doing this, especially at larger work_mem settings,
> we've probably pushed those cache lines out of the faster levels of
> the CPU cache, and we have to pull them back in again, and that takes
> time. If the tuples are small, we'll dirty every cache line twice;
> if they're larger, we might not dirty every cache line on the second
> pass, but just some fraction of them.
>
> I could be totally off base, but this is why I was concerned about the patch.
I can totally see why this would slow down the BuildBuckets function
itself, but I don't quite see why it should make the rest of the code
significantly slower. Yet BuildBuckets takes just ~25ms while the total
duration increases by ~200ms (for the 1x10 dataset).

I do understand that calling BuildBuckets may affect the code executed
after it, as it evicts other data from the cache, but I would not expect
that to account for the remaining ~175ms.

Yet another thing is that BuildBuckets accesses the tuples in a mostly
sequential way, so I'd expect the hardware prefetcher to take care of
that.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services