Re: PATCH: postpone building buckets to the end of Hash (in HashJoin)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: postpone building buckets to the end of Hash (in HashJoin)
Date: 2016-01-19 19:34:50
Message-ID: CA+TgmoYr90JXogitzKVR6=LXmivm1OH=AevrF4pgskL53xRnGQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 18, 2016 at 10:57 PM, Tomas Vondra
<tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>>> If this doesn't regress performance in the case where the number of
>>> buckets is estimated accurately to begin with, then I think this is
>>> a great idea. Can you supply some performance tests results for that
>>> case, and maybe some of the other cases also?
>>
>> I don't see how it could regress performance, and the benchmarks I've
>> done confirm that. I'll do more thorough benchmarking and post the
>> results here, but not now as this patch is in 2016-01 CF and I want to
>> put all my time into reviewing patches from the open commitfest.
>
> I've finally got to do more thorough benchmarks, and surprisingly there
> seems to be a minor regression.
...
>
> So even if we entirely skipped the bucket build, we would not recover the
> difference. Not sure what's going on here.
>
> I've also done some profiling using perf, but I haven't really found
> anything suspicious there.
>
> Any ideas what might be the cause of this?

My guess is that memory locality is not as good with this patch. Consider this:

copyTuple->next = hashtable->buckets[bucketno];

We've only just populated copytuple via memcpy, so that cache line is
presumably in whatever place cache lines go when they are really hot.
We basically make one pass over the allocated space for the hash
table, filling in the hash tuples and linking things into bucket
chains. OTOH, with the patch, we don't do that on the first pass,
just writing the tuples. Then when we come back through we still have
to do that part:

hashTuple->next = hashtable->buckets[bucketno];

By the time we're doing this, especially at larger work_mem settings,
we've probably pushed those cache lines out of the faster levels of
the CPU cache, and we have to pull them back in again, and that takes
time. If the tuples are small, we'll dirty every cache line twice;
if they're larger, we might not dirty every cache line on the second
pass, but just some fraction of them.

I could be totally off base, but this is why I was concerned about the patch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Corey Huinker 2016-01-19 19:52:04 Re: Advices on custom data type and extension development
Previous Message Andrew Dunstan 2016-01-19 19:15:01 Re: make error - libpqdll.def No such file or directory