From: | "Deng, Gang" <gang(dot)deng(at)intel(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | RE: [PATCH] Resolve Parallel Hash Join Performance Issue |
Date: | 2020-01-10 01:18:39 |
Message-ID: | 0F44E799048C4849BAE4B91012DB910462E991B3@SHSMSX103.ccr.corp.intel.com |
Lists: | pgsql-hackers |
Regarding why setting the bit is no longer cheap in a parallel join: as I explained in my original mail, the cause is false sharing under the cache coherence protocol. In short, setting the bit dirties the whole cache line (64 bytes), so every other CPU core that holds a copy of that cache line has to load it again, which wastes a lot of CPU time. The article https://software.intel.com/en-us/articles/avoiding-and-identifying-false-sharing-among-threads explains this in more detail.
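To make the cost concrete, here is a minimal sketch in C of the difference between an unconditional store and a test-before-set on a flag bit. The names (DemoTupleHeader, DEMO_MATCH_BIT, demo_set_match_always, demo_set_match_if_needed) are hypothetical and are not PostgreSQL's actual definitions; the sketch only illustrates that reading the bit first avoids writing to, and therefore dirtying, a cache line whose bit is already set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit and header layout, for illustration only. */
#define DEMO_MATCH_BIT ((uint16_t) 0x8000)

typedef struct DemoTupleHeader
{
    uint16_t flags;
} DemoTupleHeader;

/*
 * Unconditional store: every call writes to the header, so the cache line
 * holding it is marked dirty even when the bit was already set.
 */
static inline void
demo_set_match_always(DemoTupleHeader *h)
{
    assert(h != NULL);
    h->flags = (uint16_t) (h->flags | DEMO_MATCH_BIT);
}

/*
 * Test-before-set: only write when the bit is not yet set.  A read alone
 * does not invalidate copies of the line held by other cores.
 * Returns true if a store was actually performed.
 */
static inline bool
demo_set_match_if_needed(DemoTupleHeader *h)
{
    assert(h != NULL);
    if (h->flags & DEMO_MATCH_BIT)
        return false;           /* already set: no write, line stays clean */
    h->flags = (uint16_t) (h->flags | DEMO_MATCH_BIT);
    return true;
}
```

The trade-off is the extra branch in the second variant, which only pays off when the header lives in memory that several cores read concurrently.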
-----Original Message-----
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Sent: Thursday, January 9, 2020 10:43 PM
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Deng, Gang <gang(dot)deng(at)intel(dot)com>; pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Resolve Parallel Hash Join Performance Issue
Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> Right, I see. The funny thing is that the match bit is not even used
> in this query (it's used for right and full hash join, and those
> aren't supported for parallel joins yet). Hmm. So, instead of the
> test you proposed, an alternative would be to use if (!parallel).
> That's a value that will be constant-folded, so that there will be no
> branch in the generated code (see the pg_attribute_always_inline
> trick). If, in a future release, we need the match bit for parallel
> hash join because we add parallel right/full hash join support, we
> could do it the way you showed, but only if it's one of those join
> types, using another constant parameter.
Can we base the test off the match type today, and avoid leaving something that will need to be fixed later?
I'm pretty sure that the existing coding is my fault, and that it's like that because I reasoned that setting the bit was too cheap to justify having a test-and-branch around it. Apparently that's not true anymore in a parallel join, but I have to say that it's unclear why. In any case, the reasoning probably still holds good in non-parallel cases, so it'd be a shame to introduce a run-time test if we can avoid it.
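As a rough sketch of the constant-parameter idea discussed above, the following C fragment shows a forced-inline implementation function whose boolean argument is a compile-time constant in each wrapper, so the compiler can drop the branch entirely. All names here (demo_always_inline, DemoTuple, demo_join_impl, demo_join_inner, demo_join_right) are hypothetical stand-ins, not the actual PostgreSQL functions or macros, and the flag passed in is assumed to be derived from the join type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a forced-inline attribute macro. */
#if defined(__GNUC__) || defined(__clang__)
#define demo_always_inline __attribute__((always_inline)) inline
#else
#define demo_always_inline inline
#endif

#define DEMO_MATCH 1u

typedef struct DemoTuple
{
    unsigned flags;
} DemoTuple;

/*
 * Shared implementation.  Because it is always inlined and each wrapper
 * passes a literal constant, the test on need_match_bit can be folded away
 * at compile time, leaving no run-time branch in either wrapper.
 */
static demo_always_inline int
demo_join_impl(DemoTuple *t, bool need_match_bit)
{
    assert(t != NULL);
    if (need_match_bit)
        t->flags |= DEMO_MATCH;
    return 1;                   /* pretend one joined row was produced */
}

/* Join type that never consults the match bit: the store is compiled out. */
int
demo_join_inner(DemoTuple *t)
{
    return demo_join_impl(t, false);
}

/* Join type that needs the match bit: the store is unconditional. */
int
demo_join_right(DemoTuple *t)
{
    return demo_join_impl(t, true);
}
```

Whether a given compiler actually removes the branch depends on inlining taking place, which is why the sketch forces it where the attribute is available.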
regards, tom lane