From: | Gregory Maxwell <gmaxwell(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, Marko Kreen <marko(at)l-t(dot)ee>, pgsql-hackers(at)postgresql(dot)org, Michael Paesold <mpaesold(at)gmx(dot)at> |
Subject: | Re: Spinlocks, yet again: analysis and proposed patches |
Date: | 2005-09-16 03:48:09 |
Message-ID: | e692861c05091520486eb9307c@mail.gmail.com |
Lists: | pgsql-hackers |
On 9/15/05, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Yesterday's CVS tip:
> 1 32s 2 46s 4 88s 8 168s
> plus no-cmpb and spindelay2:
> 1 32s 2 48s 4 100s 8 177s
> plus just-committed code to pad LWLock to 32:
> 1 33s 2 50s 4 98s 8 179s
> alter to pad to 64:
> 1 33s 2 38s 4 108s 8 180s
>
> I don't know what to make of the 2-process time going down while
> 4-process goes up; that seems just weird. But both numbers are
> repeatable.
It is odd.
In the two process case there is, assuming random behavior, a 1/2
chance that you've already got the right line, but in the 4 process
case only a 1/4 chance (since we're on a 4 way box). This would
explain why we don't see as much cost in the intentionally misaligned
case. You'd expect a similar pattern of improvement with the
64-byte alignment (some in the two-process case, but more in the
four-process case), but here we see more improvement in the
two-process case.
If I had to guess, I might say that the 64-byte alignment is removing
much of the unneeded line bouncing in the two-process case but is
at the same time creating more risk of bouncing caused by aliasing.
Since two processes have a 1/2 chance, the aliasing isn't a problem, so
the change is a win. In the four-process case it's no longer a win,
because with aliasing there is still a lot of fighting over the cache
lines even if you pack well, and the decrease in packing makes odd
aliasing somewhat more likely. This might also explain why the
misaligned case performed so poorly in the four-process case: the
misalignment didn't just increase the cost 2x, it also increased the
likelihood of a bogus bounce due to aliasing.
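
To make the padding and misalignment point concrete, here is a minimal
C sketch. DemoLock, PaddedLock, LINE_SIZE and lines_touched are
hypothetical names for illustration, not PostgreSQL's actual
definitions, and the 64-byte line size is an assumption about the
hardware. It only shows that a padded slot covers one line when it
starts on a line boundary and two lines when it does not.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size in bytes; check the real CPU before relying on it. */
#define LINE_SIZE 64

/* Hypothetical stand-in for a small lock structure. */
typedef struct DemoLock
{
    volatile int mutex;      /* spinlock word protecting the fields below */
    int          shared;     /* number of shared holders */
    char         exclusive;  /* nonzero if held exclusively */
} DemoLock;

/* Pad each lock out to one full line so array elements never share a line. */
typedef union PaddedLock
{
    DemoLock lock;
    char     pad[LINE_SIZE];
} PaddedLock;

_Static_assert(sizeof(DemoLock) <= LINE_SIZE, "lock must fit in one line");

/* Number of cache lines covered by an object of `size` bytes at `addr`. */
static size_t
lines_touched(uintptr_t addr, size_t size)
{
    return (size_t) ((addr + size - 1) / LINE_SIZE - addr / LINE_SIZE + 1);
}
```

With these assumptions, lines_touched(0, sizeof(PaddedLock)) is 1 while
lines_touched(32, sizeof(PaddedLock)) is 2, which is the "2x" cost of
the misaligned layout mentioned above.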
If this is the case, then it may be possible through very careful
memory alignment to make sure that no two high contention locks that
are likely to be contended at once share the same line (through either
aliasing or through being directly within the same line).
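
A rough sketch of what such a check could look like, in C. The cache
model here is an assumption: a simple cache where the set is chosen by
(address / line size) modulo the number of sets. CacheGeom, line_index,
set_index, shares_line and aliases are hypothetical helpers, and real
hardware may index its caches differently (physical addresses, hashing,
several levels).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cache geometry; real values depend on the CPU. */
typedef struct CacheGeom
{
    uintptr_t line_size;  /* bytes per cache line, e.g. 64 */
    uintptr_t num_sets;   /* number of sets the line index maps onto */
} CacheGeom;

/* Index of the cache line that contains addr. */
static uintptr_t
line_index(CacheGeom g, uintptr_t addr)
{
    assert(g.line_size > 0);
    return addr / g.line_size;
}

/* Set that the line maps to under the assumed modulo indexing. */
static uintptr_t
set_index(CacheGeom g, uintptr_t addr)
{
    assert(g.num_sets > 0);
    return line_index(g, addr) % g.num_sets;
}

/* True if both addresses sit directly within the same line. */
static bool
shares_line(CacheGeom g, uintptr_t a, uintptr_t b)
{
    return line_index(g, a) == line_index(g, b);
}

/* True if the addresses are in different lines that map to the same set. */
static bool
aliases(CacheGeom g, uintptr_t a, uintptr_t b)
{
    return !shares_line(g, a, b) && set_index(g, a) == set_index(g, b);
}
```

Under this model, two locks placed exactly line_size * num_sets bytes
apart alias even though each has a line to itself, so a layout tool
would want to reject both shares_line() and aliases() for any pair of
locks expected to be hot at the same time.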
Then again, I could be completely wrong: my understanding of
multiprocessor cache coherency is very limited, and I have no clue how
cache aliasing fits into it. So the above is just uninformed
conjecture.