From: | Matt Clarkson <mattc(at)catalyst(dot)net(dot)nz> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: 60 core performance with 9.3 |
Date: | 2014-07-30 23:36:03 |
Message-ID: | 1406763363.7654.73.camel@oldgreg.wgtn.cat-it.co.nz |
Lists: | pgsql-performance |
I've been assisting Mark with the benchmarking of these new servers.
The drop-off in both throughput and CPU utilisation that we've been
observing as the client count increases has led me to investigate which
lwlocks are dominant at different client counts.
I've recompiled postgres with Andres' LWLock improvements, Kevin's
libnuma patch and with LWLOCK_STATS enabled.
The LWLOCK_STATS below suggest that ProcArrayLock might be the main
source of lock contention that's causing throughput to take a dive as
the client count increases beyond the core count.
wal_buffers = 256MB
checkpoint_segments = 1920
wal_sync_method = open_datasync
pgbench -s 2000 -T 600
Results:
clients | tps
---------+---------
6 | 9490
12 | 17558
24 | 25681
48 | 41175
96 | 48954
192 | 31887
384 | 15564
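To make the drop-off easier to see, here is a small sketch that derives per-client throughput and the fraction of peak tps from the figures above. The variable and function names are mine and purely illustrative; only the numbers come from the table.

```python
# Throughput figures copied from the results table above: clients -> tps.
results = {6: 9490, 12: 17558, 24: 25681, 48: 41175,
           96: 48954, 192: 31887, 384: 15564}

# Client count at which throughput peaks.
peak_clients = max(results, key=results.get)
peak_tps = results[peak_clients]


def tps_per_client(clients):
    """Average transactions per second delivered to each client."""
    return results[clients] / clients


def fraction_of_peak(clients):
    """Throughput at this client count relative to the best observed."""
    return results[clients] / peak_tps


print("clients |    tps | tps/client | of peak")
for clients in sorted(results):
    print(f"{clients:>7} | {results[clients]:>6} | "
          f"{tps_per_client(clients):>10.1f} | {fraction_of_peak(clients):>7.2f}")
```

Per-client throughput falls at every step, but total tps only starts to shrink once the client count goes past 96.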
LWLOCK_STATS at 48 clients
Lock | Blk | SpinDelay | Blk % | SpinDelay %
--------------------+----------+-----------+-------+-------------
BufFreelistLock | 31144 | 11 | 1.64 | 1.62
ShmemIndexLock | 192 | 1 | 0.01 | 0.15
OidGenLock | 32648 | 14 | 1.72 | 2.06
XidGenLock | 35731 | 18 | 1.88 | 2.64
ProcArrayLock | 291121 | 215 | 15.36 | 31.57
SInvalReadLock | 32136 | 13 | 1.70 | 1.91
SInvalWriteLock | 32141 | 12 | 1.70 | 1.76
WALBufMappingLock | 31662 | 15 | 1.67 | 2.20
WALWriteLock | 825380 | 45 | 43.54 | 6.61
CLogControlLock | 583458 | 337 | 30.78 | 49.49
LWLOCK_STATS at 96 clients
Lock | Blk | SpinDelay | Blk % | SpinDelay %
--------------------+----------+-----------+-------+-------------
BufFreelistLock | 62954 | 12 | 1.54 | 0.27
ShmemIndexLock | 62635 | 4 | 1.54 | 0.09
OidGenLock | 92232 | 22 | 2.26 | 0.50
XidGenLock | 98326 | 18 | 2.41 | 0.41
ProcArrayLock | 928871 | 3188 | 22.78 | 72.57
SInvalReadLock | 58392 | 13 | 1.43 | 0.30
SInvalWriteLock | 57429 | 14 | 1.41 | 0.32
WALBufMappingLock | 138375 | 14 | 3.39 | 0.32
WALWriteLock | 1480707 | 42 | 36.31 | 0.96
CLogControlLock | 1098239 | 1066 | 26.93 | 24.27
LWLOCK_STATS at 384 clients
Lock | Blk | SpinDelay | Blk % | SpinDelay %
--------------------+----------+-----------+-------+-------------
BufFreelistLock | 184298 | 158 | 1.93 | 0.03
ShmemIndexLock | 183573 | 164 | 1.92 | 0.03
OidGenLock | 184558 | 173 | 1.93 | 0.03
XidGenLock | 200239 | 213 | 2.09 | 0.04
ProcArrayLock | 4035527 | 579666 | 42.22 | 98.62
SInvalReadLock | 182204 | 152 | 1.91 | 0.03
SInvalWriteLock | 182898 | 137 | 1.91 | 0.02
WALBufMappingLock | 219936 | 215 | 2.30 | 0.04
WALWriteLock | 3172725 | 457 | 33.19 | 0.08
CLogControlLock | 1012458 | 6423 | 10.59 | 1.09
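For anyone wanting to recompute the percentage columns from the raw counts, a minimal sketch follows. It assumes that "Blk %" and "SpinDelay %" are each lock's share of the respective column total, which matches the ProcArrayLock figures; the dictionary and the helper `shares` are illustrative names, not part of LWLOCK_STATS.

```python
# Raw counts from the 384-client table above: lock -> (Blk, SpinDelay).
counts_384 = {
    "BufFreelistLock":   (184298, 158),
    "ShmemIndexLock":    (183573, 164),
    "OidGenLock":        (184558, 173),
    "XidGenLock":        (200239, 213),
    "ProcArrayLock":     (4035527, 579666),
    "SInvalReadLock":    (182204, 152),
    "SInvalWriteLock":   (182898, 137),
    "WALBufMappingLock": (219936, 215),
    "WALWriteLock":      (3172725, 457),
    "CLogControlLock":   (1012458, 6423),
}


def shares(counts):
    """Return lock -> (Blk %, SpinDelay %) as shares of each column total."""
    total_blk = sum(blk for blk, _ in counts.values())
    total_spin = sum(spin for _, spin in counts.values())
    return {
        name: (100.0 * blk / total_blk, 100.0 * spin / total_spin)
        for name, (blk, spin) in counts.items()
    }


# Print locks ordered by their share of spin delays, largest first.
pct = shares(counts_384)
for name, (blk_pct, spin_pct) in sorted(pct.items(), key=lambda kv: -kv[1][1]):
    print(f"{name:<18} | {blk_pct:6.2f} | {spin_pct:6.2f}")
```

The same helper can be fed the 48- and 96-client counts to compare how each lock's share shifts as the client count grows.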
The same test run with a read-only workload shows virtually no SpinDelay
at all.
Any thoughts or comments on these results are welcome!
Regards,
Matt.