From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Scaling shared buffer eviction |
Date: | 2014-10-09 12:47:09 |
Message-ID: | CAA4eK1Je9ZBLHsfiavHD18GDwXUx21zFqPJgq_Dz_ZoA35nLpQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Fri, Sep 26, 2014 at 7:04 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On another point, I think it would be a good idea to rebase the
> bgreclaimer patch over what I committed, so that we have a
> clean patch against master to test with.
Please find the rebased patch attached to this mail. I have also
collected some performance data and done some analysis based on
it.
Performance Data
----------------------------
IBM POWER-8 24 cores, 192 hardware threads
RAM = 492GB
max_connections =300
Database Locale =C
checkpoint_segments=256
checkpoint_timeout =15min
shared_buffers=8GB
scale factor = 5000
Client Count = number of concurrent sessions and threads (e.g. -c 8 -j 8)
Duration of each individual run = 5 mins
The data below is the median of 3 runs.
patch_ver/client_count       1       8      32      64     128     256
HEAD                     18884  118628  251093  216294  186625  177505
PATCH                    18743  122578  247243  205521  179712  175031
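To make the size of the dip easier to read, the relative change of PATCH versus HEAD can be computed per client count. The snippet below is only a convenience sketch; the numbers are copied from the table above, and the names `pct_change` and `deltas` are illustrative, not part of the patch or of any benchmark tooling.

```python
# Median throughput figures copied from the table above, by client count.
clients = [1, 8, 32, 64, 128, 256]
head = [18884, 118628, 251093, 216294, 186625, 177505]
patch = [18743, 122578, 247243, 205521, 179712, 175031]


def pct_change(base, new):
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


# Negative values mean PATCH is slower than HEAD at that client count.
deltas = {c: pct_change(h, p) for c, h, p in zip(clients, head, patch)}

for c in clients:
    print(f"{c:>4} clients: {deltas[c]:+.2f}%")
```

On these figures, 8 clients is the only point where PATCH is ahead of HEAD, and the largest regression is at 64 clients, at roughly 5%.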
Here we can see that with the patch the performance dips at higher
client counts (>=32), which was quite surprising to me, as I was
expecting it to improve, because bgreclaimer reduces contention by
making buffers available on the free list. So I tried to analyze the
situation using perf and found that in the above configuration there is
contention around the freelist spinlock with HEAD, and the patch removes
it, but the performance still goes down with the patch. On further
analysis, I observed that after the patch there is actually an increase
in contention around ProcArrayLock (a shared LWLock) via GetSnapshotData,
which sounds a bit odd, but that's what I can see in the profiles. Based
on this analysis, a few ideas which I would like to investigate further are:
a. As there is an increase in spinlock contention, I would like to check
with Andres's latest patch, which reduces contention around shared
lwlocks.
b. Reduce some of the instructions added by the patch in
StrategyGetBuffer(); for example, instead of waking bgreclaimer at the
low threshold, wake it when the backend tries to do the clock sweep.
Thoughts?
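To illustrate the trade-off behind idea (b), here is a toy model of the two wake-up policies. It is only a sketch under strong assumptions: the freelist is a plain counter, the reclaimer refills it instantly when woken, and all names (`simulate`, `Counters`, `capacity`, `low_threshold`) are hypothetical and do not correspond to code in the patch.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    threshold_checks: int = 0  # extra comparisons on the allocation path
    wakeups: int = 0           # times the reclaimer was signalled
    clock_sweeps: int = 0      # allocations that found the freelist empty


def simulate(policy, allocations, capacity=8, low_threshold=4):
    """Count per-allocation work for one wake-up policy (toy model)."""
    if policy not in ("low_threshold", "clock_sweep"):
        raise ValueError(f"unknown policy: {policy}")
    free = capacity
    c = Counters()
    for _ in range(allocations):
        if free > 0:
            free -= 1
            if policy == "low_threshold":
                # Every allocation pays for the threshold comparison.
                c.threshold_checks += 1
                if free < low_threshold:
                    c.wakeups += 1
                    free = capacity  # toy assumption: instant refill
        else:
            # Freelist empty: the backend runs the clock sweep itself.
            c.clock_sweeps += 1
            if policy == "clock_sweep":
                c.wakeups += 1
                free = capacity  # toy assumption: instant refill
    return c


print(simulate("low_threshold", 100))
print(simulate("clock_sweep", 100))
```

In this model the clock-sweep policy drops the per-allocation comparison entirely, at the cost of some allocations finding the freelist empty and sweeping on their own. Whether that is a net win on real hardware is exactly what would need to be measured.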
Below is the profile data for 64 and 128 client count:
Head - 64 client count
-------------------------------------
+ 8.93% postgres postgres [.] s_lock
7.83% swapper [unknown] [H] 0x00000000011bc5ac
+ 3.09% postgres postgres [.] GetSnapshotData
+ 3.06% postgres postgres [.] tas
+ 2.49% postgres postgres [.] AllocSetAlloc
+ 2.43% postgres postgres [.] hash_search_with_hash_value
+ 2.13% postgres postgres [.] _bt_compare
Detailed Data
------------------------
- 8.93% postgres postgres [.] s_lock
- s_lock
- 4.97% s_lock
- 1.63% StrategyGetBuffer
BufferAlloc
ReadBuffer_common
ReadBufferExtended
- ReadBuffer
- 1.63% ReleaseAndReadBuffer
- 0.93% index_fetch_heap
- index_getnext
- 0.93% IndexNext
ExecScanFetch
- 0.69% _bt_relandgetbuf
_bt_search
_bt_first
btgettuple
- index_getnext
- 0.69% IndexNext
ExecScanFetch
0
- 1.39% LWLockAcquireCommon
- LWLockAcquire
- 1.38% GetSnapshotData
- GetTransactionSnapshot
- 0.70% exec_bind_message
0
- 0.68% PortalStart
exec_bind_message
- 1.37% LWLockRelease
- 1.37% GetSnapshotData
- GetTransactionSnapshot
- 0.69% exec_bind_message
0
- 0.68% PortalStart
exec_bind_message
PostgresMain
0
- 1.07% StrategyGetBuffer
- 1.06% StrategyGetBuffer
BufferAlloc
ReadBuffer_common
ReadBufferExtended
- ReadBuffer
- 1.06% ReleaseAndReadBuffer
- 0.62% index_fetch_heap
index_getnext
- 0.95% LWLockAcquireCommon
- 0.95% LWLockAcquireCommon
- LWLockAcquire
- 0.90% GetSnapshotData
GetTransactionSnapshot
- 0.94% LWLockRelease
- 0.94% LWLockRelease
- 0.90% GetSnapshotData
GetTransactionSnapshot
7.83% swapper [unknown] [H] 0x00000000011bc5ac
- 3.09% postgres postgres [.] GetSnapshotData
- GetSnapshotData
- 3.06% GetSnapshotData
- 3.06% GetTransactionSnapshot
- 1.54% PortalStart
exec_bind_message
move_buffers_to_freelist_by_bgreclaimer_v1 - 64 Client count
----------------------------------------------------------------------------------------------
+ 11.52% postgres postgres [.] s_lock
7.57% swapper [unknown] [H] 0x00000000011d9034
+ 3.54% postgres postgres [.] tas
+ 3.02% postgres postgres [.] GetSnapshotData
+ 2.47% postgres postgres [.] hash_search_with_hash_value
+ 2.33% postgres postgres [.] AllocSetAlloc
+ 2.03% postgres postgres [.] _bt_compare
+ 1.89% postgres postgres [.] calc_bucket
Detailed Data
---------------------
- 11.52% postgres postgres [.] s_lock
- s_lock
- 6.57% s_lock
- 2.72% LWLockAcquireCommon
- LWLockAcquire
- 2.71% GetSnapshotData
- GetTransactionSnapshot
- 1.38% exec_bind_message
0
- 1.33% PortalStart
exec_bind_message
0
- 2.69% LWLockRelease
- 2.69% GetSnapshotData
- GetTransactionSnapshot
- 1.35% exec_bind_message
PostgresMain
0
- 1.34% PortalStart
exec_bind_message
0
- 1.65% LWLockAcquireCommon
- 1.65% LWLockAcquireCommon
- LWLockAcquire
- 1.59% GetSnapshotData
- GetTransactionSnapshot
- 0.80% exec_bind_message
PostgresMain
0
- 0.79% PortalStart
exec_bind_message
0
- 0.79% PortalStart
exec_bind_message
0
- 1.62% LWLockRelease
- 1.62% LWLockRelease
- 1.58% GetSnapshotData
- GetTransactionSnapshot
- 0.79% exec_bind_message
PostgresMain
0
- 0.79% PortalStart
exec_bind_message
PostgresMain
0
- 0.63% hash_search_with_hash_value
- 0.63% hash_search_with_hash_value
BufTableDelete
BufferAlloc
- 0.59% get_hash_entry
- 0.59% get_hash_entry
hash_search_with_hash_value
BufTableInsert
BufferAlloc
Head - 128 Client count
---------------------------------------
+ 18.39% postgres postgres [.] s_lock
6.72% swapper [unknown] [H] 0x00000000011bc390
+ 3.37% postgres postgres [.] GetSnapshotData
+ 2.11% postgres postgres [.] tas
+ 2.05% postgres postgres [.] tas
+ 1.82% postgres postgres [.] hash_search_with_hash_value
+ 1.77% postgres postgres [.] AllocSetAlloc
1.52% postgres [unknown] [H] 0x00000000012fdc00
+ 1.45% postgres postgres [.] tas
+ 1.42% postgres postgres [.] _bt_compare
- 18.39% postgres postgres [.] s_lock
- s_lock
- 12.35% s_lock
- 7.52% StrategyGetBuffer
BufferAlloc
ReadBuffer_common
- 1.86% LWLockAcquireCommon
- LWLockAcquire
- 1.83% GetSnapshotData
- GetTransactionSnapshot
- 0.95% exec_bind_message
0
- 0.88% PortalStart
exec_bind_message
0
- 1.78% LWLockRelease
- 1.76% GetSnapshotData
- GetTransactionSnapshot
- 0.91% exec_bind_message
- 0.86% PortalStart
exec_bind_message
0
- 0.60% get_hash_entry
hash_search_with_hash_value
BufTableInsert
BufferAlloc
- 0.58% hash_search_with_hash_value
- 0.58% BufTableDelete
BufferAlloc
- 3.18% StrategyGetBuffer
- 3.18% StrategyGetBuffer
BufferAlloc
0
- 0.88% LWLockAcquireCommon
- 0.87% LWLockAcquireCommon
- LWLockAcquire
- 0.81% GetSnapshotData
GetTransactionSnapshot
- 0.84% LWLockRelease
- 0.83% LWLockRelease
- 0.79% GetSnapshotData
GetTransactionSnapshot
- 0.55% hash_search_with_hash_value
- 0.55% hash_search_with_hash_value
BufTableDelete
BufferAlloc
ReadBuffer_common
ReadBufferExtended
- ReadBuffer
0.55% ReleaseAndReadBuffer
- 0.54% get_hash_entry
- 0.54% get_hash_entry
hash_search_with_hash_value
BufTableInsert
BufferAlloc
ReadBuffer_common
ReadBufferExtended
- ReadBuffer
0.54% ReleaseAndReadBuffer
move_buffers_to_freelist_by_bgreclaimer_v1 - 128 Client count
----------------------------------------------------------------------------------------------
+ 13.64% postgres postgres [.] s_lock
8.19% swapper [unknown] [H] 0x0000000000000c04
+ 3.62% postgres postgres [.] GetSnapshotData
+ 2.40% postgres postgres [.] calc_bucket
+ 2.38% postgres postgres [.] tas
+ 2.38% postgres postgres [.] hash_search_with_hash_value
2.02% postgres [unknown] [H] 0x0000000000000f80
+ 1.73% postgres postgres [.] AllocSetAlloc
+ 1.68% postgres postgres [.] tas
Detailed Data
-----------------------
- 13.64% postgres postgres [.] s_lock
- s_lock
- 8.76% s_lock
- 3.03% LWLockAcquireCommon
- LWLockAcquire
- 2.97% GetSnapshotData
- GetTransactionSnapshot
- 1.55% exec_bind_message
0
- 1.42% PortalStart
exec_bind_message
- 2.87% LWLockRelease
- 2.82% GetSnapshotData
- GetTransactionSnapshot
- 1.46% exec_bind_message
0
- 1.36% PortalStart
exec_bind_message
0
- 1.35% get_hash_entry
hash_search_with_hash_value
BufTableInsert
BufferAlloc
0
- 1.29% hash_search_with_hash_value
- 1.29% BufTableDelete
BufferAlloc
0
- 1.19% LWLockAcquireCommon
- 1.19% LWLockAcquireCommon
- LWLockAcquire
- 1.11% GetSnapshotData
- GetTransactionSnapshot
- 0.56% exec_bind_message
PostgresMain
0
- 0.55% PortalStart
exec_bind_message
0
- 1.15% LWLockRelease
- 1.15% LWLockRelease
- 1.08% GetSnapshotData
- GetTransactionSnapshot
- 0.55% exec_bind_message
0
- 0.53% PortalStart
exec_bind_message
0
- 1.12% hash_search_with_hash_value
- 1.12% hash_search_with_hash_value
BufTableDelete
BufferAlloc
0
- 1.10% get_hash_entry
- 1.10% get_hash_entry
hash_search_with_hash_value
BufTableInsert
BufferAlloc
0
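The shift in where s_lock time comes from can be tallied from the call graphs above. The snippet below is a rough sketch: the percentages are copied from the GetSnapshotData and StrategyGetBuffer lines inside each s_lock expansion, and it assumes each figure is a share of total samples (the indentation of the perf output was lost, so the nesting is inferred). An empty list means no such entry appears in that profile, which may only mean it fell below the reporting threshold. The names `profiles` and `total` are illustrative.

```python
# s_lock samples (percent) reached via each caller, copied from the profiles.
profiles = {
    ("HEAD", 64): {
        "GetSnapshotData": [1.38, 1.37, 0.90, 0.90],
        "StrategyGetBuffer": [1.63, 1.06],
    },
    ("PATCH", 64): {
        "GetSnapshotData": [2.71, 2.69, 1.59, 1.58],
        "StrategyGetBuffer": [],
    },
    ("HEAD", 128): {
        "GetSnapshotData": [1.83, 1.76, 0.81, 0.79],
        "StrategyGetBuffer": [7.52, 3.18],
    },
    ("PATCH", 128): {
        "GetSnapshotData": [2.97, 2.82, 1.11, 1.08],
        "StrategyGetBuffer": [],
    },
}


def total(version, clients, caller):
    """Sum of the listed s_lock percentages for one caller in one profile."""
    return round(sum(profiles[(version, clients)][caller]), 2)


for (version, clients), callers in profiles.items():
    parts = ", ".join(f"{name}={total(version, clients, name):.2f}%"
                      for name in callers)
    print(f"{version:>5} @ {clients:>3} clients: {parts}")
```

Read this way, the StrategyGetBuffer share of s_lock disappears with the patch at both client counts, while the GetSnapshotData share grows, which matches the observation about ProcArrayLock above.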
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
move_buffers_to_freelist_by_bgreclaimer_v1.patch | application/octet-stream | 48.2 KB |