CentOS upgrade to 6.8 resulting in high load for PostgreSQL 9.4.6 server

From: "Hazuka (CTR), Brad" <brad(dot)hazuka(dot)ctr(at)usgs(dot)gov>
To: <pgsql-admin(at)postgresql(dot)org>
Subject: CentOS upgrade to 6.8 resulting in high load for PostgreSQL 9.4.6 server
Date: 2016-06-09 15:27:46
Message-ID: CA+eUnhUD2w9Z-544CPCJzijrKSHo658eG-UQNuAcGR+8WUXdPg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

I am hoping someone can help me out. I have a production server running
CentOS 6.8 and PostgreSQL community edition 9.4.6. The application hitting
the database is highly concurrent. When the database was running on the
latest 6.7 CentOS OS I had no problems, we were seeing great throughput and
my bottleneck was IO as I expected it to be. From the sar data my CPU
overhead was ~10% and my disk usage was 99%. Memory was not using swap.
After the SA upgraded the machine to 6.8 the profile changed. Now I am
seeing a high CPU bound load on the system caused by postgres. disk usage
is at ~50% and our application throughput has dropped dynamically. I was
able to pull some vmstat and perf metrics which can be seen below. Any
insight someone can offer would be great to return the system performance.
Unfortunately rolling back to 6.7 is not an option because of security
policy.

OS Info:

CentOS 6.8 Linux 2.6.32-642.1.1.el6.x86_64 #1 SMP Tue May 31 21:57:07 UTC
2016 x86_64 x86_64 x86_64 GNU/Linux

DB Info:

PostgreSQL 9.4.6

HW Info:

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 44
Model name: Intel(R) Xeon(R) CPU X5687 @ 3.60GHz
Stepping: 2
CPU MHz: 3600.000
BogoMIPS: 7199.83
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 12288K
NUMA node0 CPU(s): 0,2,4,6,8,10,12,14
NUMA node1 CPU(s): 1,3,5,7,9,11,13,15

total used free shared buffers cached
Mem: 99052244 98472232 580012 8620496 47452 95264812
-/+ buffers/cache: 3159968 95892276
Swap: 4194300 11060 4183240

Top information
=================
top - 10:22:07 up 1 day, 1:13, 3 users, load average: 15.64, 17.16, 17.39
Tasks: 546 total, 18 running, 528 sleeping, 0 stopped, 0 zombie
Cpu(s): 91.7%us, 3.8%sy, 0.0%ni, 3.8%id, 0.6%wa, 0.0%hi, 0.1%si,
0.0%st
Mem: 99052244k total, 98744272k used, 307972k free, 47208k buffers
Swap: 4194300k total, 11064k used, 4183236k free, 95472720k cached

PID PPID TIME+ %CPU %MEM PR NI S VIRT SWAP RES UID COMMAND

21968 8814 995:06.03 86.6 1.1 20 0 R 8640m 272 1.1g 600 postgres:
SELECT

21972 8814 994:41.35 85.9 1.1 20 0 R 8640m 272 1.1g 600 postgres:
SELECT

20923 8814 994:53.56 85.0 1.8 20 0 R 8641m 296 1.7g 600 postgres:
SELECT

7607 8814 16:39.25 80.1 8.0 20 0 R 8631m 268 7.6g 600 postgres:
SELECT

21058 8814 956:59.78 79.1 3.1 20 0 R 8641m 296 2.9g 600 postgres:
SELECT

7667 8814 17:22.53 77.1 0.9 20 0 R 8621m 288 855m 600 postgres:
SELECT

20899 8814 995:15.39 76.5 4.4 20 0 R 8641m 292 4.2g 600 postgres:
SELECT

7381 8814 28:59.06 75.8 0.9 20 0 R 8621m 288 865m 600 postgres:
SELECT

21966 8814 995:13.87 74.5 4.4 20 0 R 8641m 272 4.2g 600 postgres:
SELECT

7848 8814 10:22.82 73.2 0.9 20 0 R 8621m 288 840m 600 postgres:
SELECT

7806 8814 13:50.43 71.9 0.9 20 0 S 8621m 288 849m 600 postgres:
idle

7645 8814 18:08.60 70.9 0.9 20 0 R 8621m 288 852m 600 postgres:
SELECT

7619 8814 19:34.71 70.3 0.9 20 0 R 8621m 288 856m 600 postgres:
SELECT

7665 8814 17:40.34 69.6 0.9 20 0 S 8621m 288 855m 600 postgres:
COMMIT

7843 8814 13:35.96 68.6 0.9 20 0 R 8621m 288 849m 600 postgres:
SELECT

7264 8814 31:09.93 67.3 1.5 20 0 R 8633m 268 1.5g 600 postgres:
SELECT

7675 8814 15:00.69 66.7 0.9 20 0 R 8621m 288 849m 600 postgres:
SELECT

7574 8814 23:30.92 64.4 0.9 20 0 D 8621m 288 857m 600 postgres:
SELECT

6977 8814 34:06.98 64.1 2.8 20 0 R 9106m 268 2.7g 600 postgres:
SELECT

21973 8814 5:39.97 1.3 1.1 20 0 S 8640m 272 1.1g 600 postgres:
idle

9212 1 13:13.12 1.0 0.0 20 0 S 44120 540 1140 600
/opt/pgbouncer/bin/pgbouncer -d /etc/pgbouncer/pgbouncer.ini

VMStat Information
=================
procs -----------memory---------- ---swap-- -----io---- --system--
-----cpu-----
r b swpd free buff cache si so bi bo in cs us sy id
wa st
20 0 11072 749604 47000 95163200 0 0 75 230 18 26 82 0
15 2 0
15 0 11072 723244 47016 95176920 0 0 38 887 86610 33495 84 6
10 0 0
21 0 11072 708256 47024 95190704 0 0 28 3856 86047 23593 85 5
9 1 0
20 0 11072 702536 47040 95196160 0 0 1 3890 82383 10971 89 4
6 1 0
19 1 11072 696464 47048 95201504 0 0 17 522 84030 11061 90 4
5 0 0
21 0 11072 688740 47064 95206920 0 0 31 2238 84497 27806 87 5
8 0 0

perf top info
=================
Overhead Shared Object Symbol
12.83% postgres [.] heap_hot_search_buffer
6.82% postgres [.] ExecScanHashBucket
6.67% postgres [.] slot_deform_tuple
4.74% postgres [.] LWLockAcquire
3.81% postgres [.] pg_next_dst_boundary
3.43% postgres [.] texteq
2.97% postgres [.] _bt_checkkeys
2.79% postgres [.] j2date
2.76% postgres [.] LWLockRelease
2.69% postgres [.] slot_getattr
2.66% postgres [.] ExecMakeFunctionResultNoSets
2.48% postgres [.] DetermineTimeZoneOffsetInternal
2.25% postgres [.] hash_search_with_hash_value
2.04% postgres [.] tm2timestamp
2.04% postgres [.] ExecEvalParamExtern
1.99% postgres [.] dt2time
1.67% postgres [.] date2j
1.54% postgres [.] pgstat_end_function_usage
1.54% postgres [.] FunctionCall2Coll
1.52% postgres [.] ExecQual
1.48% postgres [.] timestamp2tm
1.43% postgres [.] index_fetch_heap
1.33% postgres [.] index_getnext_tid
1.22% postgres [.] _bt_next
1.18% postgres [.] ExecEvalNullTest
1.15% postgres [.] ExecScan
1.14% postgres [.] btgettuple
0.99% postgres [.] IndexNext
0.95% postgres [.] timestamp2timestamptz
0.94% postgres [.] ReleaseAndReadBuffer
0.93% postgres [.] PinBuffer
0.92% postgres [.] ExecEvalScalarVarFast
0.87% postgres [.] ExecStoreTuple
0.85% postgres [.] LockBuffer
0.82% postgres [.] heap_page_prune_opt
0.78% libc-2.12.so [.] __memcmp_sse4_1
0.74% postgres [.] toast_raw_datum_size
0.74% postgres [.] pg_detoast_datum_packed
0.72% postgres [.] ExecEvalRelabelType
0.68% postgres [.] CheckForSerializableConflictOut
0.68% postgres [.] HeapTupleSatisfiesMVCC
0.57% postgres [.] index_getnext
0.53% postgres [.] ReadBuffer_common
0.51% postgres [.] pgstat_init_function_usage

IO information
=====================

Tme DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz
await svctm %util
08:00:01 dev253-5 59.13 1.89 1564.72 26.50 2.57
43.43 5.90 34.90
08:10:01 dev253-5 58.47 2.19 1555.84 26.65 2.20
37.59 6.05 35.38
08:20:01 dev253-5 60.72 36.18 1598.49 26.92 2.34
38.53 5.92 35.94
08:30:01 dev253-5 60.21 12.68 1562.41 26.16 2.18
36.21 5.97 35.97
08:40:01 dev253-5 60.77 367.49 2624.93 49.25 2.94
48.33 6.09 36.98
08:50:01 dev253-5 63.26 217.68 2644.81 45.25 5.41
85.45 5.82 36.80
09:00:01 dev253-5 60.33 1.41 1612.09 26.75 2.16
35.80 5.93 35.77
09:10:01 dev253-5 58.45 2.96 1586.15 27.19 3.29
56.23 6.12 35.77
09:20:01 dev253-5 89.26 1139.63 3323.51 50.00 3.78
42.35 4.51 40.28
09:30:01 dev253-5 92.84 1320.94 4093.75 58.32 4.21
45.30 4.15 38.57
09:40:01 dev253-5 82.81 998.29 4179.65 62.53 5.09
61.35 4.69 38.85
09:50:01 dev253-5 58.08 22.32 5549.46 95.93 5.22
89.88 6.14 35.67
10:00:01 dev253-5 60.98 50.19 10181.89 167.78 8.34
136.79 6.04 36.81
10:10:01 dev253-5 61.11 256.26 2992.49 53.16 3.12
51.10 5.61 34.29
Average: dev253-5 83.02 4394.40 2225.11 79.73 3.18
38.35 4.69 38.93

Bradley J. Hazuka
Database Administrator
Innovate!, Inc.

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Scott Whitney 2016-06-09 17:13:32 Re: Server problem
Previous Message Ton reddy 2016-06-09 05:55:50 Server problem