<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">于 2013/8/20 12:34, Jeff Janes 写道:<br>
</div>
<blockquote
cite="mid:CAMkU=1zWk8dmxCi4uGB4fCmOrb5BvwVB0+Bspu_1gmyO7UUjsg(at)mail(dot)gmail(dot)com"
type="cite">On Monday, August 19, 2013, Rural Hunter wrote:
<div><br>
</div>
<div>I think that this should generally not happen at the server
if you are using pgbouncer, as you should configure it so that
pgbouncer has a lower limit than postgresql itself does. What
pooling method (session, transaction, statement) are you using?</div>
</blockquote>
statement. Currently, I set the limit of pgbouncer connection to
same as db connection. But I also have a few connections connecting
to db server directly.
<blockquote
cite="mid:CAMkU=1zWk8dmxCi4uGB4fCmOrb5BvwVB0+Bspu_1gmyO7UUjsg(at)mail(dot)gmail(dot)com"
type="cite">
<div><br>
</div>
<div>Can you provide some example numbers for the io load?</div>
</blockquote>
I get some when the connection limit is reached(The database related
storage is on sdb/sdd/sde/sdf):<br>
root(at)ubtserver:~# iostat -xm 3<br>
Linux 3.5.0-22-generic (ubuntu) 2013年08月19日 _x86_64_ (32
CPU)<br>
<br>
avg-cpu: %user %nice %system %iowait %steal %idle<br>
14.71 0.00 2.86 0.48 0.00 81.96<br>
<br>
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util<br>
sda 0.00 0.26 0.04 0.36 0.00 0.00
24.71 0.00 0.55 3.01 0.30 0.29 0.01<br>
sdb 0.00 0.26 0.18 2.32 0.02 0.38
329.50 0.01 5.36 1.26 5.69 0.21 0.05<br>
sdc 0.01 4.59 10.13 45.75 0.30 0.92
44.65 0.05 5.14 7.49 4.62 0.63 3.50<br>
dm-0 0.00 0.00 0.00 0.01 0.00
0.00 8.00 0.00 6.37 6.38 6.36 3.62 0.00<br>
sdd 0.00 0.42 0.02 42.87 0.00 0.46
22.12 0.03 0.78 14.09 0.77 0.49 2.10<br>
sde 0.00 3.68 10.23 156.41 0.19 1.45
20.06 0.03 1.59 21.34 0.29 0.51 8.55<br>
sdf 0.00 2.56 6.29 66.00 0.29 0.71
28.42 0.04 0.56 4.52 0.19 0.37 2.71<br>
<br>
avg-cpu: %user %nice %system %iowait %steal %idle<br>
13.99 0.00 1.91 1.04 0.00 83.06<br>
<br>
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util<br>
sda 0.00 0.00 0.33 0.00 0.00 0.00
16.00 0.00 4.00 4.00 0.00 4.00 0.13<br>
sdb 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00<br>
sdc 0.00 15.33 5.33 14.33 0.13 0.21
34.98 0.03 1.63 6.00 0.00 1.02 2.00<br>
dm-0 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00<br>
sdd 0.00 0.00 0.00 31.33 0.00 0.26
17.19 0.01 0.34 0.00 0.34 0.34 1.07<br>
sde 0.00 0.00 43.00 163.67 0.59 1.29
18.55 2.56 21.34 72.06 8.01 1.69 34.93<br>
sdf 0.00 0.00 6.00 62.00 0.17 0.55
21.88 0.49 7.16 5.56 7.31 0.27 1.87<br>
<br>
avg-cpu: %user %nice %system %iowait %steal %idle<br>
15.84 0.00 2.63 1.70 0.00 79.83<br>
<br>
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util<br>
sda 0.00 1.67 0.00 2.00 0.00 0.01
14.67 0.07 33.33 0.00 33.33 25.33 5.07<br>
sdb 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00<br>
sdc 0.00 0.00 4.67 0.00 0.06 0.00
26.29 0.13 6.29 6.29 0.00 25.14 11.73<br>
dm-0 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00<br>
sdd 0.00 0.33 0.00 49.00 0.00 0.39
16.49 0.02 0.35 0.00 0.35 0.35 1.73<br>
sde 0.00 11.00 30.67 81.33 0.38 0.71
19.98 36.46 143.19 43.91 180.62 2.69 30.13<br>
sdf 0.00 9.33 3.00 326.00 0.09 2.75
17.69 3.51 10.66 5.33 10.71 0.11 3.60<br>
<br>
avg-cpu: %user %nice %system %iowait %steal %idle<br>
14.99 0.00 2.39 4.89 0.00 77.74<br>
<br>
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util<br>
sda 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00<br>
sdb 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00<br>
sdc 0.00 19.67 7.33 29.00 0.09 0.60
38.61 1.18 35.41 175.45 0.00 15.93 57.87<br>
dm-0 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00<br>
sdd 0.00 0.33 0.00 39.33 0.00 0.31
15.93 0.01 0.37 0.00 0.37 0.37 1.47<br>
sde 0.00 11.33 29.67 312.67 0.39 2.51
17.34 87.15 314.23 108.13 333.78 2.84 97.20<br>
sdf 0.00 0.00 8.33 0.00 0.17 0.00
42.24 0.05 6.56 6.56 0.00 2.40 2.00<br>
<br>
avg-cpu: %user %nice %system %iowait %steal %idle<br>
14.98 0.00 2.23 5.45 0.00 77.34<br>
<br>
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util<br>
sda 0.00 0.00 0.00 0.67 0.00 0.01
20.00 0.00 0.00 0.00 0.00 0.00 0.00<br>
sdb 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00<br>
sdc 0.00 9.67 10.00 6.00 0.12 0.10
27.83 0.08 5.08 8.13 0.00 1.42 2.27<br>
dm-0 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00<br>
sdd 0.00 0.00 0.00 44.33 0.00 0.35
16.00 0.03 0.72 0.00 0.72 0.72 3.20<br>
sde 0.00 0.00 47.33 0.00 0.58 0.00
25.18 5.26 111.04 111.04 0.00 19.10 90.40<br>
sdf 0.00 11.00 3.33 683.33 0.12 7.38
22.37 12.05 17.54 244.00 16.44 0.49 33.33<br>
<br>
avg-cpu: %user %nice %system %iowait %steal %idle<br>
15.21 0.00 2.54 0.56 0.00 81.69<br>
<br>
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util<br>
sda 0.00 2.00 0.00 1.00 0.00 0.01
24.00 0.00 0.00 0.00 0.00 0.00 0.00<br>
sdb 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00<br>
sdc 0.00 0.00 14.33 2.00 0.20 0.39
73.80 0.07 4.08 4.65 0.00 2.37 3.87<br>
dm-0 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00<br>
sdd 0.00 0.33 0.00 62.00 0.00 0.52
17.08 0.02 0.34 0.00 0.34 0.34 2.13<br>
sde 0.00 9.67 30.67 157.33 0.43 1.27
18.54 1.75 9.33 15.91 8.04 1.09 20.53<br>
sdf 0.00 9.67 6.67 0.67 0.13 0.04
46.91 0.04 5.09 5.60 0.00 2.36 1.73<br>
<br>
avg-cpu: %user %nice %system %iowait %steal %idle<br>
14.72 0.00 1.95 0.58 0.00 82.76<br>
<br>
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util<br>
sda 0.00 0.00 0.00 2.00 0.00
0.01 8.00 0.00 0.00 0.00 0.00 0.00 0.00<br>
sdb 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00<br>
sdc 0.00 13.67 5.33 32.33 0.07 0.31
20.46 0.04 1.03 7.25 0.00 0.46 1.73<br>
dm-0 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00<br>
sdd 0.00 0.00 0.00 42.00 0.00 0.35
17.27 0.03 0.79 0.00 0.79 0.79 3.33<br>
sde 0.00 0.33 48.00 804.00 0.61 6.34
16.71 8.38 9.82 14.11 9.57 0.23 19.20<br>
sdf 0.00 0.00 8.00 463.00 0.09 4.12
18.30 5.00 10.62 7.17 10.68 0.11 5.20<br>
<blockquote
cite="mid:CAMkU=1zWk8dmxCi4uGB4fCmOrb5BvwVB0+Bspu_1gmyO7UUjsg(at)mail(dot)gmail(dot)com"
type="cite">
<div><br>
</div>
<div><br>
</div>
<div>Could you post the complete log message and a few lines of
context around it?</div>
</blockquote>
There is no context from the same connection around that message.
<blockquote
cite="mid:CAMkU=1zWk8dmxCi4uGB4fCmOrb5BvwVB0+Bspu_1gmyO7UUjsg(at)mail(dot)gmail(dot)com"
type="cite">
<div>How long had they been hanging there? It makes a big
difference whether there are several hanging there at one
moment, but a few milliseconds later there are several different
ones, versus the same few that hang around of many seconds or
minutes at a time.</div>
</blockquote>
The hanging connections never disappear until I restart pgbouncer.
It's like this, At minute 1, 3 connections left. At minute 2,
another 3 left, total 6. Another minute, another 3 left, total
9....till the limit reaches.<br>
<blockquote
cite="mid:CAMkU=1zWk8dmxCi4uGB4fCmOrb5BvwVB0+Bspu_1gmyO7UUjsg(at)mail(dot)gmail(dot)com"
type="cite"><br>
<div>If the identities of the "hung" processes are rapidly
changing, it could just be that you are hitting a throughput
limit. When you do a lot of inserts into indexed the tables, the
performance can drop precipitously once the size of the actively
updated part of the indexes exceeds shared_buffers. This would
usually show up in the io stats, but if you always have a lot of
io going on, it might not be obvious.</div>
<div><br>
</div>
<div>If it is the same few processes hung for long periods, I
would strace them, or gdb them and get a backtrace.</div>
</blockquote>
any detail guide to use strace/gdb on pg process?
<blockquote
cite="mid:CAMkU=1zWk8dmxCi4uGB4fCmOrb5BvwVB0+Bspu_1gmyO7UUjsg(at)mail(dot)gmail(dot)com"
type="cite">
<div>Sorry, I don't know what a connection snapshot in db2 looks
like.</div>
</blockquote>
<a class="moz-txt-link-freetext" href="http://pic.dhe.ibm.com/infocenter/db2luw/v9r7/index.jsp?topic=%2Fcom.ibm.db2.luw.admin.cmd.doc%2Fdoc%2Fr0001945.html">http://pic.dhe.ibm.com/infocenter/db2luw/v9r7/index.jsp?topic=%2Fcom.ibm.db2.luw.admin.cmd.doc%2Fdoc%2Fr0001945.html</a><br>
search for "<code>get snapshot for application". Note: some items in
the sample are marked as "</code><code>Not Collected" because some
monitor flags are turned off.<br>
</code>
<blockquote
cite="mid:CAMkU=1zWk8dmxCi4uGB4fCmOrb5BvwVB0+Bspu_1gmyO7UUjsg(at)mail(dot)gmail(dot)com"
type="cite">
<div> </div>
<div><br>
</div>
<div>Cheers,</div>
<div><br>
</div>
<div>Jeff</div>
</blockquote>
<br>
</body>
</html>