Re: Amazon High I/O instances

From: Sébastien Lorion <sl(at)thestrangefactory(dot)com>
To: John R Pierce <pierce(at)hogranch(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Amazon High I/O instances
Date: 2012-09-13 05:01:11
Message-ID: CAGa5y0OBKu1gJkeckQBGNgHGWezkK=JVQfjj+FOi-PGm-8-dLg@mail.gmail.com
Lists: pgsql-general

pgbench initialization has been running for almost 5 hours now and is still
stuck before the vacuum phase starts. Something is definitely wrong, as I
don't remember it taking this long the first time I created the database.
Here are the current stats:
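For context on what a load at this scale implies: pgbench creates 100,000 pgbench_accounts rows per scale unit, and a commonly cited rule of thumb is roughly 15 MB of on-disk data per unit (an approximation; actual size varies with fillfactor and filesystem compression). A quick sanity check of -s 10000:

```python
# Back-of-envelope estimate of pgbench database size for a given scale factor.
# Assumption: ~15 MB on disk per scale unit (varies with fillfactor, indexes,
# and ZFS compression); the 100,000 rows per unit is pgbench's documented ratio.

def pgbench_estimate(scale):
    rows = scale * 100_000          # pgbench_accounts rows
    size_gb = scale * 15 / 1024     # ~15 MB per scale unit
    return rows, size_gb

rows, size_gb = pgbench_estimate(10_000)
print(f"{rows:,} rows, ~{size_gb:.0f} GB")   # 1,000,000,000 rows, ~146 GB
```

That estimate is consistent with the 143G alloc reported by zpool iostat below, which suggests the data itself has loaded and the time is going into the post-load phase.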

*iostat (xbd13-14 are WAL zpool)*

device     r/s    w/s    kr/s    kw/s  qlen  svc_t  %b
xbd8     161.3  109.8  1285.4  3450.5     0   12.5  19
xbd7     159.5  110.6  1272.3  3450.5     0   11.4  14
xbd6     161.1  108.8  1284.4  3270.6     0   10.9  14
xbd5     159.5  109.0  1273.1  3270.6     0   11.6  15
xbd14      0.0    0.0     0.0     0.0     0    0.0   0
xbd13      0.0    0.0     0.0     0.0     0    0.0   0
xbd12    204.6  110.8  1631.3  3329.2     0    9.1  15
xbd11    216.0  111.2  1722.5  3329.2     1    8.6  16
xbd10    197.2  109.4  1573.5  3285.8     0    9.8  15
xbd9     195.0  109.4  1557.1  3285.8     0    9.9  15
*zpool iostat (db pool)*

pool   alloc   free   read  write   read  write
db      143G   255G  1.40K  1.53K  11.2M  12.0M

*vmstat*

procs      memory      page                          disks     faults      cpu
 r b w   avm    fre   flt  re  pi  po    fr  sr  ad0 xb8    in    sy    cs us sy id
 0 0 0  5634M   28G     7   0   0   0  7339   0    0 245  2091  6358 20828  2  5 93
 0 0 0  5634M   28G    10   0   0   0  6989   0    0 312  1993  6033 20090  1  4 95
 0 0 0  5634M   28G     7   0   0   0  6803   0    0 292  1974  6111 22763  2  5 93
 0 0 0  5634M   28G    10   0   0   0  7418   0    0 339  2041  6170 20838  2  4 94
 0 0 0  5634M   28G   123   0   0   0  6980   0    0 282  1977  5906 19961  2  4 94
*top*

last pid: 2430;  load averages: 0.72, 0.73, 0.69  up 0+04:56:16  04:52:53
32 processes: 1 running, 31 sleeping
CPU: 1.8% user, 0.0% nice, 5.3% system, 1.4% interrupt, 91.5% idle
Mem: 1817M Active, 25M Inact, 36G Wired, 24K Cache, 699M Buf, 28G Free
Swap:

  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
 1283 pgsql       1  34    0  3967M  1896M zio->i  5  80:14 21.00% postgres
 1282 pgsql       1  25    0 25740K  3088K select  2  10:34  0.00% pgbench
 1274 pgsql       1  20    0  2151M 76876K select  1   0:09  0.00% postgres

On Wed, Sep 12, 2012 at 9:16 PM, Sébastien Lorion
<sl(at)thestrangefactory(dot)com> wrote:

> I recreated the DB and WAL pools, and launched pgbench -i -s 10000. Here
> are the stats during the load (still running):
>
> *iostat (xbd13-14 are WAL zpool)*
> device r/s w/s kr/s kw/s qlen svc_t %b
> xbd8 0.0 471.5 0.0 14809.3 40 67.9 84
> xbd7 0.0 448.1 0.0 14072.6 39 62.0 74
> xbd6 0.0 472.3 0.0 14658.6 39 61.3 77
> xbd5 0.0 464.7 0.0 14433.1 39 61.4 76
> xbd14 0.0 0.0 0.0 0.0 0 0.0 0
> xbd13 0.0 0.0 0.0 0.0 0 0.0 0
> xbd12 0.0 460.1 0.0 14189.7 40 63.4 78
> xbd11 0.0 462.9 0.0 14282.8 40 61.8 76
> xbd10 0.0 477.0 0.0 14762.1 38 61.2 77
> xbd9 0.0 477.6 0.0 14796.2 38 61.1 77
>
> *zpool iostat (db pool)*
> pool alloc free read write read write
> db 11.1G 387G 0 6.62K 0 62.9M
>
> *vmstat*
> procs      memory      page                          disks     faults      cpu
>  r b w   avm   fre   flt  re  pi  po     fr  sr  ad0 xb8    in    sy    cs us sy id
>  0 0 0  3026M  35G   126   0   0   0  29555   0    0 478  2364 31201 26165 10  9 81
>
> *top*
> last pid: 1333;  load averages: 1.89, 1.65, 1.08  up 0+01:17:08  01:13:45
> 32 processes: 2 running, 30 sleeping
> CPU: 10.3% user, 0.0% nice, 7.8% system, 1.2% interrupt, 80.7% idle
> Mem: 26M Active, 19M Inact, 33G Wired, 16K Cache, 25M Buf, 33G Free
>
>
>
> On Wed, Sep 12, 2012 at 9:02 PM, Sébastien Lorion <
> sl(at)thestrangefactory(dot)com> wrote:
> >
> > One more question: I could not set wal_sync_method to anything other
> than fsync. Is that expected, or should other choices also be available? I
> am not sure how SSD cache flushing is handled on EC2, but I hope the whole
> cache is flushed on every sync. As a side note, I initially got corrupted
> databases (errors about pg_xlog files not found, etc.) when running my
> tests, and I suspect it was because of vfs.zfs.cache_flush_disable=1,
> though I cannot prove it for sure.
> >
> > Sébastien
> >
> >
> > On Wed, Sep 12, 2012 at 8:49 PM, Sébastien Lorion <
> sl(at)thestrangefactory(dot)com> wrote:
> >>
> >> Is dedicating 2 drives to WAL too much? Since my whole RAID is
> comprised of SSDs, should I just put the WAL in the main pool?
> >>
> >> Sébastien
> >>
> >>
> >> On Wed, Sep 12, 2012 at 8:28 PM, Sébastien Lorion <
> sl(at)thestrangefactory(dot)com> wrote:
> >>>
> >>> Ok, makes sense. I will update that as well and report back. Thank
> you for your advice.
> >>>
> >>> Sébastien
> >>>
> >>>
> >>> On Wed, Sep 12, 2012 at 8:04 PM, John R Pierce <pierce(at)hogranch(dot)com>
> wrote:
> >>>>
> >>>> On 09/12/12 4:49 PM, Sébastien Lorion wrote:
> >>>>>
> >>>>> You set shared_buffers way below what is suggested in Greg Smith's
> book (25% or more of RAM). What is the rationale behind that rule of
> thumb? Other values are more or less what I set, though I could lower
> effective_cache_size and vfs.zfs.arc_max and see how it goes.
> >>>>
> >>>>
> >>>> I think those 25% rules were typically created when RAM was no more
> than 4-8GB.
> >>>>
> >>>> For our highly transactional workload, at least, too large a
> shared_buffers seems to slow us down, perhaps due to the higher overhead of
> managing that many 8k buffers. I've heard that read-mostly workloads,
> such as data warehousing, can take advantage of larger buffer counts.
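To put that buffer-management overhead in numbers (a sketch; the 60 GB RAM figure is only an example in the ballpark of a high-memory EC2 instance, not a value from this thread):

```python
# How many 8 KB buffers the 25%-of-RAM rule implies on a large-memory host.
# Assumptions: 60 GB of RAM (hypothetical example) and PostgreSQL's default
# 8 KB block size.

RAM_GB = 60
BLOCK_SIZE = 8 * 1024                          # default PostgreSQL block size

shared_buffers_bytes = RAM_GB * 1024**3 // 4   # the 25% rule of thumb
n_buffers = shared_buffers_bytes // BLOCK_SIZE
print(f"{n_buffers:,} buffers")                # 1,966,080 buffers
```

Roughly two million buffer headers to scan and manage, versus a few hundred thousand on the 4-8 GB machines the rule was written for.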
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> john r pierce N 37, W 122
> >>>> santa cruz ca mid-left coast
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> >>>> To make changes to your subscription:
> >>>> http://www.postgresql.org/mailpref/pgsql-general
> >>>
> >>>
> >>
> >
>
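On the wal_sync_method question in the quoted thread: PostgreSQL only offers the sync methods the platform supports, so seeing a restricted set is expected. The methods available in a given build can be listed from pg_settings (a query fragment to run via psql; enumvals shows only the compiled-in options for your OS):

```sql
-- List the wal_sync_method values this build/platform supports.
SELECT name, setting, enumvals
FROM pg_settings
WHERE name = 'wal_sync_method';
```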
