Re: System Load analyze

From: Peter Bauer <peter(dot)bauer(at)apus(dot)co(dot)at>
To: "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: System Load analyze
Date: 2007-11-28 07:19:22
Message-ID: 200711280819.22714.peter.bauer@apus.co.at
Lists: pgsql-general

On Tuesday, 27 November 2007, Scott Marlowe wrote:
> On Nov 24, 2007 10:57 AM, Peter Bauer <peter(dot)bauer(at)apus(dot)co(dot)at> wrote:
> > I have a system here with two 2.4GHz Xeon processors, 2GB RAM, ONE disk on
> > a battery-backed write cache SCSI controller, and PostgreSQL 8.1.4
> > running with the data on a DRBD device for high availability. The
> > database is also replicated to two similar machines with slony1.
>
> Why are you running a version of PostgreSQL with known data-eating
> bugs? If you care about your data, you will keep up to date on releases.
> 8.1.10 was released on 2007-09-17. 8.1.4 was released on 2006-05-23.
> That's 16 months of bug fixes you're missing. Go here:
> http://www.postgresql.org/docs/8.1/static/release.html and read up on
> the fixes you're missing. Then update. Or just update.
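
(For reference, the minor release actually running can be confirmed from psql before and after the update; within the 8.1.x series a minor update should only require installing the new binaries and restarting the server, not a dump/reload. The database name below is just a placeholder.)

    $ psql -d mydb -c "SELECT version();"
    $ psql -d mydb -c "SHOW server_version;"
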
>
> OK, on to the issue at hand.
>
> > Since the load average is between 1 (most of the time) and 10 (peaks), I
> > am worried about the load and ran vmstat and iostat, which show that
> > 1000-6000 blocks are written per second. Please check the attached output
> > for further details.
> > top shows that the CPUs are at least 80% idle most of the time, so I
> > think there is an I/O bottleneck. I'm aware that this hardware setup is
> > probably not sufficient, but I would like to investigate how critical
> > the situation is.
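
(For anyone reproducing this measurement: output like the attached usually comes from one-second sampling, roughly as below; the iostat flags are only a suggestion.)

    $ vmstat 1        # per-second memory/swap/block-io/cpu summary
    $ iostat -x 1     # per-device utilization and throughput
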
>
> Yes. Battery backed cache can only do so much, it's not magic pixie
> dust. Once it's full, the drive becomes the bottleneck. Real db
> servers have more than one disk drive. They usually have at least 4
> or so, and often dozens to hundreds. Also, not all battery backed
> caching RAID controllers are created equal.
>
> > procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
> >  r  b   swpd   free   buff   cache   si   so    bi    bo   in    cs us sy id wa
> >  0  0 398256  78328 140612 1063556    0    0     0  1472 2029  5081  4  3 92  0
> >  0  2 398256  78268 140612 1063576    0    0     0  2304 1928  4216  0  2 98  0
> >  1  2 398256  78100 140612 1063576    0    0     0  1464 1716  3994  1  3 96  0
> >  0  0 398256  78704 140612 1063592    0    0     0   916 1435  3570  5  3 91  0
> >  0  0 398256  77876 140612 1063616    0    0     0     0  305  1169  3  1 96  0
>
> See that dip in the last line above where the blocks in drop to 0,
> idle jumps to 96, and blocks out drops, and context switches drop?
> That's most likely where postgresql is checkpointing. Checkpointing
> is where it writes out all the dirty buffers to disk. If the bgwriter
> is not tuned aggressively enough, checkpoints happen and make the whole
> database slow down for a few seconds. If it's tuned too aggressively
> then the db spends too much CPU time tracking the dirty buffers and
> then writing them. If tuned just right, it will write out the dirty
> buffers just fast enough that a checkpoint is never needed.
>
> You tune the bgwriter to your machine and I/O subsystem. If you're
> planning on getting more hard drives, do that first. Then tune the
> bgwriter.
>
> btw, if this is "vmstat 1" running, it's showing a checkpoint every 20
> or so seconds I think
>
> >  0  2 398256  79136 140612 1063964    0    0     0  1736 1959  4494  4  2 94  0
>
> checkpoint here:
> >  0  0 398256  79132 140612 1063964    0    0     0     4  260  1039  1  1 98  0
> >  0  0 398256  79052 140612 1063980    0    0     0  2444 3084  6955  6  5 89  0
> >  0  2 398256  79060 140612 1063988    0    0     0   948 1146  3616  3  1 96  0
> >  0  1 398256  78268 140612 1064056    0    0     0  1908 1809  4086  6  5 88  0
> >  0  1 398256  76728 140612 1064056    0    0     0  6256 6637 15472  5  5 90  0
> >  0  2 398256  77000 140612 1064064    0    0     0  4916 5840 12107  1  4 95  0
> >  0  2 398256  76956 140612 1064068    0    0     0  6468 7432 15211  1  3 96  0
> >  0  6 398256  77388 140612 1064072    0    0     0  8116 7826 18265  1  8 91  0
> >  0  2 398256  74312 140612 1064076    0    0     0  7032 6886 16136  2  7 91  0
> >  0  2 398256  74264 140612 1064076    0    0     0  5680 7143 13411  0  5 95  0
> >  0  2 398256  72980 140612 1064140    0    0     0  5396 6377 13251  6  6 88  0
> >  0  3 398256  76972 140612 1064148    0    0     0  5652 6793 14079  4  9 87  0
> >  0  2 398256  77836 140612 1064148    0    0     0  3968 5321 14187 10  8 82  0
> >  1  0 398256  77280 140612 1064148    0    0     0  1608 3188  8974 21 12 67  0
> >  1  0 398256  77832 140612 1064152    0    0     0   236  834  2625  7  5 87  0
> >  0  0 398256  77464 140612 1064152    0    0     0   244  505  1378  2  4 94  0
> >  1  0 398256  77828 140612 1064164    0    0     0   316  580  1954  7  2 91  0
> >  0  0 398256  77804 140612 1064180    0    0     0   740  673  2248  2  2 96  0
> >  0  0 398256  77000 140612 1064180    0    0     0   304  589  1739  1  3 96  0
>
> 20 rows later, checkpoint here:
> >  0  0 398256  77000 140612 1064184    0    0     0     0  216   886  0  1 99  0
> >  0  0 398256  75452 140612 1064184    0    0     0   432  755  2032  6  1 93  0
> >
> > max_fsm_pages = 40000           # min max_fsm_relations*16, 6 bytes each, APUS
>
> This seems a little low for a busy server.
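
(One way to check this on 8.1 is a database-wide VACUUM VERBOSE run as superuser; its closing lines report how many free-space-map page slots are in use versus needed, and if the needed figure exceeds max_fsm_pages the setting should be raised. Sketch only:)

    -- connect to the database as a superuser, then:
    VACUUM VERBOSE;
    -- read the final INFO lines about free space map page slots
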
>
> > # - Background writer -
> >
> > bgwriter_delay = 100            # 10-10000 milliseconds between rounds, APUS
> > bgwriter_lru_percent = 2.0      # 0-100% of LRU buffers scanned/round, APUS
> > bgwriter_lru_maxpages = 10      # 0-1000 buffers max written/round, APUS
> > bgwriter_all_percent = 1        # 0-100% of all buffers scanned/round, APUS
> > bgwriter_all_maxpages = 10      # 0-1000 buffers max written/round, APUS
>
> So, bgwriter wakes up 10 times a second, and each time it processes 2%
> of the Least Recently Used pages for writing, and writes up to 10 of
> those pages. And it only checks 1% of the total pages and writes 10
> of those at the most. This is not aggressive enough, and given how
> much spare CPU you have left over, you can be a fair bit more
> aggressive. The main things to increase are the maxes. Try changing
> them to the 100 to 300 range, and maybe increase your percentages to
> 5% or so. What we're shooting for is to see those checkpoints go
> away.
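
(Translated into postgresql.conf terms, that suggestion might look roughly like the following; the exact values are starting points to benchmark against this workload, not a recipe.)

    # - Background writer - (more aggressive starting point, per the advice above)
    bgwriter_delay = 100            # keep waking 10 times per second
    bgwriter_lru_percent = 5.0      # scan a larger share of the LRU buffers per round
    bgwriter_lru_maxpages = 200     # allow many more writes per round (100-300 range)
    bgwriter_all_percent = 5        # scan a larger share of all buffers per round
    bgwriter_all_maxpages = 200     # allow many more writes per round (100-300 range)
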
>
> Then, when running your benchmark, after a few minutes, run a
> checkpoint by hand and see if you get one of those slowdowns like we
> saw in vmstat above. If your bgwriter is tuned properly, you should
> get an almost instant response from the checkpoint and no noticeable
> slowdown in the vmstat numbers for context switches per second.
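
(The by-hand checkpoint is just the SQL CHECKPOINT command; timing it while vmstat keeps scrolling shows whether it still causes a stall. The database name is a placeholder.)

    $ vmstat 1 &                           # keep the per-second stats visible
    $ time psql -d mydb -c "CHECKPOINT;"   # should return almost instantly if the bgwriter keeps up
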
>
> Once you reach the point where the bgwriter is just keeping ahead of
> checkpoints, there's little to be gained from more aggressive tuning of
> the bgwriter; you'll just be chewing up memory and CPU bandwidth if
> you do get too aggressive with it.

Hi Scott,

Thank you for the great suggestions, I will keep the list informed.

br,
Peter

--
Peter Bauer
APUS Software G.m.b.H.
A-8074 Raaba, Bahnhofstrasse 1/1
Email: peter(dot)bauer(at)apus(dot)co(dot)at
Tel: +43 316 401629 24
Fax: +43 316 401629 9
