Re: Configuring PostgreSQL to minimize impact of checkpoints

From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: <jao(at)geophile(dot)com>
Cc: Rob Fielding <rob(at)dsvr(dot)net>, Matthew Nuzum <cobalt(at)bearfruit(dot)org>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Configuring PostgreSQL to minimize impact of checkpoints
Date: 2004-05-11 17:30:31
Message-ID: Pine.LNX.4.33.0405111126290.22850-100000@css120.ihs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Tue, 11 May 2004 jao(at)geophile(dot)com wrote:

> Quoting Rob Fielding <rob(at)dsvr(dot)net>:
>
> > Assuming you're running with optimal schema and index design (ie you're
> > not doing extra work unnecessarily), and your backend has
> > better-then-default config options set-up (plenty of tips around here),
> > then disk arrangement is critical to smoothing the ride.
>
> The schema and queries are extremely simple. I've been experimenting
> with config options. One possibility I'm looking into is whether
> shared_buffers is too high, at 12000. We have some preliminary evidence
> that setting it lower (1000) reduces the demand for IO bandwidth to
> a point where the spikes become almost tolerable.

If the shared_buffers are large, postgresql seems to have a performance
issue with handling them. Plus they can cause the kernel to dump cache on
things that would otherwise be right there and therefore forces the
database to hit the drives. You might wanna try settings between 1000 and
10000 and see where your sweet spot is.

> > First tip would to take your pg_xlog and put it on another disk (and
> > channel).
>
> That's on my list of things to try.
>
> > Next if you're running a journalled fs, get that journal off
> > onto another disk (and channel). Finally, get as many disks for the data
> > store and spread the load across spindles.
>
> Dumb question: how do I spread the data across spindles? Do you have
> a pointer to something I could read?

Look into a high quality hardware RAID controller with battery backed
cache on board. We use the ami/lsi megaraid and I'm quite pleased with
its writing performance.

How you configure your drives is up to you. For smaller numbers of
drives (6 or less) RAID 1+0 is usually a clear winner. For medium numbers
of drives, say 8 to 20, RAID 5 works well. For more drives than that,
many folks report RAID 5+0 or 0+5 to work well.

I've only played around with 12 or fewer drives, so I'm saying RAID 5+0 is
a good choice from my experience, just reflecting back what I've heard
here on the performance mailing list.

If you're not doing much writing, then a software RAID may be a good
intermediate solution, especially RAID1 with >2 disks under linux seems a
good setup for a mostly read database.

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Bjoern Metzdorf 2004-05-11 19:06:58 Quad processor options
Previous Message jao 2004-05-11 16:52:32 Re: Configuring PostgreSQL to minimize impact of checkpoints