hanging for 30sec when checkpointing

From: Shane Wright <me(at)shanewright(dot)co(dot)uk>
To: pgsql-admin(at)postgresql(dot)org
Subject: hanging for 30sec when checkpointing
Date: 2004-02-03 22:35:02
Message-ID: 40202216.4010608@shanewright.co.uk
Lists: pgsql-admin

Hi,

I'm running a reasonably sized (~30GB) 7.3.4 database on Linux and I'm
getting some weird performance at times.

When the db is under medium-heavy load, it periodically spawns a
'checkpoint subprocess' which runs for between 15 seconds and a minute.
OK, fair enough; the only problem is that the whole box becomes pretty
much unresponsive during this time. From what I can gather, that's
because it writes out roughly 1MB (vmstat says ~1034 blocks) per second
until it's done.
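
As a rough check of these figures, the sketch below estimates how much data one checkpoint writes at the observed rate. It assumes vmstat is reporting 1 KiB blocks, which is typical on Linux but has not been verified on this box; the function name is made up for illustration.

```python
# Rough estimate of the data written during one checkpoint.
# Assumption: vmstat reports 1 KiB blocks (not verified on this machine).
BLOCK_SIZE_BYTES = 1024
BLOCKS_PER_SECOND = 1034  # figure quoted from vmstat above


def checkpoint_volume_mib(duration_s,
                          blocks_per_s=BLOCKS_PER_SECOND,
                          block_size=BLOCK_SIZE_BYTES):
    """Return the MiB written over duration_s seconds at the given block rate."""
    return duration_s * blocks_per_s * block_size / (1024 * 1024)


# The checkpoint was seen to run for between 15 seconds and a minute.
for duration in (15, 30, 60):
    mib = checkpoint_volume_mib(duration)
    print(f"{duration:>2}s checkpoint: ~{mib:.0f} MiB written")
```

Under that assumption each checkpoint writes only a few tens of megabytes in total, which is small next to the 2GB of RAM and the 30GB database.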

Other processes can continue to run (e.g. vmstat) but other things do
not (other queries, mostly running 'ps fax', etc). So everything gets
stacked up till the checkpoint finishes and all is well again, until
the next time...

This only really happens under medium/high load, but doesn't seem
related to the length/complexity of transactions done.

The box isn't doing a lot else at the same time - most queries come in
from separate web server boxes.

The disks, although IDE, can definitely handle more than 1MB/sec, even
with multiple concurrent writes. The box is powerful (2.6GHz Xeon, 2GB
RAM). It's a clean compile from source of 7.3.4, although I can't really
upgrade to 7.4.x at this time as I can't afford the 18 hours downtime to
dump/restore the database. Fsync is on. Most other settings are at their
defaults.

I've looked at the documentation and various bits about adjusting
checkpoint segments and timings. Reducing segments/timeout is implied
to be bad, but it seems to me that increasing either will just make the
same thing happen less often but more severely.
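
The suspicion above can be put into numbers with a toy model. This is only a sketch under loud assumptions, not a description of how the server actually schedules its writes: it takes the 16MB WAL segment size and the 7.3 default checkpoint_timeout of 300 seconds, invents a WAL generation rate of 0.5MB/sec purely for illustration, reuses the observed ~1MB/sec flush rate, and pretends the data flushed at a checkpoint equals the WAL written since the previous one (a simplification, since a page updated many times is flushed once). The function name is hypothetical.

```python
WAL_SEGMENT_MB = 16  # default WAL segment size in a 7.3 build


def checkpoint_profile(checkpoint_segments, checkpoint_timeout_s,
                       wal_rate_mb_s, flush_rate_mb_s=1.0):
    """Toy model: return (seconds between checkpoints, MB flushed, stall seconds).

    Assumes a checkpoint fires when checkpoint_segments worth of WAL has been
    written or checkpoint_timeout_s has elapsed, whichever comes first, and
    that the amount flushed equals the WAL written in that interval.
    """
    seconds_to_fill = checkpoint_segments * WAL_SEGMENT_MB / wal_rate_mb_s
    interval_s = min(seconds_to_fill, checkpoint_timeout_s)
    flush_mb = interval_s * wal_rate_mb_s
    flush_s = flush_mb / flush_rate_mb_s
    return interval_s, flush_mb, flush_s


# Hypothetical WAL rate of 0.5 MB/s; 3 segments is the 7.3 default.
for segments in (3, 6, 12):
    interval_s, flush_mb, flush_s = checkpoint_profile(segments, 300, 0.5)
    print(f"checkpoint_segments={segments:>2}: every {interval_s:.0f}s, "
          f"~{flush_mb:.0f}MB flushed, ~{flush_s:.0f}s of heavy writing")
```

In this model, raising checkpoint_segments spaces the checkpoints further apart but makes each one proportionally longer, until checkpoint_timeout becomes the limit. Whether the real server behaves this way depends on how much of the flushed data overlaps between updates, which the model ignores.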

If it makes any odds, this seems to happen much more often when doing
bulk UPDATEs and INSERTs, although these are grouped together in
transactions, and they don't affect the same tables as the other
queries that still get stalled (so no lock contention is causing the
problem).

What am I missing? I'm sure I'm missing something blatantly obvious,
but as it's only really happening on production systems (only place with
the load and the volume of data) I'm loath to experiment.

Any help appreciated,

Cheers,

Shane
