From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: John R Pierce <pierce(at)hogranch(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Slow Query / Check Point Segments
Date: 2010-01-23 02:11:18
Message-ID: 4B5A5AC6.1090302@2ndquadrant.com
Lists: pgsql-general
John R Pierce wrote:
> we're having a similar problem with a very update intensive database
> that is part of a 24/7 manufacturing operation (no breathing time
> unless there's an unusual line-down situation)
Your problem is actually a bit different.
> 2010-01-23 01:08:13 MYTLOG: checkpoint complete: wrote 7535 buffers
> (2.9%); 0 transaction log file(s) added, 0 removed, 20 recycled;
> write=86.510 s, sync=0.260 s, total=86.791 s
> 2010-01-23 01:13:22 MYTLOG: checkpoint complete: wrote 12786 buffers
> (4.9%); 0 transaction log file(s) added, 0 removed, 17 recycled;
> write=95.028 s, sync=0.135 s, total=95.182 s
> 2010-01-23 01:16:22 MYTLOG: checkpoint complete: wrote 11720 buffers
> (4.5%); 0 transaction log file(s) added, 0 removed, 19 recycled;
> write=41.854 s, sync=0.640 s, total=42.518 s
> 2010-01-23 01:19:11 MYTLOG: checkpoint complete: wrote 10436 buffers
> (4.0%); 0 transaction log file(s) added, 0 removed, 20 recycled;
> write=50.330 s, sync=0.510 s, total=50.861 s
>
Your interval between checkpoints isn't too bad: three minutes on the
worst of these. And the amount written per checkpoint isn't that high
either; 12786 buffers is only about 100MB. Your problem is that the sync
on that 100MB write is taking an awfully long time. Spacing your
checkpoints out further would help, sure, but I wouldn't be surprised to
find it doesn't do much for you. Worth a shot, unlikely to hurt
anything; I'm just not as optimistic about it being the right fix here
as I was in the other message.
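
If you want to try it anyway, here's a minimal sketch of the relevant
postgresql.conf settings; the values are illustrative starting points,
not anything tuned for your workload:

    checkpoint_segments = 64            # allow more WAL between checkpoints
    checkpoint_timeout = 15min          # up from the 5min default
    checkpoint_completion_target = 0.9  # spread checkpoint writes out
    log_checkpoints = on                # keep watching those sync= times
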
> This is running on a 16 core UltraSPARC IV server, Solaris 10, ZFS,
> tons of disks on SAN with multiple tablespaces on different ZFS
> filesystems, with the block sizes tuned appropriately
My guess is that there's something wrong with your config such that
writes followed by fsync are taking longer than they should. When I see
"sync=0.640 s" going to a SAN, where that sync operation should be
nearly instant, I'd be looking for issues in the ZFS intent log setup,
how much data you've set up ZFS to write-cache, and the write cache
policy on the SAN hardware. Something weird is going on here; a sync of
such a small write volume should be nearly instant on your hardware.
Unfortunately, you may have to take the server down to find out exactly
what's going on, which doesn't sound practical for your environment.
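
Some of that you can at least poke at while the system is up. A sketch
of where I'd start on Solaris 10, with pool and dataset names ("tank",
"tank/pgdata") as placeholders for yours; note that which ZFS properties
exist depends on your Solaris 10 update level:

    # Does the pool have a separate intent-log device, and is it healthy?
    zpool status tank

    # What are the properties on the dataset PostgreSQL writes through?
    zfs get recordsize,primarycache tank/pgdata

    # Has the kernel been set to skip cache-flush requests to the SAN?
    echo "zfs_nocacheflush/D" | mdb -k
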
(And people wonder why I have near-religious zeal about testing disk
hardware before systems go into production.)
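
A quick way to measure raw fsync behavior without PostgreSQL in the loop
is a trivial write-then-fsync timer like the minimal sketch below. The
8K write size just mirrors a PostgreSQL block; you can run it against a
scratch file on the same ZFS dataset as the database even while things
are live. The test_fsync program in the PostgreSQL source tree does a
more thorough version of the same thing.

    /* fsync_time.c: time repeated 8K write+fsync calls, a crude
     * stand-in for what each checkpoint sync has to wait on.
     * Build: cc -o fsync_time fsync_time.c
     * Run with a path on the same ZFS dataset as the database. */
    #include <stdio.h>
    #include <string.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/time.h>

    int main(int argc, char **argv)
    {
        const char *path = (argc > 1) ? argv[1] : "fsync_test.dat";
        char buf[8192];
        struct timeval start, end;
        double elapsed;
        int i, fd;

        memset(buf, 'x', sizeof(buf));
        fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
        if (fd < 0) { perror("open"); return 1; }

        gettimeofday(&start, NULL);
        for (i = 0; i < 100; i++) {
            if (write(fd, buf, sizeof(buf)) != sizeof(buf)) {
                perror("write"); return 1;
            }
            if (fsync(fd) != 0) { perror("fsync"); return 1; }
        }
        gettimeofday(&end, NULL);

        elapsed = (end.tv_sec - start.tv_sec)
                + (end.tv_usec - start.tv_usec) / 1000000.0;
        printf("100 x 8K write+fsync: %.3f s (%.2f ms each)\n",
               elapsed, elapsed * 10.0);

        close(fd);
        unlink(path);
        return 0;
    }

On a SAN with a working write cache, each write+fsync should come back
in a millisecond or two; numbers like the 5+ ms implied by your
"sync=0.640 s" checkpoints point back at the cache-flush questions above.
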
--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.com