Re: Checkpoint Tuning Question

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dan Armbrust <daniel(dot)armbrust(dot)list(at)gmail(dot)com>
Cc: pgsql general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Checkpoint Tuning Question
Date: 2009-07-08 22:22:54
Message-ID: 11530.1247091774@sss.pgh.pa.us
Lists: pgsql-general

Dan Armbrust <daniel(dot)armbrust(dot)list(at)gmail(dot)com> writes:
> Almost all of the slow query log messages are logged within about 3
> seconds of the checkpoint starting message.

> LOG: checkpoint complete: wrote 9975 buffers (77.9%); 0 transaction
> log file(s) added, 0 removed, 15 recycled; write=156.576 s, sync=0.065
> s, total=156.662 s

Huh. And there's just about no daylight between the total checkpoint
time and the write+sync time (barely more than 20 msec in both examples).
So that seems to wipe out the thought I had that maybe we'd
underestimated the work involved in one of the other steps of
checkpoint.
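
The arithmetic behind that observation can be checked directly from the quoted log line. The following is a minimal sketch, not part of the original message: the helper name `checkpoint_residual` and the regular expression are illustrative assumptions, and the shared-buffer estimate is only approximate because the logged percentage is rounded.

```python
import re

# The checkpoint-complete line quoted above, joined onto a single line.
LOG_LINE = (
    "LOG: checkpoint complete: wrote 9975 buffers (77.9%); 0 transaction "
    "log file(s) added, 0 removed, 15 recycled; write=156.576 s, "
    "sync=0.065 s, total=156.662 s"
)

# Hypothetical pattern written for this one log format.
PATTERN = re.compile(
    r"wrote (?P<buffers>\d+) buffers \((?P<pct>[\d.]+)%\).*"
    r"write=(?P<write>[\d.]+) s, sync=(?P<sync>[\d.]+) s, "
    r"total=(?P<total>[\d.]+) s"
)


def checkpoint_residual(line):
    """Return the time not covered by write+sync, plus a rough buffer count."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a checkpoint-complete line")
    write = float(m["write"])
    sync = float(m["sync"])
    total = float(m["total"])
    buffers = int(m["buffers"])
    pct = float(m["pct"])
    return {
        # Time spent in checkpoint steps other than writing and syncing.
        "residual_s": total - write - sync,
        # Buffers written divided by the logged fraction of shared_buffers.
        "approx_shared_buffers": round(buffers / (pct / 100.0)),
    }


if __name__ == "__main__":
    info = checkpoint_residual(LOG_LINE)
    print(f"residual: {info['residual_s']:.3f} s")
    print(f"approx shared buffers: {info['approx_shared_buffers']}")
```

For the line above, the residual comes out at roughly 21 msec, and the estimate lands near 12,800 buffers, which would be about 100 MB if the default 8 kB block size is in use (an assumption; the block size is not stated in the thread).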

As Greg commented upthread, we seem to be getting forced to the
conclusion that the initial buffer scan in BufferSync() is somehow
causing this. There are a couple of things it'd be useful to try
here:

* see how the size of the hiccup varies with shared_buffers;

* try inserting a delay into that scan loop, as per attached
quick-and-dirty patch. (Numbers pulled from the air, but
we can worry about tuning after we see if this is really
where the problem is.)
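
The attached patch is not reproduced in this archive, so the following is only a rough sketch of the general idea of throttling a scan loop, written in Python rather than the backend's C. Everything in it is hypothetical: `scan_with_delay`, `batch_size`, and `delay_s` are illustrative names and values, not anything taken from BufferSync() or from the patch.

```python
import time


def scan_with_delay(dirty_flags, batch_size=1000, delay_s=0.001, sleep=time.sleep):
    """Collect the indexes of dirty entries, pausing after every batch.

    dirty_flags : iterable of booleans standing in for buffer headers
    batch_size  : hypothetical number of entries scanned between pauses
    delay_s     : hypothetical pause length in seconds
    sleep       : injectable sleep function, so the loop can be tested
    """
    to_write = []
    for scanned, dirty in enumerate(dirty_flags, start=1):
        if dirty:
            # Remember this entry so a later pass can write it out.
            to_write.append(scanned - 1)
        if scanned % batch_size == 0:
            # Yield briefly so other work is not starved by the scan.
            sleep(delay_s)
    return to_write


if __name__ == "__main__":
    flags = [i % 4 != 0 for i in range(12800)]
    marked = scan_with_delay(flags)
    print(f"marked {len(marked)} of {len(flags)} entries")
```

If a delay of this kind shrinks the hiccup, that would point at the scan itself; if it does not, the cause is more likely elsewhere.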

regards, tom lane

Attachment: unknown_filename (text/plain, 309 bytes)
