Re: Controlling Load Distributed Checkpoints

From:	Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <gsmith(at)gregsmith(dot)com>, Hannu Krosing <hannu(at)skype(dot)net>, ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, Greg Stark <greg(dot)stark(at)enterprisedb(dot)com>
Subject:	Re: Controlling Load Distributed Checkpoints
Date:	2007-06-07 17:59:28
Message-ID:	46684780.6010903@enterprisedb.com
Lists:	pgsql-hackers pgsql-patches

Tom Lane wrote:
> Heikki Linnakangas <heikki(at)enterprisedb(dot)com> writes:
>> Tom Lane wrote:
>>> I don't think it's a historical artifact at all: it's a valid reflection
>>> of the fact that we don't know enough about disk layout to do low-level
>>> I/O scheduling. Issuing more fsyncs than necessary will do little
>>> except guarantee a less-than-optimal scheduling of the writes.
>
>> I'm not proposing to issue any more fsyncs. I'm proposing to change the
>> ordering so that instead of first writing all dirty buffers and then
>> fsyncing all files, we'd write all buffers belonging to a file, fsync
>> that file only, then write all buffers belonging to next file, fsync,
>> and so forth.
>
> But that means that the I/O to different files cannot be overlapped by
> the kernel, even if it would be more efficient to do so.

True. On the other hand, if we issue writes in essentially random order,
we might fill the kernel buffers with random blocks, and the kernel would
need to flush them to disk as almost random I/O. If we did the writes in
per-file groups, the kernel would have a better chance of coalescing them.
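
To make the two orderings under discussion concrete, here is a minimal
sketch that only plans the sequence of operations; it does no real I/O and
is not taken from the patch. The function names and the (file, block) tuple
representation of a dirty buffer are assumptions made for illustration.

```python
from itertools import groupby


def plan_write_all_then_fsync(dirty):
    """Current ordering: write every dirty buffer, then fsync every file.

    `dirty` is a list of (file_name, block_no) tuples in whatever order
    the buffers happen to be scanned (hypothetical representation).
    """
    ops = [("write", f, b) for f, b in dirty]
    files = []
    for f, _ in dirty:
        if f not in files:          # keep first-seen order, one fsync per file
            files.append(f)
    ops += [("fsync", f) for f in files]
    return ops


def plan_per_file(dirty):
    """Proposed ordering: for each file, write its buffers, then fsync it."""
    ops = []
    ordered = sorted(dirty)         # groups by file, ascending block number
    for f, group in groupby(ordered, key=lambda fb: fb[0]):
        for _, b in group:
            ops.append(("write", f, b))
        ops.append(("fsync", f))    # same fsync count as the current ordering
    return ops


if __name__ == "__main__":
    dirty = [("b", 3), ("a", 7), ("b", 1), ("a", 2)]
    for op in plan_write_all_then_fsync(dirty):
        print("current :", op)
    for op in plan_per_file(dirty):
        print("per-file:", op)
```

Both plans issue exactly one fsync per file; only the interleaving differs.
Sorting blocks within a file is an extra assumption of this sketch, meant to
show the kind of grouping that could help the kernel coalesce writes.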

I tend to agree that if the goal is to finish the checkpoint as quickly
as possible, the current approach is better. In the context of load
distributed checkpoints, however, it's unlikely the kernel can do any
significant overlapping since we're trickling the writes anyway.
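
As a rough illustration of what "trickling" means here, the sketch below
spreads a fixed number of buffer writes evenly over a target duration. The
even spacing, the function name, and its parameters are assumptions for
illustration only; the message does not describe how the patch actually
paces its writes.

```python
def write_schedule(n_writes, duration_s):
    """Return the start offset (in seconds) of each write when n_writes
    are spread evenly across duration_s. Hypothetical pacing rule."""
    if n_writes <= 0:
        return []
    gap = duration_s / n_writes     # idle time between consecutive writes
    return [i * gap for i in range(n_writes)]


if __name__ == "__main__":
    # Four writes spread over two seconds, one every half second.
    print(write_schedule(4, 2.0))
```

With gaps like these between consecutive writes, the kernel rarely has many
of our writes queued at once, which is why little overlapping across files
is expected in the load-distributed case.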

Do we need both strategies?

I'm starting to feel we should give up on smoothing the fsyncs and
distribute the writes only, for 8.3. As we get more experience with that
and its shortcomings, we can enhance our checkpoints further in 8.4.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Re: Controlling Load Distributed Checkpoints at 2007-06-07 17:43:49 from Tom Lane

Responses

Re: Controlling Load Distributed Checkpoints at 2007-06-11 06:27:48 from ITAGAKI Takahiro