Re: sync_file_range()

From:	ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To:	"Qingqing Zhou" <zhouqq(at)cs(dot)toronto(dot)edu>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: sync_file_range()
Date:	2006-06-19 10:33:44
Message-ID:	20060619184910.9EB3.ITAGAKI.TAKAHIRO@oss.ntt.co.jp
Lists:	pgsql-hackers

"Qingqing Zhou" <zhouqq(at)cs(dot)toronto(dot)edu> wrote:

> > I'm interested in it; with it we could improve responsiveness during
> > checkpoints. Though it is a Linux-specific system call, we could use
> > the combination of mmap() and msync() instead; I mean we can use
> > mmap only to flush dirty pages, not to read or write pages.
>
> Can you specify details? As the TODO item indicates, if we mmap a data file,
> a serious problem is that we don't know when the data pages hit the disk --
> so we may violate the WAL rule.

I'm thinking about fuzzy checkpoints, where we write and flush buffers
only as much as we need to. sync_file_range() would then let us control
buffer flushing at a finer granularity. We can stretch out the length of a
checkpoint to avoid overloading the storage in a burst, using
sync_file_range() and a cost-based delay, as vacuum does.
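
The idea of pacing writeback with sync_file_range() and a delay could be sketched roughly as follows. This is only an illustration, not PostgreSQL code: the function name throttled_flush, the chunk size, and the fixed sleep are assumptions made for the example, and a real cost-based delay would presumably account for the work done rather than sleep a constant time. It assumes Linux with glibc. Note that SYNC_FILE_RANGE_WRITE only starts writeback and gives no durability guarantee, so a checkpoint would still need fsync() at the end.

```c
#define _GNU_SOURCE             /* exposes sync_file_range() on Linux/glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: start writeback of [offset, offset + nbytes) in
 * pieces of chunk_bytes, sleeping delay_usec between pieces so that the
 * storage is not hit with one large burst.
 * Returns 0 on success, -1 on error (errno is set by sync_file_range).
 */
int
throttled_flush(int fd, off_t offset, off_t nbytes, off_t chunk_bytes,
                useconds_t delay_usec)
{
    off_t       end = offset + nbytes;

    assert(chunk_bytes > 0);

    while (offset < end)
    {
        off_t       len = end - offset;

        if (len > chunk_bytes)
            len = chunk_bytes;

        /* Initiates writeback of dirty pages; does not wait for completion. */
        if (sync_file_range(fd, offset, len, SYNC_FILE_RANGE_WRITE) != 0)
            return -1;

        offset += len;

        /* Pause between chunks, but not after the last one. */
        if (offset < end && delay_usec > 0)
            usleep(delay_usec);
    }
    return 0;
}
```

A caller might, for example, use throttled_flush(fd, 0, file_size, 1024 * 1024, 10000) to push out a file 1 MB at a time with a 10 ms pause between chunks; both numbers are arbitrary here.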

I did not mean to modify buffers through mmap; I only meant something like
the following pseudo-code. (I don't know whether it actually works...)

my_sync_file_range(fd, offset, nbytes, ...)
{
    void *p = mmap(NULL, nbytes, ..., fd, offset);
    msync(p, nbytes, MS_ASYNC);
    munmap(p, nbytes);
}
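
For reference, a compilable version of the pseudo-code above might look like the sketch below. The arguments elided in the original are filled in with assumptions (PROT_READ and MAP_SHARED, a three-argument signature, rounding the offset down to a page boundary because mmap() requires that), so this is one possible reading rather than the author's exact intent. Whether msync() on such a mapping actually pushes out pages that were dirtied through write() is platform-dependent; on Linux, the msync(2) manual page notes that MS_ASYNC has been a no-op since kernel 2.6.19, so the author's caveat stands.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Sketch of an mmap()/msync() stand-in for sync_file_range().
 * Returns 0 on success, -1 on error with errno set.
 */
int
my_sync_file_range(int fd, off_t offset, size_t nbytes)
{
    long        pagesize = sysconf(_SC_PAGESIZE);
    off_t       aligned;
    size_t      maplen;
    void       *p;
    int         rc;

    assert(pagesize > 0);

    if (offset < 0 || nbytes == 0)
    {
        errno = EINVAL;
        return -1;
    }

    /* mmap() needs a page-aligned file offset, so round down. */
    aligned = offset - (offset % pagesize);
    maplen = nbytes + (size_t) (offset - aligned);

    /* The mapping is never read or written; it only gives msync() a target. */
    p = mmap(NULL, maplen, PROT_READ, MAP_SHARED, fd, aligned);
    if (p == MAP_FAILED)
        return -1;

    /* Request asynchronous writeback of the mapped range. */
    rc = msync(p, maplen, MS_ASYNC);

    if (munmap(p, maplen) != 0)
        rc = -1;
    return rc;
}
```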

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

In response to

Re: sync_file_range() at 2006-06-19 07:32:40 from Qingqing Zhou