Re: sync_file_range()

From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: "Qingqing Zhou" <zhouqq(at)cs(dot)toronto(dot)edu>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: sync_file_range()
Date: 2006-06-19 10:33:44
Message-ID: 20060619184910.9EB3.ITAGAKI.TAKAHIRO@oss.ntt.co.jp
Lists: pgsql-hackers

"Qingqing Zhou" <zhouqq(at)cs(dot)toronto(dot)edu> wrote:

> > I'm interested in it; with it we could improve responsiveness during
> > checkpoints. Although it is a Linux-specific system call, we could use
> > the combination of mmap() and msync() instead; I mean we can use
> > mmap only to flush dirty pages, not to read or write pages.
>
> Can you give details? As the TODO item indicates, if we mmap a data file, a
> serious problem is that we don't know when the data pages hit the disk,
> so we may violate the WAL rule.

I'm thinking about fuzzy checkpoints, where we write and flush buffers
only as needed. sync_file_range() would then let us control the flushing
of buffers at a finer granularity. We could stretch out a checkpoint to
avoid overloading the storage with a burst of writes, using
sync_file_range() and a cost-based delay, as vacuum does.
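
A minimal sketch of the idea above, assuming Linux with glibc. The helper
name throttled_flush() and the chunk and delay_us parameters are
hypothetical and are not PostgreSQL code. It starts writeback for a file
range piece by piece and sleeps between pieces, which is a crude stand-in
for a cost-based delay.

```c
#define _GNU_SOURCE             /* exposes sync_file_range() in <fcntl.h> */
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: initiate writeback for [offset, offset + nbytes)
 * in pieces of at most "chunk" bytes, sleeping delay_us microseconds
 * between pieces so the storage is not hit by one large burst.
 * Returns 0 on success, -1 on error (errno is set by sync_file_range).
 */
static int
throttled_flush(int fd, off_t offset, off_t nbytes, off_t chunk,
                useconds_t delay_us)
{
    assert(chunk > 0 && offset >= 0 && nbytes >= 0);

    off_t end = offset + nbytes;

    while (offset < end)
    {
        off_t len = (end - offset < chunk) ? end - offset : chunk;

        /* SYNC_FILE_RANGE_WRITE only starts writeback; it does not wait. */
        if (sync_file_range(fd, offset, len, SYNC_FILE_RANGE_WRITE) != 0)
            return -1;

        offset += len;

        /* Pause between pieces, but not after the last one. */
        if (offset < end && delay_us > 0)
            usleep(delay_us);
    }
    return 0;
}
```

Note that sync_file_range() does not write file metadata and is not a
durability guarantee by itself, so a checkpoint would presumably still need
a final fsync() before it can be considered complete.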

I did not mean to modify buffers through mmap; I only meant something like
the following pseudo-code. (I don't know whether it actually works...)

my_sync_file_range(fd, offset, nbytes, ...)
{
    void *p = mmap(NULL, nbytes, ..., fd, offset);
    msync(p, nbytes, MS_ASYNC);
    munmap(p, nbytes);
}

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
