From: | ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> |
---|---|
To: | "Qingqing Zhou" <zhouqq(at)cs(dot)toronto(dot)edu> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: sync_file_range() |
Date: | 2006-06-19 10:33:44 |
Message-ID: | 20060619184910.9EB3.ITAGAKI.TAKAHIRO@oss.ntt.co.jp |
Lists: | pgsql-hackers |
"Qingqing Zhou" <zhouqq(at)cs(dot)toronto(dot)edu> wrote:
> > I'm interested in it; with it we could improve responsiveness during
> > checkpoints. Although it is a Linux-specific system call, we could use
> > the combination of mmap() and msync() instead; I mean we can use
> > mmap only to flush dirty pages, not to read or write pages.
>
> Can you specify details? As the TODO item indicates, if we mmap the data file, a
> serious problem is that we don't know when the data pages hit the disk,
> so we may violate the WAL rule.
I'm thinking about fuzzy checkpoints, where we write and flush buffers
gradually, as needed. sync_file_range() would then let us control the
flushing of buffers at a finer granularity. We could stretch out the length
of a checkpoint to avoid overloading the storage with a burst of writes,
using sync_file_range() and a cost-based delay, as vacuum does.
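The pacing idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not PostgreSQL code: paced_flush(), chunk_bytes and delay_us are hypothetical names, the fixed sleep stands in for a real cost-based delay, and the sketch is Linux-only because it calls sync_file_range(). SYNC_FILE_RANGE_WRITE only starts writeback for the given range, so a final fsync() would still be needed before the checkpoint could be considered durable.

```c
#define _GNU_SOURCE             /* sync_file_range() is a Linux extension */
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel to start writeback for the first
 * file_len bytes of fd, one chunk at a time, sleeping delay_us
 * microseconds between chunks so the storage is not hit by one burst.
 * Returns 0 on success, -1 on error with errno set.
 */
int
paced_flush(int fd, off_t file_len, off_t chunk_bytes, useconds_t delay_us)
{
    off_t off;

    if (file_len < 0 || chunk_bytes <= 0)
    {
        errno = EINVAL;
        return -1;
    }

    for (off = 0; off < file_len; off += chunk_bytes)
    {
        off_t remaining = file_len - off;
        off_t n = remaining < chunk_bytes ? remaining : chunk_bytes;

        /* Initiate writeback of dirty pages in [off, off + n). */
        if (sync_file_range(fd, off, n, SYNC_FILE_RANGE_WRITE) != 0)
            return -1;

        /* Stand-in for a cost-based delay; skipped after the last chunk. */
        if (delay_us > 0 && off + n < file_len)
            usleep(delay_us);
    }
    return 0;
}
```

A caller might use something like paced_flush(fd, file_len, 1024 * 1024, 10000) for each data file and then call fsync(fd) at the end; the chunk size and delay here are arbitrary example values.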
I did not intend to modify buffers through mmap; I only meant something like
the following pseudo-code. (I don't know whether it actually works...)
my_sync_file_range(fd, offset, nbytes, ...)
{
    void *p = mmap(NULL, nbytes, ..., fd, offset);
    msync(p, nbytes, MS_ASYNC);
    munmap(p, nbytes);
}
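For reference, a compilable version of the pseudo-code might look like the sketch below. The arguments elided above are filled in with assumptions: a read-only shared mapping (PROT_READ, MAP_SHARED), which requires fd to be open for reading, and an offset rounded down to a page boundary because mmap() requires a page-aligned file offset. Whether MS_ASYNC actually schedules any writeback depends on the platform and kernel version, and on some systems it may do nothing, so this remains untested as a substitute for sync_file_range().

```c
#include <errno.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Sketch of the mmap()/msync() emulation. The mapping is used only to
 * name a file range for msync(); no data is read or written through it.
 * Returns 0 on success, -1 on error with errno set.
 */
int
my_sync_file_range(int fd, off_t offset, size_t nbytes)
{
    long   pagesize = sysconf(_SC_PAGESIZE);
    off_t  aligned;
    size_t len;
    void  *p;
    int    rc;
    int    saved_errno;

    if (pagesize <= 0 || offset < 0 || nbytes == 0)
    {
        errno = EINVAL;
        return -1;
    }

    /* Round the start down to a page boundary and widen the length. */
    aligned = offset - (offset % pagesize);
    len = nbytes + (size_t) (offset - aligned);

    p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, aligned);
    if (p == MAP_FAILED)
        return -1;

    /* Request an asynchronous flush of the mapped range. */
    rc = msync(p, len, MS_ASYNC);
    saved_errno = errno;

    munmap(p, len);
    errno = saved_errno;
    return rc;
}
```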
Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center