From: | ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> |
---|---|
To: | Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com> |
Cc: | Takayuki Tsunakawa <tunakawa(at)soft(dot)fujitsu(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Load distributed checkpoint |
Date: | 2006-12-08 04:40:38 |
Message-ID: | 20061208131001.6655.ITAGAKI.TAKAHIRO@oss.ntt.co.jp |
Lists: | pgsql-hackers pgsql-patches |
Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com> wrote:
> >> 1. Query information (REDO pointer, next XID etc.)
> >> 2. Write dirty pages in buffer pool
> >> 3. Flush all modified files
> >> 4. Update control file
> >
> > Hmm. Isn't it possible that step 3 affects the performance greatly?
> > I'm sorry if you have already identified step 2 as disturbing
> > backends.
>
> It seems to me that virtual memory settings of the OS will determine
> if step 2 or step 3 causes much of the actual disk I/O.
>
> If the dirty_expire_centisecs number is low, most write()s
> from step 2 would happen before step 3 because of the pdflush daemons.

Exactly. It depends on the OS, kernel settings, and filesystem.
I tested the patch on Linux kernel 2.6.9-39 with default settings and ext3.
The pdflush daemons were probably aggressive enough to write out the dirty
buffers in the kernel, so step 2 accounted for most of the I/O and step 3
did not.
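
For anyone comparing results across machines, here is a minimal sketch for checking the relevant kernel writeback settings on Linux. It assumes the settings are exposed under /proc/sys/vm; the helper name `read_vm_setting` is made up for illustration and is not part of the patch.

```python
import os


def read_vm_setting(name, base="/proc/sys/vm"):
    """Return the integer value of a Linux VM setting, or None if unavailable.

    `base` is a parameter only so the helper can be pointed at another
    directory; on non-Linux systems the file will simply be missing.
    """
    try:
        with open(os.path.join(base, name)) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    # dirty_expire_centisecs: age at which dirty data becomes eligible
    # for writeback; dirty_writeback_centisecs: how often the flusher wakes.
    for name in ("dirty_expire_centisecs", "dirty_writeback_centisecs"):
        print(name, read_vm_setting(name))
```

A low dirty_expire_centisecs value would suggest that most of the disk I/O happens during step 2 rather than at the fsync in step 3.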
There are technical issues with distributing step 3. We can write buffers
one page at a time, which is granular enough. However, fsync() works on a
per-file basis (1GB segments), so we can control the granularity of fsync
only coarsely. sync_file_range (http://lwn.net/Articles/178199/) or some
other special API would help, but there are portability issues...
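
To illustrate the granularity mismatch, here is a rough sketch, not the patch itself. It assumes 8kB pages (the PostgreSQL default block size) and 1GB segment files, and the function names are hypothetical. Writes can be issued page by page, but the only portable flush call covers the whole file.

```python
import os

PAGE_SIZE = 8192          # assumed default block size
SEGMENT_SIZE = 1 << 30    # 1GB segment file


def pages_per_segment(page_size=PAGE_SIZE, segment_size=SEGMENT_SIZE):
    """Number of pages that a single fsync() on one segment may cover."""
    return segment_size // page_size


def write_pages_then_fsync(path, page_numbers, page_size=PAGE_SIZE):
    """Write the given pages one at a time, then flush the file once.

    Returns the number of bytes handed to the kernel by the writes.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        written = 0
        for n in page_numbers:
            # Step 2: page-granular write; it can be paced or spread out.
            written += os.pwrite(fd, b"\0" * page_size, n * page_size)
        # Step 3: one call for the whole file; it cannot be split by
        # page range without something like sync_file_range.
        os.fsync(fd)
        return written
    finally:
        os.close(fd)
```

With these assumptions, one fsync() may have to flush up to 131072 pages at once, which is why step 3 is hard to spread over time.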
Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center