From: | Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com> |
---|---|
To: | Takayuki Tsunakawa <tunakawa(at)soft(dot)fujitsu(dot)com> |
Subject: | Re: Load distributed checkpoint |
Date: | 2006-12-07 19:02:33 |
Message-ID: | 45786549.2000602@cheapcomplexdevices.com |
Lists: | pgsql-hackers pgsql-patches |
Takayuki Tsunakawa wrote:
> Hello, Itagaki-san
>> Checkpoint consists of the following four steps, and the major
>> performance
>> problem is 2nd step. All dirty buffers are written without interval
>> in it.
>> 1. Query information (REDO pointer, next XID etc.)
>> 2. Write dirty pages in buffer pool
>> 3. Flush all modified files
>> 4. Update control file
>
> Hmm. Isn't it possible that step 3 affects the performance greatly?
> I'm sorry if you have already identified step 2 as disturbing
> backends.
>
> As you know, PostgreSQL does not transfer the data to disk when
> write()ing. Actual transfer occurs when fsync()ing at checkpoints,
> unless the filesystem cache runs short. So, disk is overworked at
> fsync()s.

It seems to me that the virtual memory settings of the OS will determine
whether step 2 or step 3 causes most of the actual disk I/O.

In particular, on Linux, settings such as /proc/sys/vm/dirty_expire_centisecs,
dirty_writeback_centisecs, and possibly dirty_background_ratio
would affect this. If those numbers are high, ISTM most of the data
write()n in step 2 would not reach disk until the flush in step 3. If I
understand correctly, if dirty_expire_centisecs is low, most of the data
from step 2 would reach disk before step 3 because of the pdflush daemons.
I expect other OSes have different but similar knobs to tune this.
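
To see which regime a given Linux box is in, the three knobs named above can simply be read from /proc. The following is a minimal sketch, not anything from PostgreSQL itself; the function name `read_vm_knobs` and the `base` parameter are made up for illustration, and on non-Linux systems (or kernels lacking a knob) the missing entries are silently skipped.

```python
import os

# The Linux writeback settings mentioned in the message above.
VM_KNOBS = (
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
    "dirty_background_ratio",
)


def read_vm_knobs(base="/proc/sys/vm"):
    """Return {knob_name: int_value} for every knob readable under `base`.

    `base` is a parameter only so the function can be pointed at another
    directory for testing; it is a hypothetical convenience.
    """
    values = {}
    for name in VM_KNOBS:
        try:
            with open(os.path.join(base, name)) as f:
                values[name] = int(f.read().strip())
        except (OSError, ValueError):
            # Knob missing (non-Linux, different kernel) or unparsable: skip it.
            continue
    return values


if __name__ == "__main__":
    for name, value in read_vm_knobs().items():
        print(f"{name} = {value}")
```

The two *_centisecs values are in hundredths of a second, and dirty_background_ratio is a percentage of memory.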

It seems to me that the most portable way PostgreSQL could force
the I/O to be balanced would be to insert otherwise unnecessary
fsync()s into step 2, but it might (not sure why) be better
to handle this through OS-specific tuning outside of postgres.
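
As a rough illustration of the "extra fsync()s in step 2" idea, here is a minimal sketch that writes pages to a single file descriptor and calls fsync() after every few pages, so the kernel cannot defer all of the I/O to one final flush. This is not how PostgreSQL's checkpoint code is structured (it writes to many files through its own buffer manager); the function name `write_pages_spread` and the `fsync_every` parameter are hypothetical.

```python
import os


def write_pages_spread(fd, pages, fsync_every=16):
    """Write each page to `fd`, calling fsync() after every `fsync_every` pages.

    Returns the number of fsync() calls issued. `fsync_every` is an
    illustrative knob, not an existing PostgreSQL setting.
    """
    if fsync_every < 1:
        raise ValueError("fsync_every must be >= 1")
    syncs = 0
    pending = 0
    for page in pages:
        # os.write() may write fewer bytes than requested; loop until done.
        view = memoryview(page)
        while view:
            written = os.write(fd, view)
            view = view[written:]
        pending += 1
        if pending == fsync_every:
            os.fsync(fd)  # force this batch to disk now (extra work in step 2)
            syncs += 1
            pending = 0
    if pending:
        os.fsync(fd)  # final flush of the remainder, the usual step 3
        syncs += 1
    return syncs
```

A smaller `fsync_every` spreads the disk work more evenly at the cost of more fsync() calls; a very large value degenerates to the single flush at the end.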