| From: | Jan Wieck <JanWieck(at)Yahoo(dot)com> | 
|---|---|
| To: | Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> | 
| Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Zeugswetter Andreas SB SD <ZeugswetterA(at)spardat(dot)at>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, PostgreSQL Win32 port list <pgsql-hackers-win32(at)postgresql(dot)org> | 
| Subject: | Re: [HACKERS] Sync vs. fsync during checkpoint | 
| Date: | 2004-02-09 14:33:09 | 
| Message-ID: | 40279A25.6020600@Yahoo.com | 
| Lists: | pgsql-hackers pgsql-hackers-win32 | 
Bruce Momjian wrote:
> Jan Wieck wrote:
>> Tom Lane wrote:
>> 
>> > "Zeugswetter Andreas SB SD" <ZeugswetterA(at)spardat(dot)at> writes:
>> >> So Imho the target should be to have not much IO open for the checkpoint, 
>> >> so the fsync is fast enough, even if serial.
>> > 
>> > The best we can do is push out dirty pages with write() via the bgwriter
>> > and hope that the kernel will see fit to write them before checkpoint
>> > time arrives.  I am not sure if that hope has basis in fact or if it's
>> > just wishful thinking.  Most likely, if it does have basis in fact it's
>> > because there is a standard syncer daemon forcing a sync() every thirty
>> > seconds.
>> 
>> Looking at the response time charts I did for showing how vacuum delay 
>> is doing, it seems at least on Linux there is hope that that is the 
>> case. Those charts have just a regular 5 minute checkpoint with enough 
>> checkpoint segments for that, and no other sync effort done at all.
>> 
>> The system has a hard time to handle a larger scaled test DB, so it is 
>> definitely well saturated with IO. The charts are here:
>> 
>>      http://developer.postgresql.org/~wieck/vacuum_cost/
>> 
>> > 
>> > That means that instead of an I/O storm every checkpoint interval,
>> > we get a smaller I/O storm every 30 seconds.  Not sure this is a big
>> > improvement.  Jan already found out that issuing very frequent sync()s
>> > isn't a win.
>> 
>> In none of those charts I can see any checkpoint caused IO storm any 
>> more. Charts I'm currently doing for 7.4.1 show extremely clear spikes 
>> at checkpoints. If someone is interested in those as well I will put 
>> them up.
> 
> So, Jan, are you basically saying that the background writer has solved
> the checkpoint I/O flood problem, and we just need to deal with changing
> sync to multiple fsync's at checkpoint?

ISTM that the background writer at least has the ability to lower the 
impact of a checkpoint significantly enough that one might not care 
about it any more. "Has the ability" means that it needs to be adjusted 
to the actual DB usage. The charts I produced were not done with the 
default settings, but rather after making the bgwriter a bit more 
aggressive against dirty pages.

The whole sync() vs. fsync() discussion is in my opinion nonsense at 
this point. Without the ability to limit the number of files to 
something reasonable, by employing tablespaces in the form of larger 
container files, the risk of forcing excessive head movement is simply 
too high.

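
To make the two options under discussion concrete, here is a minimal sketch in C. It is not PostgreSQL code: fsync_each() and sync_everything() are hypothetical names made up for illustration, and the sketch assumes that an fsync() on a freshly opened descriptor also flushes pages dirtied through other descriptors of the same file, which holds on common Unix systems but is worth checking per platform.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: flush every file in paths[] with its own fsync().
 * Returns the number of files flushed, or -1 on the first error
 * (errno is left set by the failing call).
 */
int fsync_each(const char *paths[], int npaths)
{
    int synced = 0;

    assert(npaths >= 0);
    for (int i = 0; i < npaths; i++) {
        /* Open read-write; some platforms refuse fsync() on read-only fds. */
        int fd = open(paths[i], O_RDWR);
        if (fd < 0)
            return -1;

        /* One synchronous flush per file, issued in list order. */
        if (fsync(fd) != 0) {
            int saved = errno;
            close(fd);
            errno = saved;
            return -1;
        }
        close(fd);
        synced++;
    }
    return synced;
}

/*
 * The alternative: a single sync(), which asks the kernel to write out
 * all dirty buffers system-wide, not only those of the database files.
 */
void sync_everything(void)
{
    sync();
}
```

The per-file loop issues one flush for every file, in whatever order the list happens to have. With thousands of small table and index files that is where the worry about head movement comes from; fewer, larger container files would shrink the list.
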
Jan
-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck(at)Yahoo(dot)com #