Re: sorted writes for checkpoints

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: sorted writes for checkpoints
Date: 2010-11-08 23:27:35
Message-ID: AANLkTikhbk3jdk1DNTwsVSQn03+KiTZaVQPUo4D02BoK@mail.gmail.com
Lists: pgsql-hackers

On Sun, Nov 7, 2010 at 4:13 PM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
> Jeff Janes wrote:
>
>> Assuming the ordering is useful, the only way the OS can do as good a
>> job as the checkpoint code can, is if the OS stores the entire
>> checkpoint worth of data as dirty blocks and doesn't start writing
>> until an fsync comes in.  This strikes me as a pathologically
>> configured OS/FS.  (And would explain problems with fsyncs)
>>
>
> This can be exactly the situation with ext3 on Linux, which I believe is one
> reason the write sorting patch didn't go anywhere last time it came
> up--that's certainly what I tested it on.

Interesting. I think the default mount option for ext3 is to do a
journal sync at least every 5 seconds, which should also flush out
dirty OS buffers, preventing them from building up to such an extent.
Is that default being changed here, or does it simply not work the way
I think it does?
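
One way to check whether that default has been changed on a given box is to look at the mount options. Below is a minimal sketch, assuming the options appear as the fourth field of /proc/mounts-style text and that a missing commit= means the documented 5-second default applies. The function names and the sample mount table are made up for illustration. It only reports the configured interval; it says nothing about whether a journal commit actually flushes all dirty data buffers, which is the open question here.

```python
EXT3_DEFAULT_COMMIT_SECS = 5  # documented default when no commit= option is given


def commit_interval(options: str) -> int:
    """Journal commit interval, in seconds, implied by a mount option string."""
    for opt in options.split(","):
        key, sep, value = opt.strip().partition("=")
        if key == "commit" and sep:
            secs = int(value)
            # The kernel ext3 docs describe commit=0 as equivalent to the default.
            return secs if secs > 0 else EXT3_DEFAULT_COMMIT_SECS
    return EXT3_DEFAULT_COMMIT_SECS


def ext3_commit_intervals(mounts_text: str) -> dict:
    """Map each ext3 mount point in /proc/mounts-style text to its interval."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[2] == "ext3":
            result[fields[1]] = commit_interval(fields[3])
    return result


# Hypothetical sample; on a real system, read /proc/mounts instead.
SAMPLE = """\
/dev/sda1 / ext3 rw,relatime,errors=continue,data=ordered 0 0
/dev/sdb1 /pgdata ext3 rw,noatime,commit=30,data=ordered 0 0
tmpfs /tmp tmpfs rw 0 0
"""

print(ext3_commit_intervals(SAMPLE))
```

On a live system the same function could be fed the contents of /proc/mounts; a mount point reporting something other than 5 would mean the default has been overridden.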

Thanks for the link to the slides.

Cheers,

Jeff
