From: | Heikki Linnakangas <heikki(at)enterprisedb(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Bruce Momjian <bruce(at)momjian(dot)us>, "Jim C(dot) Nasby" <jim(at)nasby(dot)net>, ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Load distributed checkpoint |
Date: | 2006-12-28 21:28:48 |
Message-ID: | 45943710.1060305@enterprisedb.com |
Lists: | pgsql-hackers pgsql-patches |
Tom Lane wrote:
> To my mind the problem with fsync is not that it gives us too little
> control but that it gives too much: we have to specify a particular
> order of writing out files. What we'd really like is a version of
> sync(2) that tells us when it's done but doesn't constrain the I/O
> scheduler's choices at all. Unfortunately there's no such API ...

The problem I see with fsync is that it causes an immediate I/O storm as
the OS tries to flush everything out as quickly as possible. But we're
not in a hurry. What we'd need is a lazy fsync that would tell the
operating system "let me know when all these dirty buffers are written
to disk, but I'm not in a hurry, take your time". It wouldn't change the
scheduling of the writes, just inform the caller when they're done.

If we wanted more precise control of the flushing, we could use
sync_file_range() on Linux, but that's not portable. Nevertheless, I think
it would be OK to have an #ifdef and use it on platforms that support
it, if it gave a benefit.
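
A minimal sketch of what such an #ifdef could look like. The helper name
start_writeback() is made up for illustration and is not PostgreSQL code.
It only asks the kernel to begin writing back a range; it does not wait and
does not flush file metadata, so a real fsync() is still assumed to follow
later for durability.

```c
#define _GNU_SOURCE             /* exposes sync_file_range() in glibc */
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: hint the OS to start writeback of the dirty pages
 * in [offset, offset + nbytes) of fd, without waiting for completion.
 * Returns 0 on success or when the hint is unavailable, -1 on other errors.
 */
int
start_writeback(int fd, off_t offset, off_t nbytes)
{
#if defined(__linux__) && defined(SYNC_FILE_RANGE_WRITE)
    if (sync_file_range(fd, offset, nbytes, SYNC_FILE_RANGE_WRITE) == 0)
        return 0;
    /* The call is only a hint; treat "not implemented" like other platforms. */
    return (errno == ENOSYS) ? 0 : -1;
#else
    /* No portable equivalent: do nothing and rely on the later fsync(). */
    (void) fd;
    (void) offset;
    (void) nbytes;
    return 0;
#endif
}
```

A caller would issue start_writeback() after writing a batch of buffers and
still call fsync() on the file before declaring the checkpoint complete.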

As a side note, with full_page_writes on, a checkpoint wouldn't actually
need to fsync those pages that have been written to WAL after the
checkpoint started. Doesn't make much difference in most cases, but we
could take that into account if we start taking more control of the
flushing.
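
To make the side note concrete, here is a rough sketch of the test it
implies. The names (Lsn, checkpoint_must_flush_page, redo_lsn) are
hypothetical and simplified, not actual PostgreSQL identifiers. The
assumption is that a page whose LSN is past the checkpoint's redo point was
modified after the checkpoint started, so its full image is already in WAL
and recovery can restore it regardless of what is on disk.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t Lsn;           /* simplified stand-in for a WAL position */

/*
 * Hypothetical predicate: does this checkpoint need the page to be
 * durably on disk before it can complete?
 */
bool
checkpoint_must_flush_page(Lsn page_lsn, Lsn redo_lsn, bool full_page_writes)
{
    /*
     * With full_page_writes on, the first change to a page after the
     * redo point logs the whole page, so replay from redo_lsn rebuilds it.
     */
    if (full_page_writes && page_lsn > redo_lsn)
        return false;
    return true;
}
```

Note that fsync() works on whole files rather than single pages, so this
would only help if the flushing were tracked at a finer granularity.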
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com