From: | Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> |
---|---|
To: | Ragnar Kjørstad <postgres(at)ragnark(dot)vestdata(dot)no> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gaetano Mendola <mendola(at)bigfoot(dot)com>, pgsql-admin(at)postgresql(dot)org |
Subject: | Re: fsync or fdatasync |
Date: | 2002-09-10 21:07:30 |
Message-ID: | 200209102107.g8AL7UN23077@candle.pha.pa.us |
Lists: | pgsql-admin |
Ragnar Kjørstad wrote:
> > open_datasync is the first choice if available.
>
> I assume open_datasync means open with O_SYNC flag..
Yes.
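
To make the naming concrete, here is a minimal sketch of how the sync method names could map onto open() flags. One caveat to the exchange above: PostgreSQL's documentation for wal_sync_method describes open_datasync as open() with O_DSYNC and open_sync as open() with O_SYNC, and on some platforms the two flags have historically had the same value. The helper name wal_open_sync_flag is hypothetical and is not taken from the PostgreSQL source.

```c
#include <fcntl.h>
#include <string.h>

/*
 * Hypothetical helper (not PostgreSQL code): return the extra open() flag
 * implied by a sync method name.  Returns 0 for methods that rely on a
 * separate fsync()/fdatasync() call, and -1 if the name is unknown or the
 * flag is not available on this platform.
 */
int wal_open_sync_flag(const char *method)
{
    if (strcmp(method, "open_datasync") == 0) {
#ifdef O_DSYNC
        return O_DSYNC;          /* each write() waits for the data to reach disk */
#else
        return -1;               /* platform has no O_DSYNC */
#endif
    }
    if (strcmp(method, "open_sync") == 0)
        return O_SYNC;           /* each write() waits for data and metadata */
    if (strcmp(method, "fsync") == 0 || strcmp(method, "fdatasync") == 0)
        return 0;                /* plain open(); the caller syncs explicitly */
    return -1;
}
```

A caller would OR the returned value into its usual open() flags, for example O_WRONLY | wal_open_sync_flag("open_sync"), after checking for -1.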
> > > Why? That will slow things down...
> >
> > On what evidence do you assert that?
> >
> > In theory open_datasync can be the fastest alternative for WAL writing,
> > because it should cause the kernel to force each WAL write() request
> > down to disk immediately. fdatasync will result in the same amount of
> > I/O, but it will also require the kernel to scan its disk cache to see
> > if there are any other dirty blocks that need to be written. On many
> > kernels this check is not very efficient and can chew substantial
> > amounts of CPU time.
>
> Yes, I see your argument.
> However, I've just checked the Linux implementation of fsync() and I
> can't really see how it could chew substantial amounts of CPU time. The
> way it works, every inode has a list of dirty data buffers; all it does
> is traverse that list and do a write on each.
Remember we support >15 platforms, and I know there is at least one
(HP-UX?) which does the fsync/fdatasync block finding inefficiently. It
may have even been old Linux; I cannot remember.
> Anyway - I'm sure this is not enough to convince you, so I'll have to
> set up a test instead. But not tonight.
Again, that is a test case for only one OS. It is helpful if we are
going to start doing per-OS defaults, which is something we have talked
about. What would be great is a test program we can run on different
OS's to find out which is more efficient.
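
A rough sketch of what such a test program might look like follows. It is only an illustration under stated assumptions: a POSIX system that provides fdatasync() and clock_gettime(), an 8192-byte block size, and hypothetical names (time_sync_method, enum sync_method, SYNC_BENCH_MAIN) that do not come from PostgreSQL. It times a number of "commits", each writing a few blocks and then making them durable with the chosen method.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define BLOCK_SIZE 8192          /* assumed WAL block size */

enum sync_method { M_FSYNC, M_FDATASYNC, M_OPEN_SYNC, M_OPEN_DATASYNC };

/*
 * Perform `commits` commits of `blocks_per_commit` blocks each and return the
 * elapsed wall-clock seconds, or -1.0 on any error.  Every commit rewrites
 * the same file region, so only the first commit extends the file.
 */
double time_sync_method(const char *path, enum sync_method m,
                        int commits, int blocks_per_commit)
{
    char buf[BLOCK_SIZE];
    int flags = O_WRONLY | O_CREAT | O_TRUNC;
    struct timespec t0, t1;
    int fd, i, j, ok = 1;

    if (m == M_OPEN_SYNC)
        flags |= O_SYNC;
    if (m == M_OPEN_DATASYNC) {
#ifdef O_DSYNC
        flags |= O_DSYNC;
#else
        return -1.0;             /* method not available on this platform */
#endif
    }

    memset(buf, 'x', sizeof buf);
    fd = open(path, flags, 0600);
    if (fd < 0)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < commits && ok; i++) {
        for (j = 0; j < blocks_per_commit; j++)
            if (pwrite(fd, buf, sizeof buf, (off_t) j * BLOCK_SIZE)
                != (ssize_t) sizeof buf)
                ok = 0;
        /* The open_* methods already synced inside each write call. */
        if (m == M_FSYNC && fsync(fd) != 0)
            ok = 0;
        if (m == M_FDATASYNC && fdatasync(fd) != 0)
            ok = 0;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    close(fd);
    unlink(path);
    if (!ok)
        return -1.0;
    return (double) (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

#ifdef SYNC_BENCH_MAIN           /* compile with -DSYNC_BENCH_MAIN to run */
int main(void)
{
    static const char *names[] =
        { "fsync", "fdatasync", "open_sync", "open_datasync" };
    int m;

    for (m = M_FSYNC; m <= M_OPEN_DATASYNC; m++)
        printf("%-14s %.3f s\n", names[m],
               time_sync_method("sync_bench.tmp", (enum sync_method) m, 200, 1));
    return 0;
}
#endif
```

Setting blocks_per_commit to 1 models a commit that flushes a single WAL block; larger values model multi-block commits. Results from a sketch like this depend heavily on the filesystem, the drive's write cache, and whether the file shares a disk with other activity, so they are only meaningful per machine.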
>
>
> > The tradeoff is that open_datasync syncs each WAL
> > block individually, which is unnecessary if you are committing
> > multiple blocks worth of WAL entries at once --- but there's no hard
> > evidence that that slows things down, especially not when the WAL logs
> > are on their own disk spindle.
>
> Well, in theory fsync() will allow the disk to reorder the writes, and
> that should give significantly better performance, because it will
> reduce the required number of seeks. If the WAL is on a separate spindle
> there will be very few seeks in the first place, so there is less to gain,
> but for the case with the WAL on the same disk as something else there
> is probably some gain. But it makes sense to optimize for the
> WAL-on-separate-disk case...
Remember, in most cases, we are fsync'ing only one block so there is no
_gathering_ to do.
--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073