From: | Giles Lean <giles(at)nemeton(dot)com(dot)au> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Kevin Brown <kevin(at)sysexperts(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: sync() |
Date: | 2003-01-13 08:31:08 |
Message-ID: | 4993.1042446668@nemeton.com.au |
Lists: | pgsql-hackers |
Tom Lane writes:
> Right. "Portably" was the key word in my comment (sorry for not
> emphasizing this more clearly). The real problem here is how to know
> what is the actual behavior of each platform? I'm certainly not
> prepared to trust reading-between-the-lines-of-some-man-pages. And I
> can't think of a simple yet reliable direct test.
Is the "Single UNIX Specification, version 2" (aka UNIX98) any better?
It says for fsync():
"The fsync() function forces all currently queued I/O operations
associated with the file indicated by file descriptor fildes to
the synchronised I/O completion state. All I/O operations are
completed as defined for synchronised I/O file integrity
completion."
This to me clearly says that all pending changes to the file must be
written, not just changes made via this file descriptor.
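To make that reading concrete, here is a minimal sketch of the call pattern in question: data is written through one descriptor, and fsync() is then called through a second, separately opened descriptor. The helper name and the path used are hypothetical, and the sketch only shows the calls; it cannot prove that the data reached the disk, which is exactly the part that is hard to test portably.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write `data` through one descriptor, close it,
 * then call fsync() through a second, separately opened descriptor.
 * Returns 0 on success, -1 on failure. */
int write_then_fsync_other_fd(const char *path, const char *data)
{
    size_t len = strlen(data);
    int rc = 0;

    int wfd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (wfd < 0)
        return -1;
    if (write(wfd, data, len) != (ssize_t) len)
        rc = -1;                /* short or failed write */
    if (close(wfd) != 0)
        rc = -1;
    if (rc != 0)
        return -1;

    /* Assumption: O_RDWR is used rather than O_RDONLY because some
     * platforms are reported to reject fsync() on a descriptor that
     * is not open for writing. */
    int sfd = open(path, O_RDWR);
    if (sfd < 0)
        return -1;
    if (fsync(sfd) != 0)
        rc = -1;
    if (close(sfd) != 0)
        rc = -1;
    return rc;
}
```

Under the UNIX98 wording quoted above, a successful return from the second descriptor's fsync() should cover the earlier write as well, but whether a given platform honours that is the open question.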
I did have to test this behaviour once (for a customer, strange
situation) but I couldn't find a portable way to do it, either.
What I did was read the appropriate disk block from the raw device to
bypass the buffer cache. As this required low level knowledge of the
on-disk filesystem layout it was not very portable. For anyone
interested, Tom Christiansen's "icat" program can be ported to
UFS-derived filesystems fairly easily:
http://www.rosat.mpe-garching.mpg.de/mailing-lists/perl5-porters/1997-04/msg00487.html
> AFAIK, all Unix implementations are paranoid about consistency of
> filesystem metadata, including directory contents. So fsync'ing
> directories from a user process strikes me as a waste of time, ...
There is one variant where this is not the case: Linux using ext2fs
and possibly other filesystems.
There was a flame fest of great entertainment value a few years ago
between Linus Torvalds and Dan Bernstein. Of course, neither was able
to influence the opinion of the other to any noticeable degree, but it
made fun reading. I think this might be a starting point:
http://www.ornl.gov/cts/archives/mailing-lists/qmail/1998/05/msg00667.html
A more recent posting from Linus where he continues to recommend
fsync() is this:
http://www.cs.helsinki.fi/linux/linux-kernel/2001-29/0659.html
I've not heard that any other Unix-like OS has abandoned the
traditional and POSIX semantics.
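For reference, fsync'ing a directory from a user process looks roughly like the sketch below. The function name is hypothetical, and the code assumes the platform allows a directory to be opened read-only and passed to fsync(); that works on Linux, but it is not guaranteed elsewhere, which is part of the portability worry.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel to flush a directory's entries.
 * Returns 0 on success, -1 on failure. */
int fsync_dir(const char *dirpath)
{
    /* Directories cannot be opened for writing, so open read-only. */
    int dfd = open(dirpath, O_RDONLY);
    if (dfd < 0)
        return -1;

    int rc = (fsync(dfd) == 0) ? 0 : -1;
    if (close(dfd) != 0)
        rc = -1;
    return rc;
}
```

A caller that has just created or renamed a file would call this on the containing directory after fsync'ing the file itself.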
> assuming that it were portable, which I doubt. What we need to worry
> about is whether fsync'ing a bunch of our own data files is a practical
> substitute for a global sync() call.
I wish that it were. There are situations (several GB buffer caches,
for example) where I do not trust the current use of sync() to have
completed all writes before the sleep() returns. My concern is
theoretical at the moment -- I never get to play with machines that
large!
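As a rough illustration of the alternative being discussed, the sketch below calls fsync() on each file in a list instead of relying on a global sync() followed by a sleep(). The function name and the idea of passing the data files as an array of paths are assumptions made for the example; this is not how the backend does it.

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: fsync() every file in `paths`.
 * Returns the number of files that could not be opened or synced. */
size_t fsync_files(const char *const *paths, size_t npaths)
{
    size_t failures = 0;

    for (size_t i = 0; i < npaths; i++) {
        int fd = open(paths[i], O_RDWR);
        if (fd < 0) {
            failures++;
            continue;
        }
        int ok = (fsync(fd) == 0);
        if (close(fd) != 0)
            ok = 0;
        if (!ok)
            failures++;
    }
    return failures;
}
```

Unlike sync(), each fsync() call returns only after the kernel reports completion for that file, so no sleep() is needed; the cost is one open/fsync/close per file, and the sketch relies on fsync() covering writes made through other descriptors.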
Regards,
Giles