From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: fsync reliability |
Date: | 2011-04-21 15:55:55 |
Message-ID: | 24001.1303401355@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> Daniel Farina points out to me that the Linux man page for fsync() says
> "Calling fsync() does not necessarily ensure that the entry in the directory
> containing the file has also reached disk. For that an explicit fsync() on a
> file descriptor for the directory is also needed."
> http://www.kernel.org/doc/man-pages/online/pages/man2/fsync.2.html
> This point appears to have been discussed before
Yes ...
> Tom said
> "We don't try to "fsync the directory" after a normal table create for instance"
> which is fine because we don't need to. In the event of a crash a
> missing table would be recreated during crash recovery.
Nonsense. Once a checkpoint occurs after the WAL record that says to
create the table, we won't replay that action. Or are you proposing
to have checkpoints run around and fsync every directory in the data
tree?
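The replay argument above can be illustrated with a toy model. This is only a sketch, not PostgreSQL's recovery code: `WalRecord`, `records_to_replay`, and the integer LSNs are hypothetical names and values invented for illustration. The one fact it relies on is that crash recovery starts from the redo point of the last completed checkpoint, so earlier records are never replayed.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class WalRecord:
    """Hypothetical, simplified WAL record: a position and a description."""
    lsn: int
    action: str


def records_to_replay(wal: List[WalRecord], redo_lsn: int) -> List[WalRecord]:
    """Return the records this toy recovery would replay.

    Only records at or after the last checkpoint's redo point are replayed;
    everything before it is assumed to be safely on disk already.
    """
    return [rec for rec in wal if rec.lsn >= redo_lsn]


def is_replayed(wal: List[WalRecord], redo_lsn: int, action: str) -> bool:
    """True if some record with the given action would be replayed."""
    return any(rec.action == action for rec in records_to_replay(wal, redo_lsn))
```

If the "create table file" record sits at LSN 100 and a checkpoint later sets the redo point to 250, `is_replayed(wal, 250, "create table file")` is false in this model. A directory entry that never reached disk would therefore not be recreated by recovery.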
The traditional standard is that the filesystem is supposed to take
care of its own metadata, and even Linux filesystems have pretty much
figured that out. I don't really see a need for us to be nursemaiding
the filesystem. At most there's a documentation issue here, ie, we
ought to be more explicit about which filesystems and which mount
options we recommend.
regards, tom lane
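For reference, the "fsync the directory" step that the quoted man page describes can be sketched as follows. This is a minimal illustration for a POSIX system such as Linux, not what PostgreSQL does; `durable_create` is a hypothetical helper name, and the file mode is an arbitrary choice.

```python
import os


def durable_create(path: str, data: bytes) -> None:
    """Create a new file, then fsync the file and its parent directory.

    Hypothetical helper for illustration only. Assumes a POSIX system
    where a directory can be opened read-only and passed to fsync().
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested; loop until done.
            written = os.write(fd, view)
            view = view[written:]
        # Flush the file's contents and inode to stable storage.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Flush the directory entry that names the new file.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Whether the second fsync() is actually required depends on the filesystem and mount options, which is the documentation question raised in the message.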