From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Peter Eisentraut <peter_e(at)gmx(dot)net>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
Cc: | PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: silent data loss with ext4 / all current versions |
Date: | 2015-12-01 22:00:10 |
Message-ID: | 565E186A.1070608@2ndquadrant.com |
Lists: | pgsql-hackers |
On 12/01/2015 10:44 PM, Peter Eisentraut wrote:
> On 11/27/15 8:18 AM, Michael Paquier wrote:
>> On Fri, Nov 27, 2015 at 8:17 PM, Tomas Vondra
>> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>>>> So, what's going on? The problem is that while the rename() is atomic, it's
>>>> not guaranteed to be durable without an explicit fsync on the parent
>>>> directory. And by default we only do fdatasync on the recycled segments,
>>>> which may not force fsync on the directory (and ext4 does not do that,
>>>> apparently).
>> Yeah, that seems to be the way the POSIX spec clears things.
>> "If _POSIX_SYNCHRONIZED_IO is defined, the fsync() function shall
>> force all currently queued I/O operations associated with the file
>> indicated by file descriptor fildes to the synchronized I/O completion
>> state. All I/O operations shall be completed as defined for
>> synchronized I/O file integrity completion."
>> http://pubs.opengroup.org/onlinepubs/009695399/functions/fsync.html
>> If I understand that right, it is guaranteed that the rename() will be
>> atomic, meaning that there will be only one file even if there is a
>> crash, but that we need to fsync() the parent directory as mentioned.
>
> I don't see anywhere in the spec that a rename needs an fsync of the
> directory to be durable. I can see why that would be needed in
> practice, though. File system developers would probably be able to
> give a more definite answer.
Yeah, POSIX is the lowest common denominator. In this case the spec
does not seem to require this durability guarantee (rename without fsync
on the directory), which allows a POSIX-compliant filesystem to lose the
rename after a crash.
At least that's my conclusion from reading https://lwn.net/Articles/322823/
However, as I explained in the original post, it's more complicated, as
this only seems to be a problem with fdatasync. I've been unable to
reproduce the issue with wal_sync_method=fsync.
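To illustrate the pattern under discussion (rename, then fsync the parent directory), here is a minimal sketch in Python rather than PostgreSQL's C. The helper names fsync_path and durable_rename are made up for this example, and this is not the code PostgreSQL uses; it also assumes a POSIX system where a directory can be opened read-only and passed to fsync().

```python
import os


def fsync_path(path: str) -> None:
    """Open a file or directory read-only and fsync it (POSIX only)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def durable_rename(src: str, dst: str) -> None:
    """Rename src to dst and try to make the rename itself durable.

    Hypothetical helper for illustration: rename() is atomic, but the
    new directory entry may only reach disk once the parent directory
    has been fsynced.
    """
    # Flush the file contents and metadata before renaming.
    fsync_path(src)

    os.rename(src, dst)

    # Flush the directory entries. If src and dst live in different
    # directories, both parents are synced.
    src_dir = os.path.dirname(os.path.abspath(src))
    dst_dir = os.path.dirname(os.path.abspath(dst))
    fsync_path(dst_dir)
    if src_dir != dst_dir:
        fsync_path(src_dir)
```

Whether the directory fsync is strictly required depends on the filesystem and mount options; the sketch simply does it unconditionally instead of relying on behavior the spec does not promise.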
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services