From: | Teodor Sigaev <teodor(at)sigaev(dot)ru> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: silent data loss with ext4 / all current versions |
Date: | 2015-11-27 12:35:06 |
Message-ID: | 56584DFA.6090101@sigaev.ru |
Lists: | pgsql-hackers |
> What happens is that when we recycle WAL segments, we rename them and then sync
> them using fdatasync (which is the default on Linux). However fdatasync does not
> force fsync on the parent directory, so in case of power failure the rename may
> get lost. The recovery won't realize those segments actually contain changes
Agreed. I ran into this some time ago, although it wasn't in Postgres.
> So, what's going on? The problem is that while the rename() is atomic, it's not
> guaranteed to be durable without an explicit fsync on the parent directory. And
> by default we only do fdatasync on the recycled segments, which may not force
> fsync on the directory (and ext4 does not do that, apparently).
>
> This impacts all current kernels (tested on 2.6.32.68, 4.0.5 and 4.4-rc1), and
> also all supported PostgreSQL versions (tested on 9.1.19, but I believe all
> versions since spread checkpoints were introduced are vulnerable).
>
> FWIW this has nothing to do with storage reliability - you may have good drives,
> RAID controller with BBU, reliable SSDs or whatever, and you're still not safe.
> This issue is at the filesystem level, not storage.
Agreed again.
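
For readers who want to see the sequence described in the quoted text, here is a minimal sketch of a rename followed by an explicit fsync on the parent directory. It is written in Python for brevity and assumes a POSIX system where a directory can be opened read-only and fsync'ed. The helper name durable_rename is hypothetical and this is not the PostgreSQL code; skipping EINVAL/EBADF on the directory fsync is an assumption made for filesystems that do not support syncing directories.

```python
import errno
import os


def durable_rename(old_path, new_path):
    """Rename old_path to new_path, then fsync the affected directories.

    Hypothetical helper for illustration only; not PostgreSQL code.
    """
    # Flush the file itself first, so the new name never refers to
    # contents that exist only in the page cache.
    fd = os.open(old_path, os.O_RDWR)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)

    # Atomic with respect to other processes, but not yet durable.
    os.rename(old_path, new_path)

    # fsync every directory whose entries changed: the destination
    # directory, and the source directory if it is a different one.
    parents = {os.path.dirname(os.path.abspath(p)) for p in (old_path, new_path)}
    for parent in sorted(parents):
        dir_fd = os.open(parent, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        except OSError as exc:
            # Assumption: tolerate filesystems that cannot fsync a directory.
            if exc.errno not in (errno.EINVAL, errno.EBADF):
                raise
        finally:
            os.close(dir_fd)
```

Calling fsync or fdatasync only on the renamed file, as in the scenario quoted above, would leave out the last loop, and that loop is the step that asks the filesystem to persist the directory entry.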
> I plan to do more power failure testing soon, with more complex test scenarios.
> I suspect there might be other similar issues (e.g. when we rename a file before
> a checkpoint and don't fsync the directory - then the rename won't be replayed
> and will be lost).
That would be very useful, but I hope you won't find any new bugs :)
--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/