
Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

From:	Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To:	Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc:	Michael Paquier <michael(at)paquier(dot)xyz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Craig Ringer <craig(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Date:	2018-03-29 05:06:22
Message-ID:	CAEepm=1YNv1hic3MVRJiB817eofmL-wfiD=zhJnt0RjaHnfwig@mail.gmail.com
Lists:	pgsql-hackers

On Thu, Mar 29, 2018 at 6:00 PM, Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> The retries are the source of the problem; the first fsync() can return EIO,
> and also *clears the error* causing a 2nd fsync (of the same data) to return
> success.

What I'm failing to grok here is how that error flag even matters,
whether it's a single bit or a counter as described in that patch. If
writeback failed, *the page is still dirty*. So all future calls to
fsync() need to try to flush it again, and (presumably) fail again
(unless writeback happens to succeed this time around).
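
To make the two mental models in this exchange concrete, here is a toy
sketch in Python. It is purely illustrative: ToyFile, toy_writeback,
toy_fsync and the keep_dirty_on_failure switch are made-up names, and the
model makes no claim about what any particular kernel or filesystem
actually does. It only shows why the answer to "is the page still dirty
after a failed writeback?" decides whether a retried fsync() can report
success.

```python
from dataclasses import dataclass


@dataclass
class ToyFile:
    """Hypothetical stand-in for one file's cached state (not a kernel structure)."""
    dirty: bool = True             # one page is waiting to be written back
    error_pending: bool = False    # a writeback failure has not been reported yet
    keep_dirty_on_failure: bool = True  # assumption under test


def toy_writeback(f: ToyFile, device_ok: bool) -> None:
    """Simulate one attempt to write the dirty page to storage."""
    if not f.dirty:
        return  # nothing to flush, so nothing can fail
    if device_ok:
        f.dirty = False
    else:
        f.error_pending = True
        # Model A keeps the page dirty; model B drops the dirty state.
        f.dirty = f.keep_dirty_on_failure


def toy_fsync(f: ToyFile, device_ok: bool) -> str:
    """Return "EIO" or "ok"; reporting an error clears the pending flag."""
    toy_writeback(f, device_ok)
    if f.error_pending:
        f.error_pending = False
        return "EIO"
    return "ok"


def retry_twice(keep_dirty_on_failure: bool) -> list:
    """Call toy_fsync twice against a device that always fails."""
    f = ToyFile(keep_dirty_on_failure=keep_dirty_on_failure)
    return [toy_fsync(f, device_ok=False), toy_fsync(f, device_ok=False)]
```

Under model A (the page stays dirty), retry_twice(True) yields
["EIO", "EIO"]: clearing the flag is harmless because the next fsync()
has to attempt the flush again and fails again. Under model B (the page
is no longer dirty after the failure), retry_twice(False) yields
["EIO", "ok"], which is the retry hazard described in the quoted text.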

--
Thomas Munro
http://www.enterprisedb.com

In response to

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS at 2018-03-29 05:00:31 from Justin Pryzby

Responses

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS at 2018-03-29 05:25:51 from Craig Ringer
