From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Mark Dilger <hornschnorter(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
Subject: | Re: fsync failure in durable_unlink ignored in xlog.c? |
Date: | 2019-05-23 18:18:20 |
Message-ID: | 20190523181820.t4x22nqjonttwmwp@alap3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 2019-05-23 14:06:57 -0400, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On 2019-05-23 10:46:02 -0700, Mark Dilger wrote:
> >> Is this code safe against fsync failures? If so, can I get an explanation
> >> that I might put into a code comment patch?
>
> > What's the danger you're thinking of here? The issue with ignoring fsync
> > failures is that it could be the one signal about data corruption we get
> > for a write()/fsync() that failed - i.e. that durability cannot be
> > guaranteed. But we don't care about the file contents of those files.
>
> Hmm ... if we don't care, why are we issuing an fsync at all?
Fair point. I think we do care in most of those cases, but we don't need
to trigger a PANIC. We'd be in trouble if e.g. an older tablespace map
file were to "revive" later. Looking at the cases:
- durable_unlink(TABLESPACE_MAP, DEBUG1) - we definitely care about a
failure to unlink/remove, but *not* about ENOENT, because that's expected.
- /* Force installation: get rid of any pre-existing segment file */
durable_unlink(path, DEBUG1);
same.
- RemoveXlogFile():
rc = durable_unlink(path, LOG);
It's probably *tolerable* to fail here. Not sure why this is a
durable_unlink(LOG) - doesn't make a ton of sense to me.
- durable_unlink(BACKUP_LABEL_FILE, ERROR);
This is a "whaa, bad shit is happening" kind of situation. But
crashing probably would make it even worse, because we'd restart
assuming we're restoring from a backup.
ISTM that durable_unlink() for the first two cases really needs a
separate 'missing_ok' type argument. And that the reason for using
DEBUG1 here is solely an outcome of that not existing.
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Mark Dilger | 2019-05-23 18:29:15 | ClosePipeStream failure ignored in pg_import_system_collations |
Previous Message | Mark Dilger | 2019-05-23 18:14:13 | Re: fsync failure in durable_unlink ignored in xlog.c? |