From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Magnus Hagander <magnus(at)hagander(dot)net>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(dot)ringer(at)2ndquadrant(dot)com> |
Subject: | Re: Some thoughts on NFS |
Date: | 2019-02-19 18:45:28 |
Message-ID: | CA+Tgmoa4V=nwXo4C8Pkni-PE0DMmkPJavWBu8mLLdGzkjdUkyg@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Feb 19, 2019 at 1:29 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > Is that a new thing? I ran across PostgreSQL-over-iSCSI a number of
> > years ago and the evidence strongly suggested that it did not reliably
> > report disk errors back to PostgreSQL, leading to corruption.
>
> How many years ago are we talking? I think it's been mostly robust in
> the last 6-10 years, maybe?
I think it was ~9 years ago.
> But note that the postgres + linux fsync
> issues would have plagued that use case just as well as it did local
> storage, at a likely higher incidence of failures (i.e. us forgetting to
> retry fsyncs in checkpoints, and linux throwing away dirty data after
> fsync failure would both have caused problems that aren't dependent on
> iSCSI).
IIRC, and obviously that's difficult to do after so long, there were
clearly disk errors in the kernel logs, but no hint of a problem in
the PostgreSQL logs. So it wasn't just a case of us failing to respond
to errors with sufficient vigor -- either they weren't being reported at
all, or only to system calls we weren't checking, e.g. close or
something.
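To make the "system calls we weren't checking" point concrete, here is a
minimal sketch in C of a write path that checks every call that can report
an I/O error, close included. This is not PostgreSQL code; the helper name
durable_write_checked is hypothetical and exists only for illustration. It
also cannot help if the kernel never reports the error to any call.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only, not PostgreSQL code).
 * Writes len bytes to path and returns 0 on success, or the errno value
 * of the first failing call among open, write, fsync and close.
 */
static int durable_write_checked(const char *path, const void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return errno;

    const char *p = buf;
    size_t remaining = len;
    while (remaining > 0) {
        ssize_t n = write(fd, p, remaining);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted before writing; retry */
            int saved = errno;
            close(fd);              /* already failing; keep the first error */
            return saved;
        }
        if (n == 0) {               /* no progress; avoid spinning forever */
            close(fd);
            return EIO;
        }
        p += n;
        remaining -= (size_t) n;
    }

    /* A deferred write error may surface only here. */
    if (fsync(fd) != 0) {
        int saved = errno;
        close(fd);
        return saved;
    }

    /* On some filesystems (NFS in particular) an error can surface at
     * close. Do not retry close on failure: on Linux the descriptor is
     * released even when close returns an error. */
    if (close(fd) != 0)
        return errno;

    return 0;
}
```

A caller would treat any nonzero return as "the data may not be on disk"
rather than retrying the fsync, since a later fsync succeeding does not
prove that the earlier dirty pages were written.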
> And I think it's not that likely that we'd not screw up a
> number of times implementing iSCSI ourselves - not to speak of the fact
> that that seems like an odd place to focus development on, given that
> it'd basically require all the infrastructure also needed for local DIO,
> which'd likely gain us much more.
I don't really disagree with you here, but I also think it's important
to be honest about what size hammer is likely to be sufficient to fix
the problem. Project policy for many years has been essentially
"let's assume the kernel guys know what they are doing," but, I don't
know, color me a little skeptical at this point. We've certainly made
lots of mistakes all of our own, and it's certainly true that
reimplementing large parts of what the kernel does in user space is
not very appealing ... but on the other hand it looks like filesystem
error reporting isn't even really reliable for local operation (unless
we do an incredibly complicated fd-passing thing that has deadlock
problems we don't know how to solve and likely performance problems
too, or convert the whole backend to use threads) or for NFS operation
(though maybe your suggestion will fix that) so the idea that iSCSI is
just going to be all right seems a bit questionable to me. Go ahead,
call me a pessimist...
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company