Re: On Linux Filesystems

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: pgman(at)candle(dot)pha(dot)pa(dot)us
Cc: Christopher Browne <cbbrowne(at)acm(dot)org>, pgsql-performance(at)postgresql(dot)org
Subject: Re: On Linux Filesystems
Date: 2003-08-12 04:16:41
Message-ID: 200308120416.h7C4Gfn11426@candle.pha.pa.us
Lists: pgsql-performance


Here is a post from 2002 talking about ext2 corruption after a power failure:

http://groups.google.com/groups?q=ext2+corrupt+%22power+failure%22&hl=en&lr=&ie=UTF-8&selm=alvrj5%249in%241%40usc.edu&rnum=9

---------------------------------------------------------------------------

pgman wrote:
>
> As I remember, there were clear cases that ext2 would fail to recover,
> and it was known to be a limitation of the file system implementation.
> Some of the ext2 developers were in the room at Red Hat when I said
> that, so if it was incorrect, they would hopefully have spoken up. I
> addressed the comments directly to them.
>
> To be recoverable, you have to be careful how you sync metadata to
> disk. All the journalling file systems, and the BSD UFS do that. I am
> told ext2 does not. I don't know much more than that.
>
> As I remember years ago, ext2 was faster than UFS, but that was
> because ext2 didn't guarantee failure recovery. Now, with UFS soft
> updates, they have similar performance characteristics, but UFS is still
> crash-safe.
>
> However, I just tried google and couldn't find any documented evidence
> that ext2 isn't crash-safe, so maybe I am wrong.
>
> ---------------------------------------------------------------------------
>
> Christopher Browne wrote:
> > Bruce Momjian commented:
> >
> > "Uh, the ext2 developers say it isn't 100% reliable" ... "I mentioned
> > it while I was visiting Red Hat, and they didn't refute it."
> >
> > 1. Nobody has gone through any formal proofs, and there are few
> > systems _anywhere_ that are 100% reliable. NASA has occasionally lost
> > spacecraft to software bugs, so nobody will be making such rash claims
> > about ext2.
> >
> > 2. Several projects have taken on the task of introducing journalled
> > filesystems, most notably ext3 (sponsored by RHAT via Stephen Tweedie)
> > and ReiserFS (oft sponsored by SuSE). (I leave off JFS/XFS since they
> > existed long before they had any relationship with Linux.)
> >
> > Participants in such projects certainly have interest in presenting
> > the notion that they provide improved reliability over ext2.
> >
> > 3. There is no "apologist" for ext2 that will (stupidly and
> > futilely) claim it to be flawless. Nor is there substantial interest
> > in improving it; the sort of people that would be interested in that sort
> > of thing are working on the other FSes.
> >
> > This also means that there's no one interested in going into the
> > guaranteed-to-be-unsung effort involved in trying to prove ext2 to be
> > "formally reliable."
> >
> > 4. It would be silly to minimize the impact of commercial interest.
> > RHAT has been paying for the development of a would-be ext2 successor.
> > For them to refute your comments wouldn't be in their interests.
> >
> > Note that these are "warm and fuzzy" comments, the whole lot. The
> > 80-some thousand lines of code involved in ext2, ext3, reiserfs, and
> > jfs are no more amenable to absolute mathematical proof of reliability
> > than the corresponding BSD FFS code.
> >
> > 6. Such efforts would be futile, anyways. Disks are mechanical
> > devices, and, as such, suffer from substantial reliability issues
> > irrespective of the reliability of the software. I have lost sleep on
> > too many occasions due to failures of:
> > a) Disk drives,
> > b) Disk controllers [the worst Oracle failure I encountered resulted
> > from this], and
> > c) OS memory management.
> >
> > I used ReiserFS back in its "bleeding edge" days, and find myself a
> > lot more worried about losing data to flakey disk controllers.
> >
> > It frankly seems insulting to focus on ext2 in this way when:
> >
> > a) There aren't _hard_ conclusions to point to, just soft ones;
> >
> > b) The reasons for you hearing vaguely negative things about ext2
> > are much more likely political than they are technical.
> >
> > I wish there were more "hard and fast" conclusions to draw, to be able
> > to conclusively say that one or another Linux filesystem was
> > unambiguously preferable for use with PostgreSQL. There are no
> > conclusive metrics, either in terms of speed or of some notion of
> > "reliability." I'd expect ReiserFS to be the poorest choice, and for
> > XFS to be the best, but I only have fuzzy reasons, as opposed to
> > metrics.
> >
> > The absence of measurable metrics of the sort is _NOT_ a proof that
> > (say) FreeBSD is conclusively preferable, whatever your own
> > preferences (I'll try to avoid characterizing it as "prejudices," as
> > that would be unkind) may be. That would represent a quite separate
> > debate, and one that doesn't belong here, certainly not on a thread
> > where the underlying question was "Which Linux FS is preferred?"
> >
> > If the OSDB TPC-like benchmarks can get "packaged" up well enough to
> > easily run and rerun them, there's hope of getting better answers,
> > perhaps even including performance metrics for *BSD. That, not
> > Linux-baiting, is the answer...
> > --
> > select 'cbbrowne' || '@' || 'acm.org';
> > http://www.ntlug.org/~cbbrowne/sap.html
> > (eq? 'truth 'beauty) ; to avoid unassigned-var error, since compiled code
> > ; will pick up previous value to var set!-ed,
> > ; the unassigned object.
> > -- from BBN-CL's cl-parser.scm
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 4: Don't 'kill -9' the postmaster
> >
>
> --
> Bruce Momjian | http://candle.pha.pa.us
> pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
> + If your life is a hard drive, | 13 Roberts Road
> + Christ can be your backup. | Newtown Square, Pennsylvania 19073

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
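
The point in the quoted message about being careful how you sync metadata
to disk can be illustrated from the application side. The sketch below is
not how ext2, UFS, or any journalling filesystem orders its writes
internally; it only shows the common write-temp, fsync, rename, fsync-the-
directory pattern that a program can use so that a crash leaves either the
old file or the new one. The function name durable_write is hypothetical,
and the code assumes a POSIX system where a directory can be opened and
fsync'ed. Whether the data actually reaches the platters still depends on
the drive and controller honoring the flush.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Replace `path` with `data`, syncing both file contents and directory entry.

    Hypothetical helper for illustration; assumes a POSIX filesystem.
    """
    directory = os.path.dirname(os.path.abspath(path))

    # Create the temporary file in the same directory so the later rename
    # stays within a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Ask the kernel to push the file contents to stable storage
            # before the new name becomes visible.
            os.fsync(tmp.fileno())
        # Atomically swap the new file into place (rename semantics on POSIX).
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

    # Sync the directory itself so the updated name (metadata) is also flushed.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Skipping the final directory fsync is the usual way this pattern goes wrong:
the file contents may be safe while the directory entry that names them is
not, which is exactly the metadata-ordering concern raised in the thread.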
