Re: Weird XFS WAL problem

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Greg Smith <greg(at)2ndquadrant(dot)com>
Cc:	Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Craig James <craig_james(at)emolecules(dot)com>, Matthew Wakeling <matthew(at)flymine(dot)org>, pgsql-performance(at)postgresql(dot)org
Subject:	Re: Weird XFS WAL problem
Date:	2010-07-07 14:42:55
Message-ID:	201007071442.o67EgtU26973@momjian.us
Lists:	pgsql-performance

Greg Smith wrote:
> Kevin Grittner wrote:
> > I don't know at the protocol level; I just know that write barriers
> > do *something* which causes our controllers to wait for actual disk
> > platter persistence, while fsync does not
>
> It's in the docs now:
> http://www.postgresql.org/docs/9.0/static/wal-reliability.html
>
> FLUSH CACHE EXT is the ATAPI-6 call that filesystems use to enforce
> barriers on that type of drive. Here's what the relevant portion of the
> ATAPI spec says:
>
> "This command is used by the host to request the device to flush the
> write cache. If there is data in the write
> cache, that data shall be written to the media. The BSY bit shall remain
> set to one until all data has been
> successfully written or an error occurs."
>
> SAS systems have a similar call named SYNCHRONIZE CACHE.
>
> The improvement I actually expect to arrive here first is a reliable
> implementation of O_SYNC/O_DSYNC writes. Both SAS and SATA drives that are
> capable of doing Native Command Queueing support a write type called
> "Force Unit Access", which is essentially just like a direct write that
> cannot be cached. When we get more kernels with reliable sync writing
> that maps under the hood to FUA, and can change wal_sync_method to use
> them, the need to constantly call fsync for every write to the WAL will
> go away. Then the "blow out the RAID cache when barriers are on"
> behavior will only show up during checkpoint fsyncs, which will make
> things a lot better (albeit still not ideal).
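
For readers following along, here is a rough sketch of the two write patterns
Greg contrasts above: a plain write followed by an explicit fsync for every
record, versus a file opened with O_DSYNC so that each write is itself
synchronous. The function names and the record format are made up for
illustration, and this is not how PostgreSQL implements wal_sync_method.
Whether an O_DSYNC write actually reaches the drive as a Force Unit Access
write depends on the kernel, filesystem, and hardware; the code only shows
the system-call pattern.

```python
import os
import tempfile


def _write_all(fd, data):
    """Write every byte of data, retrying if os.write() is partial."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def write_with_fsync(path, records):
    """Hypothetical helper: buffered write, then fsync() after each record."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for rec in records:
            _write_all(fd, rec)
            os.fsync(fd)  # separate call asking the OS to flush this file
    finally:
        os.close(fd)


def write_with_dsync(path, records):
    """Hypothetical helper: open with O_DSYNC so each write is synchronous."""
    # Assumption: fall back to O_SYNC on platforms that lack O_DSYNC.
    sync_flag = getattr(os, "O_DSYNC", os.O_SYNC)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | sync_flag, 0o600)
    try:
        for rec in records:
            _write_all(fd, rec)  # no explicit fsync() call is issued
    finally:
        os.close(fd)


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    records = [b"record-1\n", b"record-2\n", b"record-3\n"]
    write_with_fsync(os.path.join(workdir, "fsync.log"), records)
    write_with_dsync(os.path.join(workdir, "dsync.log"), records)
```

Both helpers leave the same bytes in the file; the difference is only in how
durability is requested from the kernel, which is the distinction the
wal_sync_method setting exposes.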

Great information! I have added the attached documentation patch to
explain the write-barrier/BBU interaction. This will appear in the 9.0
documentation.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ None of us is going to be here forever. +

Attachment	Content-Type	Size
/rtmp/diff	text/x-diff	4.4 KB

In response to

Re: Weird XFS WAL problem at 2010-06-05 22:50:27 from Greg Smith
