From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Greg Smith <greg(at)2ndquadrant(dot)com> |
Cc: | Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Craig James <craig_james(at)emolecules(dot)com>, Matthew Wakeling <matthew(at)flymine(dot)org>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Weird XFS WAL problem |
Date: | 2010-07-07 14:42:55 |
Message-ID: | 201007071442.o67EgtU26973@momjian.us |
Lists: | pgsql-performance |
Greg Smith wrote:
> Kevin Grittner wrote:
> > I don't know at the protocol level; I just know that write barriers
> > do *something* which causes our controllers to wait for actual disk
> > platter persistence, while fsync does not.
>
> It's in the docs now:
> http://www.postgresql.org/docs/9.0/static/wal-reliability.html
>
> FLUSH CACHE EXT is the ATAPI-6 call that filesystems use to enforce
> barriers on that type of drive. Here's what the relevant portion of the
> ATAPI spec says:
>
> "This command is used by the host to request the device to flush the
> write cache. If there is data in the write cache, that data shall be
> written to the media. The BSY bit shall remain set to one until all
> data has been successfully written or an error occurs."
>
> SAS systems have a similar call named SYNCHRONIZE CACHE.
>
> The improvement I actually expect to arrive here first is a reliable
> implementation of O_SYNC/O_DSYNC writes. Both SAS and SATA drives that
> are capable of doing Native Command Queueing support a write type called
> "Force Unit Access", which is essentially just like a direct write that
> cannot be cached. When we get more kernels with reliable sync writing
> that maps under the hood to FUA, and can change wal_sync_method to use
> them, the need to constantly call fsync for every write to the WAL will
> go away. Then the "blow out the RAID cache when barriers are on"
> behavior will only show up during checkpoint fsyncs, which will make
> things a lot better (albeit still not ideal).
Great information! I have added the attached documentation patch to
explain the write-barrier/BBU interaction. This will appear in the 9.0
documentation.
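
For readers following along, here is a rough sketch of the two WAL sync
styles Greg contrasts above: a plain write followed by an explicit fsync,
versus a file opened with O_DSYNC so that each write is itself synchronous.
This is only an illustration, not PostgreSQL's code. The function names and
record contents are made up, and whether an O_DSYNC write actually reaches
the drive as a FUA write depends on the kernel, filesystem, and controller.

```python
import os
import tempfile


def _write_all(fd, data):
    """Write every byte of data, looping in case of short writes."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def write_then_fsync(path, records):
    """Write each record, then force it out with a separate fsync call.

    Roughly the pattern behind wal_sync_method = fsync: the flush is a
    second system call that covers everything dirty in the file.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for rec in records:
            _write_all(fd, rec)
            os.fsync(fd)
    finally:
        os.close(fd)


def write_dsync(path, records):
    """Write each record through a descriptor opened for synchronous I/O.

    Roughly the pattern behind wal_sync_method = open_datasync: the write
    call itself does not return until the data is reported durable.
    Assumption: O_DSYNC (or O_SYNC) exists on this platform; if neither
    does, fall back to an explicit fsync so the sketch still runs.
    """
    sync_flag = getattr(os, "O_DSYNC", 0) or getattr(os, "O_SYNC", 0)
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | sync_flag
    fd = os.open(path, flags, 0o600)
    try:
        for rec in records:
            _write_all(fd, rec)
            if not sync_flag:
                os.fsync(fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Hypothetical stand-ins for WAL records.
    records = [b"record-%d\n" % i for i in range(3)]
    with tempfile.TemporaryDirectory() as tmp:
        a = os.path.join(tmp, "wal_fsync")
        b = os.path.join(tmp, "wal_dsync")
        write_then_fsync(a, records)
        write_dsync(b, records)
        print(os.path.getsize(a), os.path.getsize(b))
```

Both functions leave identical bytes on disk; the difference is only in how
durability is requested, which is what determines whether the controller
sees a full cache flush or a single uncached write.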
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ None of us is going to be here forever. +
Attachment | Content-Type | Size |
---|---|---|
/rtmp/diff | text/x-diff | 4.4 KB |