From: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: WALWriteLock contention |
Date: | 2015-05-17 21:57:04 |
Message-ID: | CAEepm=2+e+3-LtU_hSwiNLOWHoyU1xr68f3zh9KKSyZ-P-RL8A@mail.gmail.com |
Lists: | pgsql-hackers |
On Sun, May 17, 2015 at 2:15 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> http://oldblog.antirez.com/post/fsync-different-thread-useless.html
>
> It suggests that an fsync in progress blocks out not only other
> fsyncs, but other writes to the same file, which for our purposes is
> just awful. More Googling around reveals that this is apparently
> well-known to Linux kernel developers and that they don't seem excited
> about fixing it. :-(
He doesn't say, but I wonder whether that is really a Linux-wide
limitation, or whether it is specific to the ext2, ext3 and maybe ext4
filesystems. The blog post linked below talks about the per-inode mutex
that is held while writing with direct IO. Maybe fsyncing buffered IO
is similarly constrained in those filesystems.
https://www.facebook.com/notes/mysql-at-facebook/xfs-ext-and-per-inode-mutexes/10150210901610933
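A rough way to check this on a particular filesystem is to time write() calls on a file while another thread keeps calling fsync() on the same descriptor, and compare against the same writes with no fsync running. The sketch below does that. The function names, block size and write count are assumptions chosen for illustration, and the numbers it produces depend heavily on the filesystem, mount options and storage device.

```python
import os
import tempfile
import threading
import time


def measure_write_latency(path, n_writes=100, block=8192, with_fsync=True):
    """Return the worst-case write() latency in seconds on `path`.

    When with_fsync is True, a background thread calls fsync() on the
    same file descriptor in a loop while the main thread issues writes.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    stop = threading.Event()

    def fsync_loop():
        # Keep an fsync in flight for as long as the writer is running.
        while not stop.is_set():
            os.fsync(fd)

    syncer = threading.Thread(target=fsync_loop) if with_fsync else None
    if syncer is not None:
        syncer.start()

    worst = 0.0
    buf = b"x" * block
    try:
        for _ in range(n_writes):
            start = time.perf_counter()
            os.write(fd, buf)
            worst = max(worst, time.perf_counter() - start)
    finally:
        stop.set()
        if syncer is not None:
            syncer.join()
        os.close(fd)
    return worst


def compare(directory=None):
    """Run both variants in a temporary directory under `directory`."""
    with tempfile.TemporaryDirectory(dir=directory) as d:
        path = os.path.join(d, "probe.dat")
        return {
            "write_only": measure_write_latency(path, with_fsync=False),
            "write_with_fsync": measure_write_latency(path, with_fsync=True),
        }
```

Calling `compare("/some/mount/point")` on mounts backed by different filesystems would give a first hint as to whether the stall is filesystem-specific. A large gap between the two values suggests that writes are waiting behind the fsync.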
> <crazy-idea>I wonder if we could write WAL to two different files in
> alternation, so that we could be writing to one file while fsync-ing
> the other.</crazy-idea>
If that is an ext3-specific problem, using multiple files might not
help you anyway, because ext3 famously fsyncs *all* files when you
ask for one file to be fsynced, as discussed in Greg Smith's
PostgreSQL 9.0 High Performance in chapter 4 (page 79).
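
For concreteness, the two-file idea could be sketched roughly as follows. This is only an illustration of the alternation, not how PostgreSQL writes WAL. The class `AlternatingLog` and its method names are hypothetical, and the sketch ignores how recovery would have to interleave records from the two files. On a filesystem where one fsync flushes everything, the background fsync would still drag in the other file's dirty data, so the benefit would disappear.

```python
import os
import tempfile
import threading


class AlternatingLog:
    """Hypothetical two-file log: write to one file while the other is fsynced."""

    def __init__(self, directory):
        self.paths = [os.path.join(directory, f"log.{i}") for i in (0, 1)]
        self.fds = [
            os.open(p, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
            for p in self.paths
        ]
        self.active = 0      # index of the file currently receiving writes
        self.pending = None  # thread fsyncing the inactive file, if any

    def append(self, record: bytes):
        os.write(self.fds[self.active], record)

    def flush(self):
        """Start fsyncing the active file and send new writes to the other."""
        self.wait()  # the other file must be durable before it is reused
        fd = self.fds[self.active]
        self.pending = threading.Thread(target=os.fsync, args=(fd,))
        self.pending.start()
        self.active ^= 1

    def wait(self):
        """Block until the in-flight fsync, if any, has completed."""
        if self.pending is not None:
            self.pending.join()
            self.pending = None

    def close(self):
        self.wait()
        for fd in self.fds:
            os.fsync(fd)
            os.close(fd)
```

A caller would `append()` records, call `flush()` at commit points, and keep appending to the other file while the fsync runs in the background.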
--
Thomas Munro
http://www.enterprisedb.com