Re: Why does splitting $PGDATA and xlog yield a performance benefit?

From: Bill Moran <wmoran(at)potentialtech(dot)com>
To: David Kerr <dmk(at)mr-paradox(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Why does splitting $PGDATA and xlog yield a performance benefit?
Date: 2015-08-25 17:45:29
Message-ID: 20150825134529.9c4be3ce14260bd7da65ff8b@potentialtech.com
Lists: pgsql-general

On Tue, 25 Aug 2015 10:08:48 -0700
David Kerr <dmk(at)mr-paradox(dot)net> wrote:

> Howdy All,
>
> For a very long time I've held the belief that splitting PGDATA and xlog on Linux systems fairly universally gives a decent performance benefit for many common workloads.
> (I've seen up to 20% personally).
>
> I was under the impression that this had to do with regular fsync() calls from the WAL
> interfering with and over-reaching writing out the filesystem buffers.
>
> Basically, I think I was conflating fsync() with sync().
>
> So if it's not that, then that just leaves bandwidth (ignoring all of the other best-practice reasons for reliability, etc.). So, in theory, if you're not swamping your disk I/O then you won't really benefit from relocating your XLOGs.

Disk performance can be a bit more complicated than just "swamping." Even if
you're not maxing out the IO bandwidth, you could be getting enough that some
writes are waiting on other writes before they can be processed. Consider the
fact that old-style ethernet was only able to hit ~80% of its theoretical
capacity in the real world, because the chance of collisions increased with
the amount of data, and each collision slowed down the overall transfer speed.
Contrast that with modern switched ethernet, which doesn't have collisions:
you can get much closer to 100% of the rated bandwidth because the
communications are effectively partitioned from each other.

In the worst-case scenario, if two processes (due to horrible luck) _always_
try to write at the same time, the overall responsiveness will be lousy, even
if the bandwidth usage is only a small percentage of what's available. Of course,
that worst case doesn't happen in actual practice, but as the usage goes up,
the chance of hitting that interference increases, and the effective response
goes down, even when there's bandwidth still available.

Separate the competing processes, and the chance of conflict is 0. So your
responsiveness is pretty much at best-case all the time.
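
To make the contention argument concrete, here is a rough Python sketch, not
a measurement of any real disk. It assumes two write streams with random
(Poisson) arrival times and a fixed service time per write, and compares one
shared FIFO device against one device per stream. The function names, the
arrival model, and the numbers are all illustrative assumptions. Note that in
this toy model a stream can still queue behind its own earlier writes; only
the interference between the two streams goes away.

```python
import random


def poisson_arrivals(rate, n, rng):
    """Return n arrival times with exponentially distributed gaps."""
    t = 0.0
    times = []
    for _ in range(n):
        t += rng.expovariate(rate)
        times.append(t)
    return times


def mean_response(arrivals, service_time):
    """Mean time from arrival to completion on a single FIFO device."""
    free_at = 0.0  # time at which the device finishes its current write
    total = 0.0
    for t in arrivals:
        start = max(t, free_at)  # wait if the device is still busy
        free_at = start + service_time
        total += free_at - t
    return total / len(arrivals)


def compare(utilization, n=20000, service_time=1.0, seed=1):
    """Return (shared, separate) mean response times.

    `utilization` is the fraction of the shared device's capacity used by
    both streams together; each stream contributes half of it.
    """
    rng = random.Random(seed)
    rate = utilization / (2 * service_time)
    a = poisson_arrivals(rate, n, rng)
    b = poisson_arrivals(rate, n, rng)
    shared = mean_response(sorted(a + b), service_time)
    separate = (mean_response(a, service_time)
                + mean_response(b, service_time)) / 2
    return shared, separate


if __name__ == "__main__":
    for u in (0.2, 0.5, 0.8):
        shared, separate = compare(u)
        print(f"utilization {u:.0%}: shared {shared:.2f}, separate {separate:.2f}")
```

Under these assumptions the shared device's mean response time should climb
well before utilization reaches 100%, while the separated devices stay much
closer to the bare service time.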

> However, I know from experience that's not entirely true (although it's not always easy to measure all aspects of your I/O bandwidth).
>
> Am I missing something?

--
Bill Moran
