From: | David Kerr <dmk(at)mr-paradox(dot)net> |
---|---|
To: | Bill Moran <wmoran(at)potentialtech(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Why does splitting $PGDATA and xlog yield a performance benefit? |
Date: | 2015-08-25 18:14:40 |
Message-ID: | 81D83748-A8A7-4EC6-B12C-522F2FEF25D6@mr-paradox.net |
Lists: | pgsql-general |
> On Aug 25, 2015, at 10:45 AM, Bill Moran <wmoran(at)potentialtech(dot)com> wrote:
>
> On Tue, 25 Aug 2015 10:08:48 -0700
> David Kerr <dmk(at)mr-paradox(dot)net> wrote:
>
>> Howdy All,
>>
>> For a very long time I've held the belief that splitting PGDATA and xlog on Linux systems fairly universally gives a decent performance benefit for many common workloads
>> (I've seen up to 20% personally).
>>
>> I was under the impression that this had to do with regular fsync()'s from the WAL
>> interfering with and over-reaching writing out the filesystem buffers.
>>
>> Basically, I think I was conflating fsync() with sync().
>>
>> So if it's not that, then that just leaves bandwidth (ignoring all of the other best-practice reasons for reliability, etc.). So, in theory, if you're not swamping your disk I/O then you won't really benefit from relocating your XLOGs.
>
> Disk performance can be a bit more complicated than just "swamping."
Funny, on revision of my question, I left out basically that exact line for simplicity's sake. =)
> Even if you're not maxing out the IO bandwidth, you could be getting enough that some
> writes are waiting on other writes before they can be processed. Consider the
> fact that old-style ethernet was only able to hit ~80% of its theoretical
> capacity in the real world, because the chance of collisions increased with
> the amount of data, and each collision slowed down the overall transfer speed.
> Contrasted with modern ethernet that doesn't do collisions, you can get much
> closer to 100% of the rated bandwidth because the communications are effectively
> partitioned from each other.
>
> In the worst-case scenario, if two processes (due to horrible luck) _always_
> try to write at the same time, the overall responsiveness will be lousy, even
> if the bandwidth usage is only a small percent of the available. Of course,
> that worst case doesn't happen in actual practice, but as the usage goes up,
> the chance of hitting that interference increases, and the effective response
> goes down, even when there's bandwidth still available.
>
> Separate the competing processes, and the chance of conflict is 0. So your
> responsiveness is pretty much at best-case all the time.
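For anyone who wants to poke at the point above, here is a rough toy model rather than a benchmark. It treats a disk as a single FIFO device with a fixed service time and random (Poisson) arrivals, then compares two writers sharing one device against each writer having its own. The names `mean_wait`, `compare`, and `SERVICE_TIME` are made up for illustration, and nothing here models seeks, fsync ordering, or SAN caches.

```python
import random

SERVICE_TIME = 1.0  # hypothetical: every write occupies the device for 1 time unit


def mean_wait(arrival_rate, service_time, n_requests=200_000, seed=0):
    """Average time a request queues before a single FIFO device starts serving it.

    Assumes exponential gaps between arrivals and a constant service time.
    """
    rng = random.Random(seed)
    now = 0.0         # arrival time of the current request
    free_at = 0.0     # time at which the device has cleared its backlog
    total_wait = 0.0
    for _ in range(n_requests):
        now += rng.expovariate(arrival_rate)
        start = max(now, free_at)      # wait only if the device is still busy
        total_wait += start - now
        free_at = start + service_time
    return total_wait / n_requests


def compare(shared_utilization):
    """Return (shared, split) mean waits for two equal writers.

    shared: both writers hit one device at the given utilization.
    split:  each writer gets its own device, so each sees half the load.
    """
    total_rate = shared_utilization / SERVICE_TIME
    shared = mean_wait(total_rate, SERVICE_TIME, seed=0)
    split = mean_wait(total_rate / 2, SERVICE_TIME, seed=1)
    return shared, split


if __name__ == "__main__":
    for utilization in (0.2, 0.5, 0.8):
        shared, split = compare(utilization)
        print(f"utilization {utilization:.0%}: "
              f"shared wait {shared:.2f}, split wait {split:.2f}")
```

Under these assumptions the queueing delay on the shared device grows well before utilization gets anywhere near 100%, and splitting the load reduces it. In this model each writer can still queue behind its own earlier writes, so the split case is lower but not literally zero.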
Understood. Now, in my previous delve into this issue, I showed minimal/no disk queuing, and the SAN showed nothing on its queues and no retries (of course, #NeverTrustTheSANGuy), but I still got a 20% performance increase by splitting the WAL and $PGDATA.
But that's beside the point, and my data on that environment is long gone.
I'm content to leave this at "I/O is complicated." I just wanted to make sure that I wasn't correct but for a slightly wrong reason.
Thanks!