From: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)2ndquadrant(dot)com> |
Cc: | Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Jim Nasby <jim(at)nasby(dot)net>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: fallocate / posix_fallocate for new WAL file creation (etc...) |
Date: | 2013-05-17 20:48:38 |
Message-ID: | CAHyXU0w9s-ibu2TKDRwPzStDVi6ZMk+xveHKjoFay=gRDQ3SxA@mail.gmail.com |
Lists: | pgsql-hackers |
On Fri, May 17, 2013 at 8:29 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Fri, May 17, 2013 at 4:47 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> On 2013-05-15 16:46:33 -0500, Jon Nelson wrote:
>>> > * Is wal file creation performance actually relevant? Is the performance
>>> > of a system running on fallocate()d wal files any different?
>>>
>>> In my limited testing, I noticed a drop of approx. 100ms per WAL file.
>>> I do not have a good idea for how to really stress the WAL-file
>>> creation area without calling pg_start_backup and pg_stop_backup over
>>> and over (with archiving enabled).
>>
>> My point is that wal file creation usually isn't all that performance
>> sensitive. Once the cluster has enough WAL files it will usually recycle
>> them and thus never allocate new ones. So for this to be really
>> beneficial it would be interesting to show different performance during
>> normal running. You could also check how many extents a wal file
>> is made of with fallocate in comparison to the old style method
>> (filefrag will give you that for most filesystems).
>
> But why does it have to be *really* beneficial? We're already making
> optional posix_fxxx calls and fallocate seems to do exactly what we
> would want in this context. Even if the 100ms drop doesn't show up
> all that often, I'd still take it just for the defragmentation
> benefits and the patch is fairly tiny.
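To make the two approaches in the quoted exchange concrete, here is a minimal sketch contrasting old-style zero-filling with preallocation through posix_fallocate. It is not the patch under discussion: the function names, the 8 kB write block, and the fsync placement are assumptions for illustration, and the 16 MB size is simply the default WAL segment size. Whether preallocation actually yields fewer extents depends on the filesystem, and the C library may emulate the call by writing to the file where the filesystem lacks native support.

```python
import os

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # default WAL segment size (16 MB)


def create_zero_filled(path, size=WAL_SEGMENT_SIZE, block=8192):
    """Old-style creation: append zero blocks until the file reaches `size`.

    The 8 kB block size is an assumption for this sketch.
    """
    zeros = b"\0" * block
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        written = 0
        while written < size:
            # os.write may write fewer bytes than requested, so loop on the count.
            written += os.write(fd, zeros[: size - written])
        os.fsync(fd)
    finally:
        os.close(fd)


def create_preallocated(path, size=WAL_SEGMENT_SIZE):
    """Ask the filesystem to reserve `size` bytes in a single call.

    os.posix_fallocate is available on Unix only (Python 3.3+).
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        os.posix_fallocate(fd, 0, size)  # offset 0, length `size`
        os.fsync(fd)
    finally:
        os.close(fd)
```

Both functions leave a file of the requested length that reads back as zeros; the difference is that the second lets the filesystem choose the whole allocation at once instead of growing the file block by block. Running filefrag on the two results is one way to compare extent counts.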
Here is sample output of filefrag on a somewhat busy database from our
testing environment that exactly duplicates our production workloads.
It does a lot of batch processing at night and a mix of 80% OLTP and
20% OLAP during the day. This is on ext3. Interestingly, on ext4 servers
I never saw more than 2 extents per file (but those servers are mostly
not as busy).
[root@rpisatysw001 pg_xlog]# filefrag *
00000001000006D200000064: 490 extents found, perfection would be 1 extent
00000001000006D200000065: 33 extents found, perfection would be 1 extent
00000001000006D200000066: 43 extents found, perfection would be 1 extent
00000001000006D200000067: 71 extents found, perfection would be 1 extent
00000001000006D200000068: 43 extents found, perfection would be 1 extent
00000001000006D200000069: 156 extents found, perfection would be 1 extent
00000001000006D20000006A: 52 extents found, perfection would be 1 extent
00000001000006D20000006B: 108 extents found, perfection would be 1 extent
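For comparing runs (for example ext3 against ext4, or before and after a fallocate patch), the output above can be summarized with a small script. This is only a sketch: the function names are hypothetical, and the regular expression assumes the "NAME: N extents found" line format shown here, which may differ between e2fsprogs versions.

```python
import re

# Matches lines such as:
#   00000001000006D200000064: 490 extents found, perfection would be 1 extent
LINE_RE = re.compile(r"^(?P<name>\S+): (?P<extents>\d+) extents? found")


def parse_filefrag(output):
    """Return {file name: extent count} for each recognized filefrag line."""
    counts = {}
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            counts[match.group("name")] = int(match.group("extents"))
    return counts


def summarize(counts):
    """Return the number of files and the min, max and mean extent counts."""
    values = list(counts.values())
    if not values:
        raise ValueError("no filefrag lines were recognized")
    return {
        "files": len(values),
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
    }
```

Fed the eight lines above, summarize(parse_filefrag(text)) reports 8 files with a minimum of 33, a maximum of 490, and a mean of 124.5 extents per WAL segment.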
merlin