From: | Jon Nelson <jnelson+pgsql(at)jamponi(dot)net> |
---|---|
To: | Greg Smith <greg(at)2ndquadrant(dot)com> |
Cc: | Jeff Davis <pgsql(at)j-davis(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: fallocate / posix_fallocate for new WAL file creation (etc...) |
Date: | 2013-06-30 23:41:57 |
Message-ID: | CAKuK5J2QK0M1ON=DiB00Yp4_0B__Ks1J=axAYeJYBiA55FUqbg@mail.gmail.com |
Lists: | pgsql-hackers |
On Sun, Jun 30, 2013 at 5:55 PM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
>
>
> pwrite(4, "\0", 1, 16769023) = 1
> pwrite(4, "\0", 1, 16773119) = 1
> pwrite(4, "\0", 1, 16777215) = 1
>
> That's glibc helpfully converting your call to posix_fallocate into small
> writes, because the OS doesn't provide a better way in that kernel. It's
> not hard to imagine this being slower than what the WAL code is doing right
> now. I'm not worried about correctness issues anymore, but my gut paranoia
> about this not working as expected on older systems was justified. Everyone
> who thought I was just whining owes me a cookie.
I had noted in the very early part of the thread that glibc emulates
posix_fallocate when the (Linux-specific) 'fallocate' system call
fails. In this case, going by the trace above, it's writing a single
zero byte and then essentially seeking forward 4095 (4096-1) bytes to
the same position in the next block. This keeps the file free of
holes, because a hole has to be at least 4 KiB in size, if I recall
properly. It's *not* writing out all 16 MiB in tiny increments; it
touches each 4 KiB block exactly once.
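
The write pattern in the trace can be reproduced with a minimal sketch
like the one below. It is not glibc's code, only an illustration of the
technique: one zero byte written at the last offset of each block. The
4096-byte block size, the 16 MiB length, and the names
emulate_fallocate and demo are assumptions made for the example. The
sketch also assumes a new, empty file, so it makes no attempt to
preserve existing data.

```python
import os
import tempfile

BLOCK_SIZE = 4096                    # assumed filesystem block size
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # 16 MiB, matching the offsets in the trace


def emulate_fallocate(fd, length, block_size=BLOCK_SIZE):
    """Write one zero byte at the end of each block; return the write count.

    Hypothetical helper for illustration only. It assumes the file is
    new and empty, so overwriting bytes with zeros is harmless.
    """
    writes = 0
    offset = block_size - 1
    while offset < length:
        os.pwrite(fd, b"\0", offset)  # touches exactly one byte per block
        writes += 1
        offset += block_size
    # Cover a trailing partial block, if the length is not block-aligned.
    if length % block_size:
        os.pwrite(fd, b"\0", length - 1)
        writes += 1
    return writes


def demo():
    """Run the emulation on a temporary file and report (writes, size)."""
    with tempfile.TemporaryFile() as f:
        writes = emulate_fallocate(f.fileno(), WAL_SEGMENT_SIZE)
        size = os.fstat(f.fileno()).st_size
    return writes, size


if __name__ == "__main__":
    print(demo())
```

For a 16 MiB file this issues 4096 one-byte writes, the last of them at
offset 16777215, which is the final pwrite shown in the trace.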
--
Jon