From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Scott Marlowe <smarlowe(at)g2switchworks(dot)com> |
Cc: | Michael Stone <mstone+postgres(at)mathom(dot)us>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: WAL sync behaviour |
Date: | 2005-11-10 16:39:34 |
Message-ID: | 9791.1131640774@sss.pgh.pa.us |
Lists: | pgsql-performance |
Scott Marlowe <smarlowe(at)g2switchworks(dot)com> writes:
> On Thu, 2005-11-10 at 08:43, Michael Stone wrote:
>> There's no reason to use a journaled filesystem for the wal. Use ext2 in
>> preference to ext3.
> Not from what I understood. Ext2 can't guarantee that your data will
> even be there in any form after a crash. I believe only metadata
> journaling is needed though.
No, Mike is right: for WAL you shouldn't need any journaling. This is
because we zero out *and fsync* an entire WAL file before we ever
consider putting live WAL data in it. During live use of a WAL file,
its metadata is not changing. As long as the filesystem follows
the minimal rule of syncing metadata about a file when it fsyncs the
file, all the live WAL files should survive crashes OK.
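
As a rough illustration of the zero-then-fsync idea (a minimal sketch, not PostgreSQL's actual code; the function names, the chunk size, and the file layout are all assumptions made for the example), a segment can be fully allocated and synced up front, so that later writes only overwrite existing blocks and never change the file's length:

```python
import os


def preallocate_segment(path, size, chunk=8192):
    """Create `path`, fill it with `size` zero bytes, and fsync it.

    Hypothetical helper: after this returns, every block of the file
    has been allocated and the file's length has been flushed.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        zeros = bytes(chunk)
        remaining = size
        while remaining > 0:
            written = os.write(fd, zeros[:min(chunk, remaining)])
            remaining -= written
        os.fsync(fd)  # flush data and the file's own metadata
    finally:
        os.close(fd)


def write_record(path, offset, payload):
    """Overwrite bytes in place inside an already-allocated segment.

    Refuses to write past the end, so the file is never extended
    during live use.
    """
    if offset + len(payload) > os.path.getsize(path):
        raise ValueError("record would extend the segment")
    fd = os.open(path, os.O_WRONLY)
    try:
        os.pwrite(fd, payload, offset)
        os.fsync(fd)  # only data blocks changed; the length did not
    finally:
        os.close(fd)
```

Calling preallocate_segment() once and then write_record() any number of times leaves the file size exactly as it was after the initial fsync, which is the property the argument above relies on.
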
We can afford to do this mainly because WAL files can normally be
recycled instead of created afresh, so the zero-out overhead doesn't
get paid during normal operation.
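
Recycling can be sketched the same way (again only an illustration under assumptions: the function name and the paths are hypothetical, and the directory fsync is a common POSIX practice for making a rename durable rather than something stated in this message). An old, fully allocated segment is renamed to the name of a future segment, so no new blocks have to be zeroed:

```python
import os


def recycle_segment(old_path, new_path):
    """Reuse an already-allocated segment under a new name.

    The file keeps its length and its allocated blocks; only the
    directory entry changes.
    """
    os.rename(old_path, new_path)
    parent = os.path.dirname(os.path.abspath(new_path))
    dir_fd = os.open(parent, os.O_RDONLY)
    try:
        os.fsync(dir_fd)  # assumption: persist the renamed directory entry
    finally:
        os.close(dir_fd)
```

The sketch does not model how stale contents of the recycled file are told apart from newly written records.
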
You do need metadata journaling for all non-WAL PG files, since we don't
fsync them every time we extend them; which means the filesystem could
lose track of which disk blocks belong to such a file, if it's not
journaled.
regards, tom lane