From: | Andres Freund <andres(at)2ndquadrant(dot)com> |
---|---|
To: | Jon Nelson <jnelson+pgsql(at)jamponi(dot)net> |
Cc: | Jim Nasby <jim(at)nasby(dot)net>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: corrupt pages detected by enabling checksums |
Date: | 2013-05-13 13:32:48 |
Message-ID: | 20130513133248.GA27618@awork2.anarazel.de |
Lists: | pgsql-hackers |
On 2013-05-12 19:41:26 -0500, Jon Nelson wrote:
> On Sun, May 12, 2013 at 3:46 PM, Jim Nasby <jim(at)nasby(dot)net> wrote:
> > On 5/10/13 1:06 PM, Jeff Janes wrote:
> >>
> >> Of course the paranoid DBA could turn off restart_after_crash and do a
> >> manual investigation on every crash, but in that case the database would
> >> refuse to restart even in the case where it is perfectly clear that all the
> >> following WAL belongs to the recycled file and not the current file.
> >
> >
> > Perhaps we should also allow for zeroing out WAL files before reuse (or just
> > disable reuse). I know there's a performance hit there, but the reuse idea
> > happened before we had bgWriter. Theoretically the overhead of creating a new
> > file would always fall to bgWriter and therefore not be a big deal.
>
> For filesystems like btrfs, re-using a WAL file is suboptimal compared to
> simply creating a new one and removing the old one when it's no longer
> required. Using fallocate (or posix_fallocate) (I have a patch for
> that!) to create a new one is - by my tests - 28 times faster than the
> currently-used method.
I don't think the comparison between just fallocate()ing and what we
currently do is fair. fallocate() doesn't guarantee that the file is the
same size after a crash, so you would still need an fsync() or we
couldn't use fdatasync() anymore. And I'd guess the benefits aren't all
that big anymore in that case?
That said, using posix_fallocate seems like a good idea in lots of
places inside pg, it's just not all that easy to do in some of the
places.
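To make the point about the extra sync concrete, here is a minimal sketch of what "fallocate plus fsync" could look like. It is not PostgreSQL code: the function name preallocate_segment and its arguments are hypothetical, and it follows the concern stated above that the new file size has to be made durable with fsync() rather than assumed. A real implementation would probably also have to fsync the containing directory, which is left out here.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for posix_fallocate() */
#endif
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (an illustration, not PostgreSQL code): create a new
 * file at `path`, reserve `size` bytes with posix_fallocate(), then fsync()
 * it so the allocation and file size are flushed before the file is used.
 * Returns 0 on success, -1 on failure.
 */
int
preallocate_segment(const char *path, off_t size)
{
    /* O_EXCL: fail instead of silently reusing an existing file */
    int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);

    if (fd < 0)
        return -1;

    /* posix_fallocate() returns an error number directly, not via errno */
    if (posix_fallocate(fd, 0, size) != 0)
    {
        close(fd);
        unlink(path);
        return -1;
    }

    /* the extra sync discussed above; this is the cost being debated */
    if (fsync(fd) != 0)
    {
        close(fd);
        unlink(path);
        return -1;
    }

    return close(fd);
}
```

A caller would use something like preallocate_segment("newseg.tmp", 16 * 1024 * 1024) and then rename the file into place. Any timing comparison against the current zero-filling method would need to include that fsync() to be comparable.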
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services