From: | Andres Freund <andres(at)2ndquadrant(dot)com> |
---|---|
To: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
Cc: | Jeff Davis <pgsql(at)j-davis(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: regression test failed when enabling checksum |
Date: | 2013-04-02 09:45:51 |
Message-ID: | 20130402094551.GA2415@alap2.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2013-04-01 19:51:19 -0700, Jeff Janes wrote:
> On Mon, Apr 1, 2013 at 10:37 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>
> > On Tue, Mar 26, 2013 at 4:23 PM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> >
> >>
> >> Patch attached. Only brief testing done, so I might have missed
> >> something. I will look more closely later.
> >>
> >
> > After applying your patch, I could run the stress test described here:
> >
> > http://archives.postgresql.org/pgsql-hackers/2012-02/msg01227.php
> >
> > But altered to make use of initdb -k, of course.
> >
> > Over 10,000 cycles of crash and recovery, I encountered two cases of
> > checksum failures after recovery, example:
> > ...
> >
>
>
> > Unfortunately I already cleaned up the data directory before noticing the
> > problem, so I have nothing to post for forensic analysis. I'll try to
> > reproduce the problem.
> >
> >
> I've reproduced the problem, this time in block 74 of relation
> base/16384/4931589, and a tarball of the data directory is here:
>
> https://docs.google.com/file/d/0Bzqrh1SO9FcELS1majlFcTZsR0k/edit?usp=sharing
>
> (the table is in database jjanes under role jjanes, the binary is commit
> 9ad27c215362df436f8c)
>
> What I would probably really want is the data as it existed after the crash
> but before recovery started, but since the postmaster immediately starts
> recovery after the crash, I don't know of a good way to capture this.
>
> I guess one thing to do would be to extract from the WAL the most recent
> FPW for block 74 of relation base/16384/4931589 (assuming it hasn't been
> recycled already) and see if it matches what is actually in that block of
> that data file, but I don't currently know how to do that.
Since I bragged somewhere else recently that it should be easy to do now that
we have pg_xlogdump I hacked it up so it dumps all the full page writes into
the directory specified by --dump-bkp=PATH. It currently overwrites previous
full page writes to the same page but that should be trivial to change if you
want by adding %X.%X for the lsn into the path sprintf.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment | Content-Type | Size |
---|---|---|
0001-pg_xlogdump-add-option-for-dumping-full-page-writes-.patch | text/x-patch | 4.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2013-04-02 10:32:39 | Re: Page replacement algorithm in buffer cache |
Previous Message | Simon Riggs | 2013-04-02 09:07:02 | Re: regression test failed when enabling checksum |