From: | Tomonari Katsumata <t(dot)katsumata1122(at)gmail(dot)com> |
---|---|
To: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [BUG] Archive recovery failure on 9.3+. |
Date: | 2014-01-09 15:13:11 |
Message-ID: | CAC55fYf+=zf+xpgJKvFSCH9YaxJpTuaQLCOpAB9cqti-zx3zCg@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
Is anybody reading this thread?
This problem still seems to remain on REL9_3_STABLE.
Many users could hit this problem, so we should
resolve it in the next release.
I think his patch is a reasonable fix for this problem.
Please check this again.
regards,
--------------------------
Tomonari Katsumata
2013/12/12 Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
> Hello, we happened to see a server crash during archive recovery
> under certain conditions.
>
> After the TLI is incremented, there can be a case where the WAL
> file for the older timeline has been archived but the file with
> the same segment ID for the newer timeline has not. Archive
> recovery fails in that case with a PANIC error like the following:
>
> | PANIC: record with zero length at 0/1820D40
>
> The replay script is attached. This issue occurred on 9.4dev and
> 9.3.2, but not on 9.2.6 or 9.1.11. The latter versions search
> pg_xlog for a TLI before trying the archive for older TLIs.
>
> This occurs while fetching the checkpoint redo record during
> archive recovery.
>
> > if (checkPoint.redo < RecPtr)
> > {
> > /* back up to find the record */
> > record = ReadRecord(xlogreader, checkPoint.redo, PANIC, false);
>
> This is caused by the segment file for the older timeline in the
> archive directory being preferred over the one for the newer
> timeline in pg_xlog.
>
> Looking into pg_xlog before trying the older TLIs in the archive,
> as 9.2 and earlier do, fixes this issue. The attached patch is
> one possible solution for 9.4dev.
>
> The attached files are:
>
> - recvtest.sh: Replay script. Steps 1 and 2 set up the condition
> and step 3 triggers the issue.
>
> - archrecvfix_20131212.patch: The patch that fixes the issue. With
> this patch, archive recovery reads pg_xlog before trying older
> TLIs in the archive, as 9.2 and earlier do.
>
> regards,
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
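
For illustration, the two lookup orders described in the quoted report can be modeled with a small C sketch. This is a toy model under stated assumptions, not PostgreSQL code: `WalDir`, `has_file`, `pick_archive_all_tlis_first`, and `pick_per_tli` are hypothetical names, and the archive and pg_xlog are represented as plain arrays of file names. It assumes only the usual 24-hex-digit WAL segment file name, whose first 8 digits are the TLI.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a directory holding WAL segment files. */
typedef struct {
    const char *const *files;
    int nfiles;
} WalDir;

/* Returns 1 if the directory holds the segment `seg` (16 hex digits)
 * for timeline `tli`, else 0. */
static int has_file(const WalDir *dir, unsigned tli, const char *seg)
{
    char name[32];

    snprintf(name, sizeof(name), "%08X%s", tli, seg);
    for (int i = 0; i < dir->nfiles; i++)
        if (strcmp(dir->files[i], name) == 0)
            return 1;
    return 0;
}

/* Order reported as failing: every timeline is tried in the archive
 * before pg_xlog is consulted at all. `tlis` is ordered newest first.
 * Returns the TLI of the chosen file, or 0 if none is found. */
unsigned pick_archive_all_tlis_first(const unsigned *tlis, int n,
                                     const WalDir *archive,
                                     const WalDir *pg_xlog,
                                     const char *seg)
{
    for (int i = 0; i < n; i++)
        if (has_file(archive, tlis[i], seg))
            return tlis[i];
    for (int i = 0; i < n; i++)
        if (has_file(pg_xlog, tlis[i], seg))
            return tlis[i];
    return 0;
}

/* Order reported as working: for each timeline, newest first, both
 * the archive and pg_xlog are checked before moving to an older one. */
unsigned pick_per_tli(const unsigned *tlis, int n,
                      const WalDir *archive,
                      const WalDir *pg_xlog,
                      const char *seg)
{
    for (int i = 0; i < n; i++)
        if (has_file(archive, tlis[i], seg) ||
            has_file(pg_xlog, tlis[i], seg))
            return tlis[i];
    return 0;
}
```

With timelines {2, 1}, an archive holding only `000000010000000000000001`, and a pg_xlog holding only `000000020000000000000001`, the first function picks the TLI 1 file and the second picks the TLI 2 file, which mirrors the preference for the older timeline's archived segment described in the report.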