From: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
---|---|
To: | Andres Freund <andres(at)2ndquadrant(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>, matioli(dot)matheus(at)gmail(dot)com, pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>, Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com>, Максим Панченко <Panchenko(at)gw(dot)tander(dot)ru>, Сизов Сергей Павлович <sizov_sp(at)gw(dot)tander(dot)ru> |
Subject: | Re: Hot standby 9.2.6 -> 9.2.6 PANIC: WAL contains references to invalid pages |
Date: | 2014-01-13 21:29:59 |
Message-ID: | 52D45AD7.9080406@vmware.com |
Lists: | pgsql-bugs pgsql-hackers |
On 01/13/2014 11:02 PM, Andres Freund wrote:
> On 2014-01-13 22:40:32 +0200, Heikki Linnakangas wrote:
>> With RBM_NORMAL_ZERO_OK, AFAICS we're talking about a tiny patch to
>> XLogReadBufferExtended. bufmgr.c doesn't need to do anything about the new
>> mode, as it's XLogReadBuffer that does the check for zero pages. Per
>> attached patch (for demonstration purposes only, you also need to add the
>> new mode to the header file and adjust comments).
>
> I thought about that approach at first as well, but I am not so sure
> it's sufficient. Isn't it quite possible that we'd end up reading a page
> that was *partially* written during a crash and due to that has a
> corrupted checksum? There won't be any protection due to WAL replay/full
> page writes against that case here.
Good point. Normally, we expect the checksum to match on all pages that
we read during WAL replay, because full page writes will initialize any
page that is modified to an untorn state, before it's ever read. But we
can't rely on that in the extra read that btree_xlog_vacuum() does. It's
possible that there's a torn page on disk on block X, and we're
vacuuming page X + 1. The page will be fixed by a later record in the
WAL, before we reach consistency, but the ReadBuffer call from
btree_xlog_vacuum() will throw an error.
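
To illustrate the torn-page case described above, here is a toy model, not PostgreSQL code. The page layout, the checksum function, and every toy_* name are assumptions made up for this sketch. It only shows that a half-written page fails verification until a full-page image from a later WAL record overwrites it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 16

/* Hypothetical page: a stored checksum followed by the payload. */
typedef struct
{
    uint32_t checksum;
    uint8_t  data[TOY_PAGE_SIZE];
} ToyPage;

/* Toy polynomial checksum; NOT the algorithm PostgreSQL uses. */
uint32_t
toy_checksum(const uint8_t *data)
{
    uint32_t sum = 0;

    for (int i = 0; i < TOY_PAGE_SIZE; i++)
        sum = sum * 31u + data[i];
    return sum;
}

/* Fill a page with one byte value and stamp a matching checksum. */
void
toy_page_init(ToyPage *page, uint8_t fill)
{
    assert(page != NULL);
    memset(page->data, fill, TOY_PAGE_SIZE);
    page->checksum = toy_checksum(page->data);
}

/* True if the stored checksum matches the payload. */
bool
toy_page_verify(const ToyPage *page)
{
    assert(page != NULL);
    return page->checksum == toy_checksum(page->data);
}

/*
 * Simulate a crash in the middle of a page write: the new checksum and the
 * first half of the new payload reach disk, the second half stays old.
 */
void
toy_torn_write(ToyPage *disk, const ToyPage *new_image)
{
    assert(disk != NULL && new_image != NULL);
    disk->checksum = new_image->checksum;
    memcpy(disk->data, new_image->data, TOY_PAGE_SIZE / 2);
}

/* Replaying a full-page image overwrites the whole page, torn or not. */
void
toy_apply_full_page_image(ToyPage *disk, const ToyPage *image)
{
    assert(disk != NULL && image != NULL);
    *disk = *image;
}
```

A read of block X that happens between toy_torn_write() and toy_apply_full_page_image() is the problematic one: in this model toy_page_verify() returns false there, which corresponds to the error the extra ReadBuffer call would raise.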
> Now, you could argue that that shouldn't be the case because we're only
> entering that codepath once STANDBY_SNAPSHOT_READY and you might be
> right...
I don't think that saves us. standbyState can be STANDBY_SNAPSHOT_READY
before we reach consistency. Adding a check for reachedConsistency,
though, ought to fix it.
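
Roughly, the check would look like the sketch below. This is a standalone illustration, not a patch: the enum is redeclared here, modeled on PostgreSQL's HotStandbyState, so that the snippet compiles on its own, and the helper name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Redeclared for the sketch; modeled on PostgreSQL's HotStandbyState. */
typedef enum
{
    STANDBY_DISABLED,
    STANDBY_INITIALIZED,
    STANDBY_SNAPSHOT_PENDING,
    STANDBY_SNAPSHOT_READY
} HotStandbyState;

/*
 * Hypothetical helper: decide whether vacuum replay may do its extra page
 * reads. The snapshot being ready is not enough, because pages on disk can
 * still be torn until recovery has reached consistency.
 */
bool
vacuum_replay_may_read_extra_pages(HotStandbyState standbyState,
                                   bool reachedConsistency)
{
    assert(standbyState >= STANDBY_DISABLED &&
           standbyState <= STANDBY_SNAPSHOT_READY);

    if (standbyState != STANDBY_SNAPSHOT_READY)
        return false;           /* hot standby queries are not possible yet */

    return reachedConsistency;  /* only then are pages known to be untorn */
}
```

With STANDBY_SNAPSHOT_READY but reachedConsistency still false, the helper returns false, so the extra reads are skipped during the window in which a torn page could still be on disk.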
- Heikki