Re: Recovery inconsistencies, standby much larger than primary

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Recovery inconsistencies, standby much larger than primary
Date: 2014-01-31 11:13:58
Message-ID: 20140131111358.GC13199@alap3.anarazel.de
Lists: pgsql-hackers

On 2014-01-31 11:09:14 +0000, Greg Stark wrote:
> On Sun, Jan 26, 2014 at 5:45 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> >
> >> We're also seeing log entries about "wal contains reference to invalid
> >> pages" but these errors seem only vaguely correlated. Sometimes we get
> >> the errors but the tables don't grow noticeably and sometimes we don't
> >> get the errors and the tables are much larger.
> >
> > Uhm. I am a bit confused. You see those in the standby's log? At !debug
> > log levels? That'd imply that the standby is dead and needed to be
> > recloned, no? How do you continue after that?

> So in chatting with Heikki last night we came up with a scenario where
> this check is insufficient.

But that seems unrelated to the issue at hand, right?

> If you have multiple checkpoints during the base backup then there
> will be restartpoints during recovery. If the reference to the invalid
> page is before the restartpoint then after crashing recovery and coming
> back up the recovery will go forward fine.

We don't perform restartpoints if there are invalid pages
registered. Check the XLogHaveInvalidPages() call in xlog.c.
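
To illustrate the idea, here is a toy model of that guard. It is only a
sketch and not the actual xlog.c code: every toy_* name and the plain
counter standing in for the invalid-page table are made up for
illustration. The point is that while an unresolved invalid-page
reference is outstanding, the restart location does not advance, so a
restarted recovery would have to replay over that reference again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the invalid-page table kept during replay. */
static int toy_invalid_page_count = 0;

/* Replay saw a WAL record that references a page which does not exist. */
static void toy_log_invalid_page(void)
{
    toy_invalid_page_count++;
}

/* A later record (e.g. a drop or truncate) explained the missing page. */
static void toy_forget_invalid_page(void)
{
    assert(toy_invalid_page_count > 0);
    toy_invalid_page_count--;
}

/* Plays the role that XLogHaveInvalidPages() plays in the real check. */
static bool toy_have_invalid_pages(void)
{
    return toy_invalid_page_count > 0;
}

/*
 * Try to record a restartpoint at checkpoint_lsn.  Returns true and
 * advances *last_restartpoint only when no invalid-page references are
 * outstanding; otherwise leaves *last_restartpoint untouched.
 */
static bool toy_try_restartpoint(unsigned long checkpoint_lsn,
                                 unsigned long *last_restartpoint)
{
    if (toy_have_invalid_pages())
    {
        printf("skipping restartpoint at %lu: unresolved invalid pages\n",
               checkpoint_lsn);
        return false;
    }
    *last_restartpoint = checkpoint_lsn;
    return true;
}
```

With this model, a restartpoint attempted after toy_log_invalid_page()
is refused and the previous restart location is kept; it is only
accepted again once toy_forget_invalid_page() has cleared the entry.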

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
