Re: [HACKERS] Re: 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Steve Kehlet <steve(dot)kehlet(at)gmail(dot)com>, Forums postgresql <pgsql-general(at)postgresql(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Re: 9.4.1 -> 9.4.2 problem: could not access status of transaction 1
Date: 2015-06-01 21:19:41
Message-ID: 20150601211941.GE2988@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Alvaro Herrera wrote:
> Robert Haas wrote:

> > In the process of investigating this, we found a few other things that
> > seem like they may also be bugs:
> >
> > - As noted upthread, replaying an older checkpoint after a newer
> > checkpoint has already happened may lead to similar problems. This
> > may be possible when starting from an online base backup; or when
> > restarting a standby that did not perform a restartpoint when
> > replaying the last checkpoint before the shutdown.
>
> I'm going through this one now, as it's closely related and caused
> issues for us.

FWIW I've spent a rather long while trying to reproduce the issue, but
haven't been able to figure out. Thomas already commented on it: the
server notices that a file is missing and, instead of failing, it "reads
the file as zeroes". This is because of this hack in slru.c, which is
there to cover for a pg_clog replay consideration:

/*
* In a crash-and-restart situation, it's possible for us to receive
* commands to set the commit status of transactions whose bits are in
* already-truncated segments of the commit log (see notes in
* SlruPhysicalWritePage). Hence, if we are InRecovery, allow the case
* where the file doesn't exist, and return zeroes instead.
*/
fd = OpenTransientFile(path, O_RDWR | PG_BINARY, S_IRUSR | S_IWUSR);
if (fd < 0)
{
if (errno != ENOENT || !InRecovery)
{
slru_errcause = SLRU_OPEN_FAILED;
slru_errno = errno;
return false;
}

ereport(LOG,
(errmsg("file \"%s\" doesn't exist, reading as zeroes",
path)));
MemSet(shared->page_buffer[slotno], 0, BLCKSZ);
return true;
}

I was able to cause an actual problem by #ifdefing out the "if
errno/inRcovery" line, of course, but how can I be sure that fixing that
would also fix my customer's problem? That's no good.

Anyway here's a quick script to almost-reproduce the problem. I
constructed many variations of this, trying to find the failing one, to
no avail. Note I'm using the pg_burn_multixact() function to create
multixacts quickly (this is cheating in itself, but it seems to me that
if I'm not able to reproduce the problem with this, I'm not able to do
so without it either.) This script takes an exclusive backup
(cp -pr) excluding pg_multixact, then does some multixact stuff
(including truncation), then copies pg_multixact. Evidently I'm missing
some critical element but I can't see what it is.

Another notable variant was to copy pg_multixact first, then do stuff
(create lots of multixacts, then mxact-freeze/checkpoint),
then copy the rest of the data dir. No joy either: when replay starts,
the missing multixacts are created on the standby and so by the time we
reach the checkpoint record the pg_multixact/offset file has already
grown to the necessary size.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Alvaro Herrera 2015-06-01 21:30:14 Re: [HACKERS] Re: 9.4.1 -> 9.4.2 problem: could not access status of transaction 1
Previous Message Adrian Klaver 2015-06-01 20:57:39 Re: odbc to emulate mysql for end programs

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2015-06-01 21:30:14 Re: [HACKERS] Re: 9.4.1 -> 9.4.2 problem: could not access status of transaction 1
Previous Message Tom Lane 2015-06-01 21:18:35 Re: nested loop semijoin estimates