Re: BUG #8632: file "pg_subtrans/CEC0" doesn't exist, reading as zeroes

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: strahinjak(at)nordeus(dot)com
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #8632: file "pg_subtrans/CEC0" doesn't exist, reading as zeroes
Date: 2013-11-28 02:56:54
Message-ID: 20131128025653.GD5513@eldon.alvh.no-ip.org
Lists: pgsql-bugs

strahinjak(at)nordeus(dot)com wrote:

Uh, this is a bit funny. It failed to find the file for a long time and
didn't think to error out, instead choosing to read the requested page
as zeroes:

> 2013-11-26 12:24:54 CET [6393]: [7-1]LOG: file "pg_subtrans/CEC0" doesn't
> exist, reading as zeroes
> 2013-11-26 12:24:54 CET [6393]: [8-1]CONTEXT: xlog redo xid assignment xtop
> 3468737450: subxacts: 3468738448 3468738450 3468738453 3468738455 3468738457
> 3468738459 3468738461 3468738463 3468738465 3468738467 3468738469 3468738471
> 3468738473 3468738475 3468738477 3468738479 3468738481 3468738483 3468738485
> 3468738487 3468738489 3468738491 3468738493 3468738495 3468738497 3468738499
> 3468738501 3468738503 3468738505 3468738507 3468738530 3468738532 3468738534
> 3468738536 3468738538 3468738540 3468738542 3468738544 3468738546 3468738548
> 3468738550 3468738552 3468738554 3468738556 3468738558 3468738560 3468738562
> 3468738564 3468738566 3468738568 3468738570 3468738572 3468738574 3468738576
> 3468738578 3468738580 3468738582 3468738584 3468738586 3468738588 3468738590
> 3468738592 3468738595 3468738597

But as soon as the pg_subtrans file exists, any other error (seek or
read failure) is fatal:

> 2013-11-26 12:24:57 CET [6393]: [103-1]FATAL: could not access status of
> transaction 3468818432
> 2013-11-26 12:24:57 CET [6393]: [104-1]DETAIL: Could not read from file
> "pg_subtrans/CEC1" at offset 253952: Success.
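
For reference, the file name and offset in these messages can be
reproduced from the xid alone. The following is a minimal sketch, not
the server's code: it assumes a default build (8192-byte pages, 4-byte
xids, 32 pages per SLRU segment, four-hex-digit segment names), and the
SubtransLocation type and both function names are made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed default build constants; a custom build may differ. */
#define BLCKSZ              8192                  /* bytes per page */
#define XID_SIZE            4                     /* bytes per transaction id */
#define XACTS_PER_PAGE      (BLCKSZ / XID_SIZE)   /* 2048 entries per page */
#define PAGES_PER_SEGMENT   32                    /* pages per segment file */

/* Hypothetical helper type: where one xid's pg_subtrans entry lives. */
typedef struct SubtransLocation
{
    uint32_t segno;            /* segment number, i.e. the file name in hex */
    uint32_t page_in_segment;  /* page index within that segment file */
    uint32_t offset;           /* byte offset of that page within the file */
} SubtransLocation;

/* Map an xid to its segment, page within the segment, and byte offset. */
static SubtransLocation
subtrans_location(uint32_t xid)
{
    SubtransLocation loc;
    uint32_t pageno = xid / XACTS_PER_PAGE;

    loc.segno = pageno / PAGES_PER_SEGMENT;
    loc.page_in_segment = pageno % PAGES_PER_SEGMENT;
    loc.offset = loc.page_in_segment * BLCKSZ;
    return loc;
}

/* Format the relative file name for the segment holding this xid. */
static void
subtrans_file_name(uint32_t xid, char *out, size_t outlen)
{
    assert(out != NULL && outlen > 0);
    snprintf(out, outlen, "pg_subtrans/%04X",
             (unsigned int) subtrans_location(xid).segno);
}
```

Under those assumptions, xid 3468818432 lands on page 31 of segment
CEC1, at byte offset 31 * 8192 = 253952, which agrees with the DETAIL
line above. Subxact 3468738448 from the earlier message lands in
segment CEC0.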

Both these things happen in SlruPhysicalReadPage(). I think those hard
failures are a mistake. In other words, we should do something like
this, which matches what already happens on ENOENT:

    if (lseek(fd, (off_t) offset, SEEK_SET) < 0)
    {
        if (InRecovery)
        {
            ereport(LOG,
                    (errmsg("file \"%s\" doesn't contain page %u, reading as zeroes",
                            path, page_number)));
            MemSet(shared->page_buffer[slotno], 0, BLCKSZ);
            close(fd);      /* don't leak the descriptor on this early return */
            return true;
        }

        slru_errcause = SLRU_SEEK_FAILED;
        slru_errno = errno;
        close(fd);
        return false;
    }

And equivalently for the read() failure.
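
To make the combined behaviour concrete, here is a standalone sketch of
the same idea covering both the seek and the read case. It is not
PostgreSQL code: read_page_or_zeroes(), PAGE_SIZE and the in_recovery
flag are hypothetical stand-ins for the SLRU internals, and error
reporting is left out. The ": Success" in the DETAIL line above
suggests read() came back short without setting errno, so the sketch
treats a short read the same way as an outright failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define PAGE_SIZE 8192      /* stands in for BLCKSZ; assumed default */

/*
 * Hypothetical helper, not PostgreSQL code.  Read one page from fd at
 * the given offset into buf.  If the seek fails or the read returns
 * less than a full page, then during recovery the buffer is zero-filled
 * and the call succeeds; outside recovery it returns false.  The caller
 * keeps ownership of fd and is responsible for closing it.
 */
static bool
read_page_or_zeroes(int fd, off_t offset, char *buf, bool in_recovery)
{
    assert(buf != NULL);

    if (lseek(fd, offset, SEEK_SET) >= 0 &&
        read(fd, buf, PAGE_SIZE) == PAGE_SIZE)
        return true;            /* got a full page */

    if (!in_recovery)
        return false;           /* normal operation: the failure stands */

    memset(buf, 0, PAGE_SIZE);  /* recovery: read the page as zeroes */
    return true;
}
```

Leaving the close() to the caller keeps the descriptor handling in one
place, so neither the zero-fill path nor the failure path can leak it.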

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
