Re: Streaming replication slave crash

From: Quentin Hartman <qhartman(at)direwolfdigital(dot)com>
To: Lonni J Friedman <netllama(at)gmail(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Streaming replication slave crash
Date: 2013-03-29 16:35:55
Message-ID: CAJ48qNZMgEv03O_rYS_K_E60CKWfanTSDpr34LjKhh_w67Uh9g@mail.gmail.com
Lists: pgsql-general

On Fri, Mar 29, 2013 at 10:23 AM, Lonni J Friedman <netllama(at)gmail(dot)com> wrote:

> Looks like you've got some form of corruption:
> page 1441792 of relation base/63229/63370 does not exist
>

Thanks for the insight. I thought that might be it, but never having seen
this before, I'm glad to have some confirmation.

> The question is whether it was corrupted on the master and then
> replicated to the slave, or if it was corrupted on the slave. I'd
> guess that the pg_dump tried to read from that page and barfed. It
> would be interesting to try re-running the pg_dump again to see if
> this crash can be replicated. If so, does it also replicate if you
> run pg_dump against the master? If not, then the corruption is
> isolated to the slave, and you might have a hardware problem which is
> causing the data to get corrupted.
>
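
[Editor's sketch, not part of the original message.] The check suggested in
the quoted text above (re-run pg_dump on the slave, then on the master, and
compare) could be scripted roughly as follows. This is only a minimal sketch
under assumptions: the host names and database name in the usage comment are
hypothetical, the helper names are made up for illustration, and pg_dump is
assumed to be on the PATH with credentials already configured.

```python
import subprocess


def dump_succeeds(host, dbname, run=subprocess.run):
    """Return True if pg_dump of dbname on host exits with status 0.

    The dump itself is discarded; only the exit status matters here.
    `run` is injectable so the logic can be exercised without a server.
    """
    cmd = ["pg_dump", "--host", host, dbname]
    result = run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.PIPE)
    return result.returncode == 0


def interpret(slave_ok, master_ok):
    """Map the two dump outcomes onto the reasoning in the quoted text."""
    if slave_ok:
        return "crash not reproduced on slave"
    if master_ok:
        return "failure isolated to slave"
    return "failure also on master (may have been replicated)"


# Hypothetical usage; host and database names are placeholders:
#   slave_ok = dump_succeeds("slave.example.com", "mydb")
#   master_ok = dump_succeeds("master.example.com", "mydb")
#   print(interpret(slave_ok, master_ok))
```

A failed dump is only a hint about where the damage lives; the server log on
each host is still the place to look for the actual error.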

Yes, we've gotten several clean dumps from the slave since then without
crashing. We're running these machines on EC2 so we sadly have no control
over the hardware. With your confirmation, and an apparently clean state
now, I'm inclined to chalk this up to an EC2 hiccup getting caught by
Postgres and get on with life.

Thanks!

QH
