Re: emergency outage requiring database restart

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: emergency outage requiring database restart
Date: 2016-10-17 19:16:02
Message-ID: CAHyXU0wDNLi-dtbRwLZv0_Eqr_dXZJHvD4pEx3mYZQ6pTjVVZw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Oct 17, 2016 at 2:04 PM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> Merlin Moncure wrote:
>
>> castaging=# CREATE OR REPLACE VIEW vw_ApartmentSample AS
>> castaging-# SELECT ...
>> ERROR: 42809: "pg_cast_oid_index" is an index
>> LINE 11: FROM ApartmentSample s
>> ^
>> LOCATION: heap_openrv_extended, heapam.c:1304
>>
>> should I be restoring from backups?
>
> It's pretty clear to me that you've got catalog corruption here. You
> can try to fix things manually as they emerge, but that sounds like a
> fool's errand.

agreed. current plan is to restore from backups, and recover as much
data as I can. Also doing bugfix release and going to enable
checksums.

We had several good backups since the previous outage so it's not
clear the events are related but after months of smooth operation I
find that coincidence highly suspicious. As always, we need to suspect
hardware problems but I'm highly abstracted from them -- using esx +
san.

merlin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2016-10-17 19:21:31 Re: emergency outage requiring database restart
Previous Message Alvaro Herrera 2016-10-17 19:04:03 Re: emergency outage requiring database restart