Re: corrupted item pointer in streaming based replication

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jigar Shah <jshah(at)pandora(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: corrupted item pointer in streaming based replication
Date: 2013-04-03 20:18:52
Message-ID: 23013.1365020332@sss.pgh.pa.us
Lists: pgsql-general

Jigar Shah <jshah(at)pandora(dot)com> writes:
> Postgres version = 9.1.2

Um, you do realize this is over a year out of date right?
(Fortunately, you will have an excellent opportunity to update tomorrow.)

> A few days ago we had a situation where our Primary started to throw the error messages below, indicating corruption in the database. It crashed sometimes and showed a panic message in the logs

> [d: u:radio p:31917 242] ERROR: could not open file "base/16384/114846.39" (target block 360448000): No such file or directory [d: u:radio p:31917 243]

> 2013-03-27 11:07:51.348 PDT FATAL: corrupted item pointer: offset = 0, size = 0
> 2013-03-27 11:07:51.348 PDT CONTEXT: xlog redo split_l: rel 1663/16384/115085 left 4256959, right 5861610, next 5044459, level 0, firstright 192

Look up relfilenodes 114846 and 115085 in pg_class of whichever database
has OID 16384. I'm guessing the latter is an index of the former. If
that's true, then both of these messages suggest corruption in the index
--- the latter pretty obviously, and the former because it looks like
it's an attempt to fetch from a silly block number, which could have
come out of a corrupted index entry. So if you're really lucky and
nothing but that index is corrupted, a REINDEX will fix it. Personally
I'd be wondering about what's the underlying cause and whether there is
corruption elsewhere, though. Try looking for evidence of flaky RAM or
flaky disk drives on your primary. See if you can pg_dump (not just
for forensic reasons, but so you've got some kind of backup if things
go downhill from here).
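
As a rough illustration of the lookup and the "silly block number" reasoning above, here is a minimal Python sketch. It assumes a default build (8 kB blocks, 1 GB segment files, i.e. 131072 blocks per segment); the helper names parse_rel_path, segment_for_block and lookup_queries are made up for this example and are not part of PostgreSQL. It only builds the catalog queries as text, to be run by hand in psql while connected to the database whose OID is 16384.

```python
import re

# Assumptions: default build settings, not confirmed for this installation.
BLOCK_SIZE = 8192             # bytes per block (default BLCKSZ)
BLOCKS_PER_SEGMENT = 131072   # blocks per 1 GB segment file (default RELSEG_SIZE)


def parse_rel_path(path):
    """Split 'base/<db oid>/<relfilenode>[.<segment>]' into three integers."""
    m = re.fullmatch(r"base/(\d+)/(\d+)(?:\.(\d+))?", path)
    if m is None:
        raise ValueError("unexpected relation path: %r" % path)
    db_oid, relfilenode, segment = m.groups()
    # A path without a '.N' suffix is the first segment, numbered 0 here.
    return int(db_oid), int(relfilenode), int(segment or 0)


def segment_for_block(block):
    """Return (segment number, block offset within that segment)."""
    return divmod(block, BLOCKS_PER_SEGMENT)


def lookup_queries(db_oid, relfilenodes):
    """Build the catalog queries as plain SQL text (hypothetical helper)."""
    nodes = ", ".join(str(int(n)) for n in relfilenodes)
    return [
        # Which database has this OID?
        "SELECT datname FROM pg_database WHERE oid = %d;" % int(db_oid),
        # What are these relfilenodes? relkind 'r' = table, 'i' = index.
        "SELECT oid::regclass AS relation, relkind, relfilenode "
        "FROM pg_class WHERE relfilenode IN (%s);" % nodes,
        # If any of them is an index, which table does it belong to?
        "SELECT indexrelid::regclass AS index_name, "
        "indrelid::regclass AS table_name FROM pg_index "
        "WHERE indexrelid IN "
        "(SELECT oid FROM pg_class WHERE relfilenode IN (%s));" % nodes,
    ]


if __name__ == "__main__":
    db_oid, relfilenode, seg_in_error = parse_rel_path("base/16384/114846.39")
    seg_needed, offset = segment_for_block(360448000)
    print("segment named in the error:", seg_in_error)
    print("segment the target block would live in:", seg_needed)
    print("implied byte offset:", 360448000 * BLOCK_SIZE)
    for query in lookup_queries(db_oid, [relfilenode, 115085]):
        print(query)
```

Under those assumed defaults, block 360448000 would sit in segment 2750, roughly 2.7 TiB into the relation, while the file named in the error is only segment 39. That gap is consistent with the block number having come out of a corrupted index entry rather than from real data.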

regards, tom lane
