Re: Data corruption

From: Ron <ronljohnsonjr(at)gmail(dot)com>
To: pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Re: Data corruption
Date: 2021-07-21 18:33:45
Message-ID: 608090c5-0972-9fee-fb6f-9462ee3fc941@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

I've zeroes out Pg 8.4 pc_clog files using dd (instructions are out there on
the web) without any visible negative effect.

Naturally, YMMV.

On 7/21/21 11:14 AM, Konrad Garus wrote:
> Hello,
>
> we have a database that apparently got corrupted in a crash a few months
> ago, long enough that there is no "good" backup.
>
> There are at least two tables occasionally emitting errors like this:
>
> ERROR: invalid page header in block 39684 of relation base/12926/32397
> while performing SQL query: SQL DML: insert into foo (a, b, c) values (?,
> ?, ?) | PARAMETERS: foo, bar, baz: org.postgresql.util.PSQLException:
> ERROR: invalid page header in block 39684 of relation base/12926/32397
>
> The files in question (e.g. 32397) correspond to the table itself, not an
> index. The larger of these tables has two files (32397 & 32397.1), almost
> 2G total.
>
> To make things worse, vacuum ends with:
>
> ERROR:  could not access status of transaction 2828785257
> DETAIL:  Could not open file "pg_clog/0A89": No such file or directory.
>
> That file is nowhere to be found.
>
> When I do this:
>
> set zero_damaged_pages = on;
> select count(*) from foo
>
> It shows 3 errors like this:
>
> WARNING:  invalid page header in block 78550 of relation base/12926/31642;
> zeroing out page
> WARNING:  invalid page header in block 78551 of relation base/12926/31642;
> zeroing out page
> WARNING:  invalid page header in block 78552 of relation base/12926/31642;
> zeroing out page
>
> One idea I found is to generate a zero file for the missing clog, set
> zero_damaged_pages=on, then vacuum or recreate the table. Looks like it
> might work, but I'm trying to understand the implications.
>
> 1. When I generate a zero clog file like this, what impact will it have on
> everything? Can it break anything else in the database, introduce some
> kind of inconsistency with other (correctly committed) table data?
>
> 2. How can I gauge how much actual data (e.g. how many rows) will be lost
> by doing this (zero clog file + vacuum/recreate table)?
>
> 3. Is there some place I can look to reliably tell how many rows the table
> should have, so I can compare to how many I can read after setting
> zero_damaged_pages?
>
> 4. Is there any way at all to recover/repair the corrupted data at this point?
>
> --
> Konrad Garus

--
Angular momentum makes the world go 'round.

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Laurenz Albe 2021-07-21 20:02:15 Re: Data corruption
Previous Message Tom Lane 2021-07-21 18:17:18 Re: Postgres 13.3 times out when attempting to connect via odbc & pgAdmin4