Re: multiple tables got corrupted

From: Vasu Madhineni <vasumdba1515(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: multiple tables got corrupted
Date: 2020-09-15 10:46:34
Message-ID: CAFacQoFDNM1A67RUQv2WLj2224NC8rcL8rdcxoxHsahKeKdXMQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Is it possible to identify which rows are corrupted in particular tables.

On Tue, Sep 15, 2020 at 5:36 PM Magnus Hagander <magnus(at)hagander(dot)net> wrote:

>
>
> On Tue, Sep 15, 2020 at 11:15 AM Vasu Madhineni <vasumdba1515(at)gmail(dot)com>
> wrote:
>
>> Hi All,
>>
>> In one of my postgres databases multiple tables got corrupted and
>> followed the below steps but still the same error.
>>
>> 1.SET zero_damaged_pages = on
>> 2. VACUUM ANALYZE, VACUUM FULL
>> but still same error.
>>
>
>
> That is a very destructive first attempt. I hope you took a full
> disk-level backup of the database before you did that, as it can ruin your
> chances for forensics and data recovery for other issues.
>
>
> moh_fa=# VACUUM FULL;
>> ERROR: could not read block 9350 in file "base/1156523/1270812":
>> Input/output error
>>
>> Tried to take backup of tables with pg_dump but same error. files exist
>> physically in base location.
>>
>> How to proceed on this, no backup to restore.
>>
>>
> This is clearly some sort of disk error, and with no backups to restore
> you will definitely be losing data.
>
> I'd start by figuring out which tables have no corruption and do work, and
> back those up (with pg_dump for example) as soon as possible to a different
> machine -- since it's not exactly unlikely that further disk errors will
> appear.
>
> Once you've done that, identify the tables, and then try to do partial
> recovery. For example, if you look at the file 1270812, how big it is?
> PostgreSQL is failing to read block 9350 which is 76595200 bytes into the
> file. If this is at the very end of the file, you can for example try to
> get the data out until that point with LIMIT. If it's in the middle of the
> file, it gets more ticky, but similar approaches can be done.
>
> Also, unless you are running with data checksums enabled, I wouldn't fully
> trust the data in the tables that you *can* read either. Since you clearly
> have disk issues, they may have caused corruption elsewhere as well, so
> whatever verification you can do against other tables, you should do as
> well.
>
>
> You'll of course also want to check any kernel logs or storage system logs
> to see if they can give you a hint as to what happened, but they are
> unlikely to actually give you something that will help you fix the problem.
>
> --
> Magnus Hagander
> Me: https://www.hagander.net/ <http://www.hagander.net/>
> Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Rébeli-Szabó Tamás 2020-09-15 14:12:39 How to calculate shared memory size
Previous Message Magnus Hagander 2020-09-15 09:36:02 Re: multiple tables got corrupted