Re: multiple tables got corrupted

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Vasu Madhineni <vasumdba1515(at)gmail(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: multiple tables got corrupted
Date: 2020-09-18 07:42:26
Message-ID: CABUevEypD1tBwan0bV5_gUnJP-k7Ody-j4kdYnPRgJzARcRw_A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

That depends on what the problem is and how they fix it. Most likely yes --
especially since if you haven't enabled data checksums you won't *know* if
things are OK or not. So I'd definitely recommend it even if things *look*
OK.

//Magnus

On Wed, Sep 16, 2020 at 5:06 AM Vasu Madhineni <vasumdba1515(at)gmail(dot)com>
wrote:

> I could see block read I/O errors in /var/log/syslog. if those error fixed
> by OS team, will it require recovery.
>
> Also can i use LIMIT and OFFSET to locate corrupted rows?
>
> Thanks in advance
>
> Regards,
> Vasu Madhineni
>
> On Wed, Sep 16, 2020, 01:58 Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>
>> Try reading them "row by row" until it breaks. That is, SELECT * FROM ...
>> LIMIT 1, then LIMIT 2 etc. For more efficiency use a binary search starting
>> at what seems like a reasonable place looking at the size of the table vs
>> the first failed block to make it faster, but the principle is the same.
>> Once it fails, you've found a corrupt block...
>>
>> //Magnus
>>
>>
>> On Tue, Sep 15, 2020 at 12:46 PM Vasu Madhineni <vasumdba1515(at)gmail(dot)com>
>> wrote:
>>
>>> Is it possible to identify which rows are corrupted in particular tables.
>>>
>>> On Tue, Sep 15, 2020 at 5:36 PM Magnus Hagander <magnus(at)hagander(dot)net>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Tue, Sep 15, 2020 at 11:15 AM Vasu Madhineni <vasumdba1515(at)gmail(dot)com>
>>>> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> In one of my postgres databases multiple tables got corrupted and
>>>>> followed the below steps but still the same error.
>>>>>
>>>>> 1.SET zero_damaged_pages = on
>>>>> 2. VACUUM ANALYZE, VACUUM FULL
>>>>> but still same error.
>>>>>
>>>>
>>>>
>>>> That is a very destructive first attempt. I hope you took a full
>>>> disk-level backup of the database before you did that, as it can ruin your
>>>> chances for forensics and data recovery for other issues.
>>>>
>>>>
>>>> moh_fa=# VACUUM FULL;
>>>>> ERROR: could not read block 9350 in file "base/1156523/1270812":
>>>>> Input/output error
>>>>>
>>>>> Tried to take backup of tables with pg_dump but same error. files
>>>>> exist physically in base location.
>>>>>
>>>>> How to proceed on this, no backup to restore.
>>>>>
>>>>>
>>>> This is clearly some sort of disk error, and with no backups to restore
>>>> you will definitely be losing data.
>>>>
>>>> I'd start by figuring out which tables have no corruption and do work,
>>>> and back those up (with pg_dump for example) as soon as possible to a
>>>> different machine -- since it's not exactly unlikely that further disk
>>>> errors will appear.
>>>>
>>>> Once you've done that, identify the tables, and then try to do partial
>>>> recovery. For example, if you look at the file 1270812, how big it is?
>>>> PostgreSQL is failing to read block 9350 which is 76595200 bytes into the
>>>> file. If this is at the very end of the file, you can for example try to
>>>> get the data out until that point with LIMIT. If it's in the middle of the
>>>> file, it gets more ticky, but similar approaches can be done.
>>>>
>>>> Also, unless you are running with data checksums enabled, I wouldn't
>>>> fully trust the data in the tables that you *can* read either. Since you
>>>> clearly have disk issues, they may have caused corruption elsewhere as
>>>> well, so whatever verification you can do against other tables, you should
>>>> do as well.
>>>>
>>>>
>>>> You'll of course also want to check any kernel logs or storage system
>>>> logs to see if they can give you a hint as to what happened, but they are
>>>> unlikely to actually give you something that will help you fix the problem.
>>>>
>>>>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Vasu Madhineni 2020-09-18 08:16:30 Re: multiple tables got corrupted
Previous Message Øystein Kolsrud 2020-09-18 07:05:29 Re: PostgreSQL processes use large amount of private memory on Windows