Re: Re: corruption issue after server crash - ERROR: unexpected chunk number 0

From: Mike Broers <mbroers(at)gmail(dot)com>
To: Shaun Thomas <sthomas(at)optionshouse(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Re: corruption issue after server crash - ERROR: unexpected chunk number 0
Date: 2013-11-26 17:59:50
Message-ID: CAB9893hYnB=Z8TG57-8C71t_+5P=2em85S3O+y86CQOOU5Qubg@mail.gmail.com
Lists: pgsql-general

The restore of a post-crash production backup worked as hoped and the 2nd
replication slave is back into its happy hot standby state.

So if this problem replicated to our standby servers, does that indicate
that the potentially problematic fsync occurred during a pg_xlog write?
Would breaking replication at the time of the crash have prevented this
from cascading, or was it already too late at that point?

Thanks again for the input, it's been very helpful!
Mike

On Mon, Nov 25, 2013 at 12:20 PM, Mike Broers <mbroers(at)gmail(dot)com> wrote:

> Thanks Shaun,
>
> I'm planning to schedule a time to do the vacuum freeze suggested
> previously. So far the extent of the problem seems limited to the one
> session table and the one session row that was being used by a heavy bot
> scan at the time of the crash. Currently I'm testing a recovery of a
> production backup from today to rebase one of the replication targets that
> I was using to test fixes last week. Hopefully that validates the current
> backups and I can proceed inquiring with our managed services provider
> about the false notification of the disk write and ways to prevent that
> going forward.
>
> I'll update the list if I uncover anything interesting in the process
> and/or need more advice. Thanks again for your input - it's much appreciated
> as always. Nothing like a little crash corruption to get the blood flowing!
>
> Mike
>
>
> On Mon, Nov 25, 2013 at 10:29 AM, Shaun Thomas <sthomas(at)optionshouse(dot)com> wrote:
>
>> > Update - I have two hot replication slaves of this db, both have the
>> > problem. I took one out of recovery and ran REINDEX TABLE session_session
>> > and it fixed the errors about this row. Now I'm going to run vacuum and
>> > see if there are other tables that complain, but I'm guessing if so I
>> > will need to see if there is a way to force vacuum to continue on error;
>> > worst case I might have to script a table-by-table vacuum, I guess. If
>> > anyone has a better suggestion for determining the extent of the damage,
>> > I'd appreciate it.
>>
>> Oh man. I'm sorry, Mike.
>>
>> One of the cardinal rules I have is to disconnect any replication
>> following a database crash. It's just too easy for damaged replicated rows
>> to be propagated unless you're on 9.3 and have checksums enabled. If you
>> want to perform a table-by-table check, don't vacuum the whole database;
>> vacuum the individual tables. Since VACUUM can't run inside a DO block,
>> I'd generate a per-table script and capture the output in a log so you can
>> investigate further:
>>
>> -- generate one VACUUM statement per ordinary table
>> COPY (
>>   SELECT 'VACUUM ' || oid::regclass::text || ';'
>>   FROM pg_class
>>   WHERE relkind = 'r'
>> ) TO '/tmp/vac_all.sql';
>>
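>> Run the /tmp/vac_all.sql through psql and pipe the output into a log
>> file; for example (a rough sketch - the database name and log path are
>> just placeholders):
>>
>> psql -d yourdb -f /tmp/vac_all.sql 2>&1 | tee /tmp/vac_all.log
>>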
>> Any table that doesn't vacuum successfully will need to be repaired
>> manually. One way you can do this if there are dupes is by checking the
>> ctid value after disabling index scans:
>>
>> SET enable_indexscan TO False;
>>
>> SELECT ctid, * FROM [broken_table] WHERE ...;
>>
>> Just construct the WHERE clause based on the error output, and you should
>> get all rows if there are dupes. You'll need to figure out which row to
>> keep, then delete the bad row based on the ctid. Do this as many times as
>> it takes, then reindex to make sure the proper row versions are indexed.
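>>
>> As a sketch (the table name and ctid here are placeholders for whatever
>> the error output points you at):
>>
>> -- remove the damaged duplicate by its physical location, keeping the good row
>> DELETE FROM broken_table WHERE ctid = '(1234,5)';
>>
>> -- rebuild the indexes so only the surviving row versions are indexed
>> REINDEX TABLE broken_table;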
>>
>> It's also a good idea to dump any table that came back with an error,
>> just in case.
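>>
>> For example (database and table names are placeholders):
>>
>> pg_dump -t broken_table yourdb > /tmp/broken_table_backup.sql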
>>
>> After you've done all of that, you should re-base your replicas once
>> you've determined your production system is usable. In the meantime, I
>> highly recommend you set up a VIP you can assign to one of your replicas if
>> your production system dies again, and remove any autostart code. If your
>> production system crashes, switch the VIP immediately to a replica, and
>> invalidate your old production system. Data corruption is insidious when
>> streaming replication is involved.
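>>
>> Re-basing a replica is essentially taking a fresh base backup from the
>> surviving primary, along these lines (host, user and data directory are
>> placeholders, and the exact options vary a bit by version):
>>
>> pg_basebackup -h primary.example.com -U repl_user -D /path/to/new/data -P -X stream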
>>
>> Look into tools like repmgr to handle managing your replicas as a cluster
>> to make forced invalidation and re-basing easier.
>>
>> Good luck!
>>
>> --
>> Shaun Thomas
>> OptionsHouse | 141 W. Jackson Blvd | Suite 500 | Chicago IL, 60604
>> 312-676-8870
>> sthomas(at)optionshouse(dot)com
>>
>> ______________________________________________
>>
>> See http://www.peak6.com/email_disclaimer/ for terms and conditions
>> related to this email
>>
>
>
