Re: Trying to handle db corruption 9.6

From: Mariel Cherkassky <mariel(dot)cherkassky(at)gmail(dot)com>
To: Bimal <internetuser2008(at)yahoo(dot)com>
Cc: Greg Clough <Greg(dot)Clough(at)ihsmarkit(dot)com>, PostgreSQL mailing lists <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Trying to handle db corruption 9.6
Date: 2019-05-20 17:20:33
Message-ID: CA+t6e1=+Wmw8WwrMTQR0umR3ktkGiVfFUho8xeSni2Jjm4wKtg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin pgsql-performance

Hey Greg,
Basically my backup was made after the first pg_resetxlog so I was wrong.
However, the customer had a secondary machine that wasn't synced for a
month. I have all the walls since the moment the secondary went out of
sync. Once I started it I hoped that it will start recover the wals and
fill the gap. However I got an error in the secondary :
2019-05-20 10:11:28 PDT 19021 LOG: entering standby mode
2019-05-20 10:11:28 PDT 19021 LOG: invalid primary checkpoint record
2019-05-20 10:11:28 PDT 19021 LOG: invalid secondary checkpoint link in
control file
2019-05-20 10:11:28 PDT 19021 PANIC: could not locate a valid checkpoint
record
2019-05-20 10:11:28 PDT 19018 LOG: startup process (PID 19021) was
terminated by signal 6: Aborted
2019-05-20 10:11:28 PDT 19018 LOG: aborting startup due to startup
process failure
2019-05-20 10:11:28 PDT 19018 LOG: database system is shut down.
I checked my secondary archive dir and pg_xlog dir and
it seems that the restore command doesnt work. My restore_command:
restore_command = 'rsync -avzhe ssh postgres(at)x(dot)x(dot)x(dot)x:/var/lib/pgsql/archive/%f
/var/lib/pgsql/archive/%f ; gunzip < /var/lib/pgsql/archive/%f > %p'
archive_cleanup_command = '/usr/pgsql-9.6/bin/pg_archivecleanup
/var/lib/pgsql/archive %r'

On Mon, May 20, 2019, 7:20 PM Bimal <internetuser2008(at)yahoo(dot)com> wrote:

> I had ran into same issue about year back, luckily I had standby to
> quickly promote. But, I wish there was better a documentation on how to
> handle WAL log fill up and resetting them.
>
> On Monday, May 20, 2019, 9:08:19 AM PDT, Mariel Cherkassky <
> mariel(dot)cherkassky(at)gmail(dot)com> wrote:
>
>
> A backup was made after the corruption appeared but before I tried using
> the pg_resetxlog command. Basically I just want to start the database with
> the data that is available in the files(I'm ok with loosing data that was
> in the cache and wasnt written to disk).
> My question is how can I continue from here ?
> I also sent this mail to pgadmin mail list..
>
> ‫בתאריך יום ב׳, 20 במאי 2019 ב-18:59 מאת ‪Greg Clough‬‏ <‪
> Greg(dot)Clough(at)ihsmarkit(dot)com‬‏>:‬
>
> > Yes I understand that.. I'm trying to handle it after the backup that I
> have taken..
>
>
> IMHO the best option here is to keep safe a copy as you have already done
> and then restore from a backup, and replay whatever WAL you have. The
> database you have is terminally corrupted, and should never be relied upon
> going forward.
>
>
>
> You can try to get it running, and then extract the data with pg_dump...
> but even then you will need to manually verify it’s OK because you have no
> idea which dirty blocks from memory have been written to disk and which
> have not. Without the WAL you have no way of making it consistent, and if
> they have been destroyed then you’re out of luck.
>
>
>
> If you don’t have backups and archived WAL then fixing what you’ve got may
> be your only option, but you should only go down that route if you have
> to. If you have to “repair”, then I’d recommend engaging a reputable
> PostgreSQL consultancy to help you.
>
>
>
> Regards,
>
> Greg.
>
> P.S. This conversation should probably be moved to something like
> pgsql-admin
>
>
>
> ------------------------------
>
> This e-mail, including accompanying communications and attachments, is
> strictly confidential and only for the intended recipient. Any retention,
> use or disclosure not expressly authorised by IHSMarkit is prohibited. This
> email is subject to all waivers and other terms at the following link:
> https://ihsmarkit.com/Legal/EmailDisclaimer.html
>
> Please visit www.ihsmarkit.com/about/contact-us.html for contact
> information on our offices worldwide.
>
>

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Tomas Vondra 2019-05-20 20:55:53 Re: Trying to handle db corruption 9.6
Previous Message Bimal 2019-05-20 16:20:45 Re: Trying to handle db corruption 9.6

Browse pgsql-performance by date

  From Date Subject
Next Message Jeremy Altavilla 2019-05-20 20:19:48 Re: Analyze results in more expensive query plan
Previous Message Bimal 2019-05-20 16:20:45 Re: Trying to handle db corruption 9.6