Re: Checkpoint Err on Startup of Rsynced System

From: Jim Longwill <longwill(at)psmfc(dot)org>
To: Scott Mead <scottm(at)openscg(dot)com>
Cc: PostgreSQL General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Checkpoint Err on Startup of Rsynced System
Date: 2016-05-31 22:11:41
Message-ID: 574E0C1D.5050507@psmfc.org
Lists: pgsql-general

Scott,
Thanks. If I understand you correctly: actually, we did have M1
shut down when the initial clone was done (some weeks ago). That was done
using the VMware system, not rsync. My main problem is that I don't
have WAL archiving set up yet (I've not changed the Postgres defaults on
this so far). That's part of what the new machine M2 is for: to
practice doing this before adjusting our production machine (M1). As
regards doing a snapshot, I thought that the manual 'CHECKPOINT' would
take care of it.
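
For reference, a minimal sketch of the cold-copy variant of this (stop M1, confirm the cluster was shut down cleanly, then rsync). The data directory path, the host name m2, and the RUN dry-run switch are assumptions for illustration, not details from this thread. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Cold (offline) copy sketch. All paths and host names below are placeholders.
PGDATA="${PGDATA:-/var/lib/pgsql/data}"      # assumed data directory on M1
DEST="${DEST:-m2:/var/lib/pgsql/data/}"      # assumed rsync target on M2
RUN="${RUN-echo}"   # prints commands by default; set RUN= (empty) to execute

# Succeeds only if pg_controldata output on stdin reports a clean shutdown.
is_cleanly_shut_down() {
    grep -q '^Database cluster state: *shut down$'
}

cold_copy() {
    $RUN pg_ctl -D "$PGDATA" stop -m fast || return 1
    if [ -z "$RUN" ]; then
        # Only meaningful when the commands are really executed.
        pg_controldata "$PGDATA" | is_cleanly_shut_down || return 1
    fi
    # --checksum compares file contents instead of relying on size and mtime.
    $RUN rsync -a --delete --checksum "$PGDATA/" "$DEST" || return 1
    $RUN pg_ctl -D "$PGDATA" start
}

cold_copy
```

A manual CHECKPOINT on a running server does not make the files on disk a consistent snapshot, which is why the sketch checks the control file state after the stop rather than relying on the checkpoint.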

So, this time around I may try a manual rebuild (initdb, then pg_restore
from backup files) on the new machine in order to get a roughly
equivalent installation going. One early goal I have is to get
archiving set up and working beyond the minimal level.
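
As a rough sketch of what the archiving plus start/stop backup procedure from Scott's reply could look like once archiving is on: the archive directory, backup label, data directory, and host name m2 are placeholders I have made up, and the RUN dry-run switch is only there so the script prints commands by default. The function names are the ones current in 2016; PostgreSQL 15 renamed them to pg_backup_start() and pg_backup_stop().

```shell
#!/bin/sh
# Online base backup sketch. Assumed postgresql.conf settings on M1
# (changing archive_mode needs a server restart; the directory is a placeholder):
#   wal_level = archive        # 'replica' on 9.6 and later
#   archive_mode = on
#   archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'

PGDATA="${PGDATA:-/var/lib/pgsql/data}"      # assumed data directory on M1
DEST="${DEST:-m2:/var/lib/pgsql/data/}"      # assumed rsync target on M2
LABEL="${LABEL:-m2_clone}"                   # arbitrary backup label
RUN="${RUN-echo}"   # prints commands by default; set RUN= (empty) to execute

online_copy() {
    $RUN psql -c "SELECT pg_start_backup('$LABEL');" || return 1
    # Files may change or vanish while rsync runs; that is expected here.
    $RUN rsync -a --delete --exclude=postmaster.pid "$PGDATA/" "$DEST"
    rc=$?
    # Always leave backup mode, even if rsync reported a problem.
    $RUN psql -c "SELECT pg_stop_backup();" || return 1
    return $rc
}

online_copy
```

The copy on M2 is only usable together with the WAL segments written between the two calls, so M2 would still need access to the archive (or a copy of those segments) when it first starts.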

Jim Longwill

On 05/31/2016 11:50 AM, Scott Mead wrote:
>
>
> On Tue, May 31, 2016 at 1:13 PM, Jim Longwill <longwill(at)psmfc(dot)org> wrote:
>
> I am trying to setup a 2nd, identical, db server (M2) for
> development and I've run into a problem with starting up the 2nd
> Postgres installation.
>
> Here's what I've done:
> 1) did a 'clone' of 1st (production) machine M1 (so both
> machines on CentOS 7.2)
> 2) set up an rsync operation, did a complete 'rsync' from M1 to M2
> 3) did a final 'CHECKPOINT' command on M1 postgres
> 4) shut down postgres on M1 with 'pg_ctl stop'
> 5) did final 'rsync' operation (then restarted postgres on M1
> with 'pg_ctl start')
> 6) tried to start up postgres on M2
>
> It won't start, & in the log file gives the error message:
> ...
> < 2016-05-31 09:02:52.337 PDT >LOG: invalid primary checkpoint record
> < 2016-05-31 09:02:52.337 PDT >LOG: invalid secondary checkpoint record
> < 2016-05-31 09:02:52.337 PDT >PANIC: could not locate a valid checkpoint record
> < 2016-05-31 09:02:53.184 PDT >LOG: startup process (PID 26680) was terminated by signal 6: Aborted
> < 2016-05-31 09:02:53.184 PDT >LOG: aborting startup due to startup process failure
>
> I've tried several times to do this but always get this result.
> So, do I need to do a new 'initdb..' operation on machine M2 +
> restore from M1 backups? Or is there another way to fix this?
>
>
> You should have stopped M1 prior to taking the backup. If you can't do
> that, it can be done online via:
>
> 1. Set up archiving
> 2. select pg_start_backup('some label');
> 3. <run rsync>
> 4. select pg_stop_backup();
>
> Without archiving and the pg_[start|stop]_backup, you're not
> guaranteed anything. You could use an atomic snapshot (LVM, storage,
> etc...), but it's got to be a true snapshot. Without that, you need
> archiving + start / stop backup.
>
> Last section of:
> https://wiki.postgresql.org/wiki/Simple_Configuration_Recommendation#Physical_Database_Backups
> will take you to:
> https://www.postgresql.org/docs/current/static/continuous-archiving.html
>
> --Scott
>
>
> --o--o--o--o--o--o--o--o--o--o--o--o--
> Jim Longwill
> PSMFC Regional Mark Processing Center
> JLongwill(at)psmfc(dot)org
> --o--o--o--o--o--o--o--o--o--o--o--o--
>
>
> --
> --
> Scott Mead
> Sr. Architect
> OpenSCG
> http://openscg.com
