Re: WAL scenario valid?

From: prakhar jauhari <prak840(at)gmail(dot)com>
To: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>, Keith Ouellette <Keith(dot)Ouellette(at)airgas(dot)com>, "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: WAL scenario valid?
Date: 2013-06-19 10:15:43
Message-ID: CAEd0_=8hg_LgFg+a7nNdeXvDfs0BbvmM8C42WtV9XOsfujDZJA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Hi,

Ya the error logs came, but only when the DB1 was corrupted. Here is a
excerpt from the logs.

*LOG: database system was shut down at 2013-06-19 09:40:55 UTC
LOG: restored log file "00000004.history" from archive
cp: cannot stat `/data/pgsql/archivedir/00000005.history': No such file or
directory
LOG: restored log file "00000004.history" from archive
LOG: entering standby mode
cp: cannot stat `/data/pgsql/archivedir/000000040000000000000013': No such
file or directory
cp: cannot stat `/data/pgsql/archivedir/000000030000000000000013': No such
file or directory
LOG: consistent recovery state reached at 0/13000080
LOG: record with zero length at 0/13000080
LOG: database system is ready to accept read only connections
cp: cannot stat `/data/pgsql/archivedir/000000040000000000000013': No such
file or directory
cp: cannot stat `/data/pgsql/archivedir/000000030000000000000013': No such
file or directory
cp: cannot stat `/data/pgsql/archivedir/00000005.history': No such file or
directory
LOG: streaming replication successfully connected to primary
ERROR: cannot execute CREATE ROLE in a read-only transaction
STATEMENT: CREATE USER replicationuser REPLICATION;
LOG: invalid record length at 0/13000080
FATAL: terminating walreceiver process due to administrator command
cp: cannot stat `/data/pgsql/archivedir/000000040000000000000013': No such
file or directory
cp: cannot stat `/data/pgsql/archivedir/000000040000000000000013': No such
file or directory
cp: cannot stat `/data/pgsql/archivedir/00000005.history': No such file or
directory
LOG: invalid record length at 0/13000080
*

This is what happened..
When DB1 came up as a new standby, It connected SR with DB2 (new master)
then after some time it terminated the walreceiver process.
And the invalid record length logs went on, until i took base backup on DB1
and restarted it.

Also On DB1 (before DB1 was started in standby mode) : pg_controldata
/data/pgsql/9.2/data/

pg_control version number: 922
Catalog version number: 201204301
Database system identifier: 5891091665573732008
Database cluster state: shut down
pg_control last modified: Wed Jun 19 09:40:55 2013
Latest checkpoint location: 0/13000020
Prior checkpoint location: 0/124F1BC0
Latest checkpoint's REDO location: 0/13000020
Latest checkpoint's TimeLineID: 3
Latest checkpoint's full_page_writes: off
Latest checkpoint's NextXID: 0/8409
Latest checkpoint's NextOID: 18470
Latest checkpoint's NextMultiXactId: 1
Latest checkpoint's NextMultiOffset: 0
Latest checkpoint's oldestXID: 669
Latest checkpoint's oldestXID's DB: 1
Latest checkpoint's oldestActiveXID: 0
Time of latest checkpoint: Wed Jun 19 09:40:54 2013
Minimum recovery ending location: 0/0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
Current wal_level setting: hot_standby
Current max_connections setting: 300
Current max_prepared_xacts setting: 0
Current max_locks_per_xact setting: 64
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value

Will it be wrong to compare* the "last replayed xlog id" on DB2* (which is
the new master) with "Latest checkpoint location" on DB1 before starting
DB1 in standby mode and if "Latest checkpoint location" on DB1(old master)
is greater than *"last replayed xlog id*" on DB2(new master) then i have to
go for basebackup.

regards,
Prakhar.

On Wed, Jun 19, 2013 at 1:11 PM, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>wrote:

> prakhar jauhari wrote:
> > I am facing a similar kind of problem, but in a two node setup.
> Streaming replication is being used
> > with a high wal_keep_segments, using log shipping to bump up timelines
> on the standby server to setup
> > SR(streaming replication).
> > DB1 - master
> >
> > DB2 - standby
> >
> >
> > When a switchover happens DB2 becomes the new master and when DB1 comes
> up it will act as the standby
> > to the new master (.history files from new master are copied to DB1 to
> bump up its timeline so as to
> > setup SR). DB1 is not recreated from scratch. This runs fine in normal
> switchover, but there seems to
> > be problem in the following situation, leading to database corruption:
> >
> >
> > Current state :
> > DB1 - master
> > DB2 - standby
> >
> >
> > Now the failing scenario:
> >
> >
> >
> > 1. DB2 machine goes down.
> >
> > 2. After some time DB1 machine also goes down (DB2 is still down).
> >
> > 3. Now DB2 comes up (it will join the cluster as master as DB1 is still
> down).
> >
> > 4. DB2 is started as master postgresql.
> >
> > 5. Now DB1 comes up (it will join the cluster as standby to DB2)
> >
> > 6. Now when DB1 attempts to SR with DB2, DB1 gets corrupted.
> >
> >
> > Looked into the issue and found that when DB1 went down initially, it
> created some WAL's which were
> > not synced to DB2 as it was already down.
> >
> > Now when DB2 started as master it still had not played the last few
> WAL's created by DB1(when it was
> > master). DB2 starts as master properly.
> >
> > When DB1 came as standby to DB2, it bumped it timeline using history
> file from DB2, but when SR was
> > setup with DB2, DB1 gets corrupted.
> >
> >
> > Now the question is:
> >
> >
> > 1. Is this a theoretically valid approach?
> >
> > 2. If it is a valid approach, then how can i detect such a scenario
> (where SR will corrupt the DB)? So
> > that i can go for a basebackup in such situation.
>
> If you want to use the old primary as new standby without a new backup,
> you have to ascertain that all transactions from the former have
> been replayed at the latter.
>
> To figure out where the primary currently is, you can
> SELECT pg_current_xlog_location();
>
> To figure how much the standby has replayed, you can
> SELECT pg_last_xlog_replay_location();
>
> Of course this only works if both are up.
>
> I think that it would be tricky to automatize that; I'd choose
> making a new backup after each failover.
>
> In the event of a controlled failover it might be an option.
>
> I am surprised that the scenario you described leads to
> corruption; I would have expected an error message.
>
> Yours,
> Laurenz Albe
>

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message girish R G peetle 2013-06-19 10:31:27 Re: pg_stop_backup is not archiving latest transaction log from pg_xlog directory
Previous Message Szymon Guz 2013-06-19 10:15:26 Re: Why sequence grant is separated from table?