Re: replication behind high lag

From: AI Rumman <rummandba(at)gmail(dot)com>
To: pgsql-general General <pgsql-general(at)postgresql(dot)org>
Subject: Re: replication behind high lag
Date: 2013-03-25 20:23:40
Message-ID: CAGoODpe6hfW9W-MkCOkMQn01pTqJhKhgd3gMUg2=0vwJY40iqg@mail.gmail.com
Lists: pgsql-general

On Mon, Mar 25, 2013 at 4:03 PM, AI Rumman <rummandba(at)gmail(dot)com> wrote:

>
>
> On Mon, Mar 25, 2013 at 4:00 PM, Lonni J Friedman <netllama(at)gmail(dot)com> wrote:
>
>> On Mon, Mar 25, 2013 at 12:55 PM, AI Rumman <rummandba(at)gmail(dot)com> wrote:
>> >
>> >
>> > On Mon, Mar 25, 2013 at 3:52 PM, Lonni J Friedman <netllama(at)gmail(dot)com>
>> > wrote:
>> >>
>> >> On Mon, Mar 25, 2013 at 12:43 PM, AI Rumman <rummandba(at)gmail(dot)com>
>> >> wrote:
>> >> >
>> >> >
>> >> > On Mon, Mar 25, 2013 at 3:40 PM, Lonni J Friedman <netllama(at)gmail(dot)com>
>> >> > wrote:
>> >> >>
>> >> >> On Mon, Mar 25, 2013 at 12:37 PM, AI Rumman <rummandba(at)gmail(dot)com>
>> >> >> wrote:
>> >> >> > Hi,
>> >> >> >
>> >> >> > I have two 9.2 databases running with hot_standby replication. Today,
>> >> >> > when I was checking, I found that replication has not been working
>> >> >> > since Mar 1st. A large database was restored on the master that day,
>> >> >> > and I believe the lag grew after that.
>> >> >> >
>> >> >> > SELECT pg_xlog_location_diff(pg_current_xlog_location(), '0/0') AS offset
>> >> >> >
>> >> >> > 431326108320
>> >> >> >
>> >> >> > SELECT pg_xlog_location_diff(pg_last_xlog_receive_location(), '0/0') AS receive,
>> >> >> >        pg_xlog_location_diff(pg_last_xlog_replay_location(), '0/0') AS replay
>> >> >> >
>> >> >> >    receive    |    replay
>> >> >> > --------------+--------------
>> >> >> >  245987541312 | 245987534032
>> >> >> > (1 row)
>> >> >> >
>> >> >> > I checked pg_xlog on both servers. On the slave, the last xlog file is:
>> >> >> > -rw------- 1 postgres postgres 16777216 Mar 1 06:02 00000001000000390000007F
>> >> >> >
>> >> >> > On the master, the first xlog file is:
>> >> >> > -rw------- 1 postgres postgres 16777216 Mar 1 04:45 00000001000000390000005E
>> >> >> >
>> >> >> >
>> >> >> > Is there a quick way to resync the slave?
>> >> >>
>> >> >> generate a new base backup, and seed the slave with it.
>> >> >
>> >> >
>> >> > OK. I am getting this error on the slave:
>> >> > LOG: invalid contrecord length 284 in log file 57, segment 127, offset 0
>> >> >
>> >> > What is the actual cause?
>> >>
>> >> Corruption? What were you doing when you saw the error?
>> >
>> >
>> > I do not know enough about this. I only got the database now and
>> > saw the error.
>> > Is there any way to recover from this state? The master database is
>> > large, about 500 GB.
>>
>> generate a new base backup, and seed the slave with it. if the error
>> persists, then i'd guess that your master is corrupted, and then
>> you've got huge problems.
>>
>
> Master is running fine right now showing only a warning:
> WARNING: archive_mode enabled, yet archive_command is not set
>
> Do you think the master could be corrupted?
>
>
Hi,

I have learned that the master database was restarted on Feb 27th. Could
this be the cause of this error?

Thanks.
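
For reference, the figures quoted above can be cross-checked with a little
arithmetic. The following is a minimal Python sketch, not part of the original
thread. The helper names parse_wal_filename and lag_bytes are hypothetical, and
the sketch assumes the usual 24-hex-digit WAL segment file name layout (8 digits
each for timeline, log file, and segment). It decodes the two file names from
the pg_xlog listings and computes how far the slave trails the master using the
pg_xlog_location_diff() values reported in the thread.

```python
def parse_wal_filename(name):
    """Split a WAL segment file name into (timeline, log file, segment).

    Assumes the 24-hex-digit layout: 8 digits timeline, 8 digits log file,
    8 digits segment. The name itself is taken from the thread; this helper
    is illustrative only.
    """
    if len(name) != 24:
        raise ValueError("expected a 24-character WAL segment file name")
    timeline = int(name[0:8], 16)
    log_file = int(name[8:16], 16)
    segment = int(name[16:24], 16)
    return timeline, log_file, segment


def lag_bytes(ahead, behind):
    """Byte distance between two positions reported by pg_xlog_location_diff()."""
    return ahead - behind


# Values copied from the queries in the thread.
master_current = 431326108320
standby_receive = 245987541312
standby_replay = 245987534032

# File names copied from the pg_xlog listings in the thread.
slave_last = parse_wal_filename("00000001000000390000007F")
master_first = parse_wal_filename("00000001000000390000005E")

receive_lag = lag_bytes(master_current, standby_receive)
replay_backlog = lag_bytes(standby_receive, standby_replay)

print("slave last file   (timeline, log, segment):", slave_last)
print("master first file (timeline, log, segment):", master_first)
print("receive lag in bytes:", receive_lag)
print("receive lag in GiB: %.1f" % (receive_lag / 2.0 ** 30))
print("received but not yet replayed, in bytes:", replay_backlog)
```

With the numbers from the thread, the slave's last file decodes to log file 57,
segment 127, which is the same position named in the "invalid contrecord
length" message. The receive position trails the master by 185338567008 bytes
(roughly 173 GiB), while only 7280 bytes have been received but not replayed,
which suggests the slave stopped receiving WAL rather than falling behind on
replay.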
