Re: replication behind high lag

From: Lonni J Friedman <netllama(at)gmail(dot)com>
To: AI Rumman <rummandba(at)gmail(dot)com>
Cc: pgsql-general General <pgsql-general(at)postgresql(dot)org>
Subject: Re: replication behind high lag
Date: 2013-03-25 20:25:48
Message-ID: CAP=oouE=6z1rZO34V4ZmHvp=qiDPUYG3BSViMhAUELnmH9er-Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mon, Mar 25, 2013 at 1:23 PM, AI Rumman <rummandba(at)gmail(dot)com> wrote:
>
>
> On Mon, Mar 25, 2013 at 4:03 PM, AI Rumman <rummandba(at)gmail(dot)com> wrote:
>>
>>
>>
>> On Mon, Mar 25, 2013 at 4:00 PM, Lonni J Friedman <netllama(at)gmail(dot)com>
>> wrote:
>>>
>>> On Mon, Mar 25, 2013 at 12:55 PM, AI Rumman <rummandba(at)gmail(dot)com> wrote:
>>> >
>>> >
>>> > On Mon, Mar 25, 2013 at 3:52 PM, Lonni J Friedman <netllama(at)gmail(dot)com>
>>> > wrote:
>>> >>
>>> >> On Mon, Mar 25, 2013 at 12:43 PM, AI Rumman <rummandba(at)gmail(dot)com>
>>> >> wrote:
>>> >> >
>>> >> >
>>> >> > On Mon, Mar 25, 2013 at 3:40 PM, Lonni J Friedman
>>> >> > <netllama(at)gmail(dot)com>
>>> >> > wrote:
>>> >> >>
>>> >> >> On Mon, Mar 25, 2013 at 12:37 PM, AI Rumman <rummandba(at)gmail(dot)com>
>>> >> >> wrote:
>>> >> >> > Hi,
>>> >> >> >
>>> >> >> > I have two 9.2 databases running with hot_standby replication.
>>> >> >> > Today
>>> >> >> > when I
>>> >> >> > was checking, I found that replication has not been working since
>>> >> >> > Mar
>>> >> >> > 1st.
>>> >> >> > There was a large database restored in master on that day and I
>>> >> >> > believe
>>> >> >> > after that the lag went higher.
>>> >> >> >
>>> >> >> > SELECT pg_xlog_location_diff(pg_current_xlog_location(), '0/0')
>>> >> >> > AS
>>> >> >> > offset
>>> >> >> >
>>> >> >> > 431326108320
>>> >> >> >
>>> >> >> > SELECT pg_xlog_location_diff(pg_last_xlog_receive_location(),
>>> >> >> > '0/0')
>>> >> >> > AS
>>> >> >> > receive,
>>> >> >> > pg_xlog_location_diff(pg_last_xlog_replay_location(),
>>> >> >> > '0/0')
>>> >> >> > AS replay
>>> >> >> >
>>> >> >> > receive | replay
>>> >> >> > --------------+--------------
>>> >> >> > 245987541312 | 245987534032
>>> >> >> > (1 row)
>>> >> >> >
>>> >> >> > I checked the pg_xlog in both the server. In Slave the last xlog
>>> >> >> > file
>>> >> >> > -rw------- 1 postgres postgres 16777216 Mar 1 06:02
>>> >> >> > 00000001000000390000007F
>>> >> >> >
>>> >> >> > In Master, the first xlog file is
>>> >> >> > -rw------- 1 postgres postgres 16777216 Mar 1 04:45
>>> >> >> > 00000001000000390000005E
>>> >> >> >
>>> >> >> >
>>> >> >> > Is there any way I could sync the slave in quick process?
>>> >> >>
>>> >> >> generate a new base backup, and seed the slave with it.
>>> >> >
>>> >> >
>>> >> > OK. I am getting these error in slave:
>>> >> > LOG: invalid contrecord length 284 in log file 57, segment 127,
>>> >> > offset
>>> >> > 0
>>> >> >
>>> >> > What is the actual reason?
>>> >>
>>> >> Corruption? What were you doing when you saw the error?
>>> >
>>> >
>>> > I did not have enough idea about these stuffs. I got the database now
>>> > and
>>> > saw the error.
>>> > Is there any way to recover from this state. The master database is a
>>> > large
>>> > database of 500 GB.
>>>
>>> generate a new base backup, and seed the slave with it. if the error
>>> persists, then i'd guess that your master is corrupted, and then
>>> you've got huge problems.
>>
>>
>> Master is running fine right now showing only a warning:
>> WARNING: archive_mode enabled, yet archive_command is not set
>>
>> Do you think the master could be corrupted?
>>
>
> Hi,
>
> I got the info that there was a master db restart on Feb 27th. Could this be
> a reason of this error?
>

restarting the database cleanly should never cause corruption. again,
you need to create a new base backup, and seed the slave with it. if
the problem persists, then the master is likely corrupted.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Misa Simic 2013-03-25 21:32:50 Re: PostgreSQL and VIEWS
Previous Message AI Rumman 2013-03-25 20:23:40 Re: replication behind high lag