Re: pg_standby replication problem

From: Khangelani Gama <kgama(at)argility(dot)com>
To: Alan Hodgson <ahodgson(at)simkin(dot)ca>, pgsql-general(at)postgresql(dot)org
Subject: Re: pg_standby replication problem
Date: 2014-06-09 15:06:03
Message-ID: 36e864716fcb063194f5f95e5fc0b35c@mail.gmail.com
Lists: pgsql-general

-----Original Message-----
From: pgsql-general-owner(at)postgresql(dot)org
[mailto:pgsql-general-owner(at)postgresql(dot)org] On Behalf Of Alan Hodgson
Sent: Monday, June 09, 2014 4:51 PM
To: pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] pg_standby replication problem

On Monday, June 09, 2014 04:28:53 PM Khangelani Gama wrote:
> Please help me with this: my secondary server shows a replication problem.
> It stopped at the file called *0000000500004BAF000000AF …* and from then
> on the primary server kept sending WAL files until they used up the
> disk space in the data directory. How do I fix this problem?
> It's Postgres 9.1.2.
>

It looks to me like your archive_command is probably failing on the primary
server. If that fails, the logs will build up and fill up your disk as
described. And they wouldn't be available to the slave to find.
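One way to check this on a 9.1 primary is to look in pg_xlog/archive_status, where each segment still waiting for archive_command has a matching .ready file. The following is a minimal sketch only; the function name is made up for illustration, and the data directory /pgsql2/data is an assumption taken from the rsync path in the script quoted later in this message.

```shell
# Count WAL segments that are still waiting to be archived.
# count_ready_segments is a hypothetical helper name; the data directory
# is passed as the first argument (assumed to be /pgsql2/data here).
count_ready_segments() {
    n=0
    for f in "$1"/pg_xlog/archive_status/*.ready; do
        # If the glob matches nothing it stays literal, so test for existence.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Usage (assumed path): count_ready_segments /pgsql2/data
```

A count that keeps growing would suggest that archive_command is failing or never returning.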

I am sorry, I am still trying to understand all the settings. The person who
set up the servers has left the company.

On the primary server, postgresql.conf shows the following:

# WRITE AHEAD LOG
#------------------------------------------------------------------------------

# - Settings -

wal_level = archive
# - Checkpoints -

checkpoint_segments = 128
checkpoint_timeout = 15min
checkpoint_warning = 885s
# - Archiving -

archive_mode = on
#archive_mode = off # allows archiving to be done
archive_command = '/home/cdbs/bin/run_replication.sh %p %f'

# REPLICATION
#------------------------------------------------------------------------------

# - Master Server -

# These settings are ignored on a standby server

max_wal_senders = 3

The archive_command setting points to a script, which is run with the
placeholders %p and %f passed as arguments.
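For reference, PostgreSQL replaces %p with the path of the WAL file to archive (relative to the data directory) and %f with the file name only, and it treats the segment as archived only when the command exits with status zero. The sketch below illustrates that contract with a plain local copy. It is not the run_replication.sh script; the function name archive_wal and the archive directory argument are hypothetical.

```shell
# Minimal illustration of the archive_command contract.
# Hypothetical wiring in a wrapper script:  archive_wal "$1" "$2" /some/archive/dir
# where $1 is %p (path to the segment) and $2 is %f (file name only).
archive_wal() {
    src="$1"          # %p
    name="$2"         # %f
    archive_dir="$3"  # assumed destination directory

    # Refuse to overwrite a segment that is already in the archive.
    if [ -f "$archive_dir/$name" ]; then
        return 1
    fi

    # The exit status of cp becomes the status seen by PostgreSQL:
    # zero means "archived", anything else means "try again later".
    cp "$src" "$archive_dir/$name"
}
```

On a non-zero status PostgreSQL keeps the segment in pg_xlog and retries it, which is why a command that keeps failing leads to the disk filling up.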

The replication script running on the primary server contains the
following:

while [ $test = "false" ]
do
    rsync -a /pgsql2/data/${src} postgres@10.58.101.10:/pgsql2/walfiles/${dest} >> /tmp/run_replication.sh.out 2>> /tmp/run_replication.sh.out
    test=`ssh AB_CDS3 "if [ -f /pgsql2/walfiles/${dest} ];then echo 'true' ;else echo 'false';fi"`
    if [ ${test} = "false" ]
    then
        echo "Test is false for CDS3, sleeping 10" >> /tmp/run_replication.sh.out
        sleep 10
        cnt=$(( $cnt + 1 ))
        if [ ${cnt} -ge 60 ]
        then
            message="Replication ERROR: Unable to send WAL file(${desc}) from CDS to CDS3"
            echo "`date` : ${message}" >> /tmp/run_replication.sh.out
            sendsms
        fi
    fi
done
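In the fragment shown, the only way out of the loop is for the remote file check to succeed. After 60 failed attempts it logs an error and calls sendsms, but it keeps looping, so unless another part of the script handles this, archive_command would never return and the archiver would stay stuck on one segment. A bounded retry that eventually exits non-zero might look like the sketch below. The helper name retry and its arguments are assumptions for illustration, not part of the existing script.

```shell
# Run a command up to max_tries times, sleeping delay seconds between
# failed attempts. Returns 0 on the first success, 1 if every attempt fails.
# Usage: retry MAX_TRIES DELAY command [args...]
retry() {
    max_tries="$1"
    delay="$2"
    shift 2

    cnt=0
    while [ "$cnt" -lt "$max_tries" ]; do
        if "$@"; then
            return 0
        fi
        cnt=$((cnt + 1))
        sleep "$delay"
    done
    return 1
}

# Hypothetical use inside the archive script, with the transfer and the
# remote check wrapped in a function called send_segment:
#   retry 60 10 send_segment || { sendsms; exit 1; }
```

Exiting non-zero hands the retry back to PostgreSQL, which will call archive_command again for the same segment later.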

