Re: Problems with Streaming Replication and PostgreSQL 9.2.2

From: Lonni J Friedman <netllama(at)gmail(dot)com>
To: JotaComm <jota(dot)comm(at)gmail(dot)com>
Cc: "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Problems with Streaming Replication and PostgreSQL 9.2.2
Date: 2013-04-01 18:22:07
Message-ID: CAP=oouF86ScWDc+vcFdczS02gPfs3vhG_wYJ-VC6kJu4jvQM+g@mail.gmail.com
Lists: pgsql-admin

If you're not running pg_basebackup with the -x option, how are you
getting the WAL files onto the slave?
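
For reference, here is a rough sketch of the quoted pg_basebackup command
with -x added. It reuses the placeholder host, port, user, and data
directory from the message below. The function name basebackup_with_wal is
made up for illustration, and it only prints the command (drop the echo to
actually run it).

```shell
# Dry run: prints the command instead of executing it.
# Host, port, user and data directory are the placeholders from the
# quoted message, not real values.
basebackup_with_wal() {
    # -x (--xlog) includes the WAL needed to make the backup consistent
    echo pg_basebackup -h productionaddress -p productionport \
        -U productionuser -D datadirectory -x -P -v
}

basebackup_with_wal
```

In 9.2, -x is equivalent to -X fetch: the required WAL segments are copied
at the end of the backup, so they still have to be present on the master at
that point. The alternative, -X stream, opens a second replication
connection, which as far as I know would need max_wal_senders to be at
least 2 while the backup runs.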

On Mon, Apr 1, 2013 at 11:19 AM, JotaComm <jota(dot)comm(at)gmail(dot)com> wrote:
> Hello, everybody
>
> I have one problem and I need some help.
>
> My environment: one master and one slave (PostgreSQL 9.2.2).
>
> My cluster is about 160GB, and I use pg_basebackup to synchronize them
> (master and slave).
>
> The syntax is:
>
> pg_basebackup -h productionaddress -p productionport -U productionuser -D datadirectory -P -v
>
> My recovery.conf:
>
> standby_mode = 'on'
>
> primary_conninfo = 'host=productionaddress port=productionport user=productionuser'
>
> archive_cleanup_command = 'pg_archivecleanup /slave/transactionlogs %r'
>
> My postgresql.conf: (master)
>
> wal_level = hot_standby
>
> checkpoint_segments = 10
>
> archive_mode = on
>
> archive_command = 'rsync -Crap %p postgres(at)slaveaddress:/slave/transactionlogs/%f'
>
> max_wal_senders = 1
>
> wal_keep_segments = 50
>
> My postgresql.conf: (slave)
>
> checkpoint_segments = 10
>
> hot_standby = on
>
> In my slave's logs I have the following errors (first attempt):
>
> 2013-03-07 15:58:21 BRT [11817]: [1-1] user=,db= LOG: database system was
> interrupted; last known up at 2013-03-07 15:55:43 BRT
> 2013-03-07 15:58:21 BRT [11817]: [2-1] user=,db= LOG: entering standby mode
> 2013-03-07 15:58:21 BRT [11818]: [1-1] user=,db= LOG: streaming replication
> successfully connected to primary
> 2013-03-07 15:58:25 BRT [11817]: [3-1] user=,db= LOG: consistent recovery
> state reached at 141/8FBB5F0
> 2013-03-07 15:58:25 BRT [11817]: [4-1] user=,db= LOG: redo starts at
> 141/2251F90
> 2013-03-07 15:58:25 BRT [11817]: [5-1] user=,db= FATAL: could not access
> status of transaction 30622931
> 2013-03-07 15:58:25 BRT [11817]: [6-1] user=,db= DETAIL: Could not read
> from file "pg_clog/001D" at offset 49152: Success.
> 2013-03-07 15:58:25 BRT [11817]: [7-1] user=,db= CONTEXT: xlog redo commit:
> 2013-03-07 15:55:40.673623-03
> 2013-03-07 15:58:25 BRT [11767]: [1-1] user=,db= LOG: startup process (PID
> 11817) exited with exit code 1
> 2013-03-07 15:58:25 BRT [11767]: [2-1] user=,db= LOG: terminating any other
> active server processes
>
> In my slave's logs I have the following errors (second attempt):
>
> 2013-03-11 12:07:49 BRT [5862]: [1-1] user=,db= LOG: database system was
> interrupted; last known up at 2013-03-11 12:06:31 BRT
> 2013-03-11 12:07:49 BRT [5862]: [2-1] user=,db= LOG: entering standby mode
> 2013-03-11 12:07:49 BRT [5864]: [1-1] user=,db= LOG: streaming replication
> successfully connected to primary
> 2013-03-11 12:07:53 BRT [5862]: [3-1] user=,db= LOG: consistent recovery
> state reached at 168/816AE10
> 2013-03-11 12:07:53 BRT [5862]: [4-1] user=,db= LOG: redo starts at
> 167/FEC3D828
> 2013-03-11 12:07:53 BRT [5862]: [5-1] user=,db= FATAL: could not access
> status of transaction 36529670
> 2013-03-11 12:07:53 BRT [5862]: [6-1] user=,db= DETAIL: Could not read from
> file "pg_clog/0022" at offset 212992: Success.
> 2013-03-11 12:07:53 BRT [5862]: [7-1] user=,db= CONTEXT: xlog redo commit:
> 2013-03-11 12:05:35.069759-03
> 2013-03-11 12:07:53 BRT [5762]: [1-1] user=,db= LOG: startup process (PID
> 5862) exited with exit code 1
> 2013-03-11 12:07:53 BRT [5762]: [2-1] user=,db= LOG: terminating any other
> active server processes
>
> I hit the same problem both times, but in different pg_clog files:
>
> First attempt:
>
> 2013-03-07 15:58:25 BRT [11817]: [5-1] user=,db= FATAL: could not access
> status of transaction 30622931
> 2013-03-07 15:58:25 BRT [11817]: [6-1] user=,db= DETAIL: Could not read
> from file "pg_clog/001D" at offset 49152: Success.
>
> Second attempt:
>
> 2013-03-11 12:07:53 BRT [5862]: [5-1] user=,db= FATAL: could not access
> status of transaction 36529670
> 2013-03-11 12:07:53 BRT [5862]: [6-1] user=,db= DETAIL: Could not read from
> file "pg_clog/0022" at offset 212992: Success.
>
> When I set up the synchronization for the first time (with no data), it
> was OK. Now my cluster is about 160GB, and when the replication finishes
> I get these problems. I noticed that the current pg_clog file is the
> problem.
>
> Any idea? Suggestions?
>
> Thank you.
>
> Best Regards
>
> João Paulo
> --
> JotaComm
> http://jotacomm.wordpress.com
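
As a cross-check on the two FATAL messages quoted above, the file name and
offset in each DETAIL line follow from the transaction ID. The sketch below
assumes default build settings (8192-byte pages, 2 status bits per
transaction so 4 transactions per byte, 32 pages per pg_clog segment file).
The function name clog_location is only illustrative.

```shell
# Map a transaction ID to the pg_clog segment file and the byte offset of
# the page holding its commit status. Assumes defaults: 8192-byte pages,
# 4 transactions per byte, 32 pages per segment file.
clog_location() (
    xid=$1
    xacts_per_page=$((8192 * 4))                    # 32768 transactions per page
    pages_per_segment=32
    page=$((xid / xacts_per_page))                  # page number across all of pg_clog
    segment=$((page / pages_per_segment))           # segment file number
    offset=$(( (page % pages_per_segment) * 8192 )) # byte offset of that page in the file
    printf 'pg_clog/%04X offset %d\n' "$segment" "$offset"
)

clog_location 30622931   # first attempt
clog_location 36529670   # second attempt
```

This prints pg_clog/001D offset 49152 and pg_clog/0022 offset 212992, which
match the logged values. So the standby is looking in the expected place in
both cases. A read error reported with "Success" usually points to a short
read, which would suggest the copied pg_clog file does not yet extend that
far.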

--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
L. Friedman netllama(at)gmail(dot)com
LlamaLand https://netllama.linux-sxs.org
