Re: Replication failed after stalling

From: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>
To: Joe Van Dyk <joe(at)tanga(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Replication failed after stalling
Date: 2013-12-31 06:47:31
Message-ID: CAL_0b1vSKeSqDf+T-2iX4Z2MgRk26kEBW4RkxkkjWqVm2QaVTw@mail.gmail.com
Lists: pgsql-general

On Mon, Dec 30, 2013 at 10:05 PM, Joe Van Dyk <joe(at)tanga(dot)com> wrote:
>> I meant all the replication settings, see [1]. And pg_stat_statements
>> when there is a problem, preferable the error, because when everything
>> is okay it is not very useful actually.
>
> I don't understand, how is pg_stat_statements helpful here, and what error?

The error you showed in the initial email.

My guess is that the master might have stopped sending WAL records to
the replica, which is why I wanted to check the pg_stat_replication
query. Oh, yes, and I forgot to include pg_current_xlog_location() in
the query. So, the correct one is below.

\x
select pg_current_xlog_location(), * from pg_stat_replication;

> checkpoint_completion_target: 0.9
> checkpoint_segments: 16
> checkpoint_timeout: 5m
> checkpoint_warning: 30s
[...]
> max_wal_senders: 5
> wal_keep_segments: 10000
> vacuum_defer_cleanup_age: 0
> max_standby_archive_delay: 30s
> max_standby_streaming_delay: -1
> wal_receiver_status_interval: 10s
> hot_standby_feedback: on
[...]

That wal_keep_segments value of 10000 looks odd, and I would increase
checkpoint_segments and checkpoint_timeout, but first let us check how
often checkpoints and checkpoint warnings happen on the master. You can
see this in the logs. Turn log_checkpoints on if it is off.
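
A rough way to count them, as a sketch only: the function name
checkpoint_summary and the log path are hypothetical, and the grep
patterns assume the usual 9.x log wording ("checkpoint starting: ..."
and "checkpoints are occurring too frequently"), so adjust them to
what your logs actually contain.

```shell
# Hypothetical helper: summarize checkpoint activity in one server log file.
# Assumes log_checkpoints is on and the standard 9.x message wording.
checkpoint_summary() {
    log="$1"
    # Checkpoints triggered by filling checkpoint_segments
    echo "by xlog: $(grep -c 'checkpoint starting:.* xlog' "$log")"
    # Checkpoints triggered by checkpoint_timeout
    echo "by time: $(grep -c 'checkpoint starting:.* time' "$log")"
    # Warnings raised when checkpoints come closer than checkpoint_warning
    echo "warnings: $(grep -c 'checkpoints are occurring too frequently' "$log")"
}
```

For example: checkpoint_summary /path/to/postgresql.log. Many "by xlog"
checkpoints or any warnings would suggest that checkpoint_segments is
too low for the write load.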

Also check how much WAL your system generates and over what period.

ls -lt /path/to/pg_xlog/ | wc -l
ls -lt /path/to/pg_xlog/ | head
ls -lt /path/to/pg_xlog/ | tail
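
If it helps, the count can be turned into an approximate size. This is
only a sketch: wal_summary is a hypothetical name, and it assumes the
default 16 MB segment size and that segment files have 24-character
hexadecimal names (so .history and .backup files and the
archive_status directory are skipped).

```shell
# Hypothetical helper: count WAL segments in a directory and estimate their size.
# Assumes the default 16 MB WAL segment size.
wal_summary() {
    dir="$1"
    # Segment file names are 24 hex characters; everything else is ignored
    count=$(ls "$dir" | grep -Ec '^[0-9A-F]{24}$')
    echo "segments: $count"
    echo "approx MB: $((count * 16))"
}
```

For example: wal_summary /path/to/pg_xlog/. Under the same 16 MB
assumption, wal_keep_segments = 10000 allows up to about 160000 MB
(roughly 156 GB) of WAL to be retained on the master.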

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray(dot)ru(at)gmail(dot)com
