Behavior difference for walsender and walreceiver for n/w breakdown case

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Behavior difference for walsender and walreceiver for n/w breakdown case
Date: 2012-09-10 14:57:37
Message-ID: 00a901cd8f64$9e9d4160$dbd7c420$@kapila@huawei.com
Lists: pgsql-general

I have observed that when there is a network break between master and
standby, the walsender process is terminated immediately, whereas the
walreceiver detects the break only after a long time.

I can see that there is a replication_timeout configuration parameter;
walsender checks it and exits once that timeout has elapsed.

Shouldn't there be a similar mechanism for walreceiver, so that it can
detect a network failure sooner?
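
To make the question concrete, here is a minimal sketch of the kind of receiver-side check I have in mind. It is not PostgreSQL code: the class name ReceiverTimeout and its methods are hypothetical, and the idea is only that the receiver records when it last heard from the sender and treats the link as dead once a configured interval passes with no traffic.

```python
import time


class ReceiverTimeout:
    """Hypothetical helper: tracks how long the peer has been silent."""

    def __init__(self, timeout_s, now=time.monotonic):
        # timeout_s: seconds of silence after which the link is presumed dead.
        # now: clock function, injectable so the logic can be tested.
        self.timeout_s = timeout_s
        self._now = now
        self.last_recv = now()

    def on_message(self):
        # Call whenever any data or heartbeat arrives from the sender.
        self.last_recv = self._now()

    def expired(self):
        # True once the peer has been silent for at least timeout_s seconds.
        return self._now() - self.last_recv >= self.timeout_s
```

The receive loop would call on_message() for every message from the sender and check expired() periodically, dropping the connection and reconnecting when it returns true. This only works if the sender transmits something (data or a heartbeat) more often than the timeout, which is an assumption of the sketch.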

Basic steps to observe the above behavior:
1. The master and standby machines are connected normally.
2. Run the command: ifconfig ip down to bring down the network card of
the master and standby.
Observation:
The master detects that the connection is abnormal, but the standby does
not, and it keeps showing the channel as connected for a long time.

Note - I had earlier sent this to the Hackers list as well. I just want to
know whether this is the behavior as defined by PostgreSQL, a bug, or a new
feature in itself.

In case it is not clear, I will raise a bug.

With Regards,
Amit Kapila
