Re: terminating walsender process due to replication timeout

From: AYahorau(at)ibagroup(dot)eu
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-general(at)postgresql(dot)org, rene(dot)romero(dot)b(at)gmail(dot)com
Subject: Re: terminating walsender process due to replication timeout
Date: 2019-05-17 08:04:58
Message-ID: OFE11EC62A.504EB2B3-ON432583FD.002BE231-432583FD.002C666E@iba.by
Lists: pgsql-general

Hello.

Thanks for the answer.

Can frequent database operations cause a standby server to fall behind? Is
there a way to avoid this situation?
I checked that walsender works well in my test if I set
wal_sender_timeout to at least 5 seconds.

Best regards,
Andrei Yahorau

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: AYahorau(at)ibagroup(dot)eu
Cc: rene(dot)romero(dot)b(at)gmail(dot)com, pgsql-general(at)postgresql(dot)org
Date: 16/05/2019 10:36
Subject: Re: terminating walsender process due to replication timeout

Hello.

At Wed, 15 May 2019 10:04:12 +0300, AYahorau(at)ibagroup(dot)eu wrote in
<OF99D0D839(dot)6A5BCB70-ON432583FB(dot)0025912E-432583FB(dot)0026D664(at)iba(dot)by>
> Hello,
> Thank You for the response.
>
> Yes, it is possible to monitor replication delay. But my questions were
> not about monitoring network issues.
>
> I use exactly wal_sender_timeout=1s because it allows me to detect
> replication problems quickly.

Though I don't have an exact idea of your configuration, it seems
to me that your standby is simply falling more than one second
behind the master. If you regard that as a replication problem,
then the configuration is detecting the problem correctly.

Since the keep-alive packet is sent in-band, it doesn't get to
the standby before already-sent-but-not-processed packets.
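
To illustrate the in-band point, here is a toy sketch in Python. It is not
PostgreSQL code: the function names and the per-message costs are made up
for illustration, and it ignores details such as when the sender actually
issues the keepalive. It only models a FIFO stream in which the keepalive
is reached after everything queued ahead of it has been processed.

```python
from collections import deque


def time_until_keepalive_reply(backlog_costs, keepalive_sent_at=0.0):
    """Return the time at which the receiver reaches the keepalive.

    backlog_costs holds hypothetical processing times (in seconds) for
    the messages that were already sent ahead of the keepalive.
    """
    # The stream is strictly ordered: data first, keepalive last.
    stream = deque(("data", cost) for cost in backlog_costs)
    stream.append(("keepalive", 0.0))

    clock = keepalive_sent_at
    while stream:
        kind, cost = stream.popleft()
        clock += cost  # the receiver handles messages one at a time
        if kind == "keepalive":
            return clock
    raise RuntimeError("keepalive was never queued")


def sender_times_out(backlog_costs, timeout):
    """True if the reply would arrive only after the timeout expires."""
    return time_until_keepalive_reply(backlog_costs) > timeout


if __name__ == "__main__":
    backlog = [0.4, 0.4, 0.4]  # assumed backlog, about 1.2 s of work
    print(sender_times_out(backlog, timeout=1.0))
    print(sender_times_out(backlog, timeout=5.0))
```

With a backlog that takes longer than the timeout to work through, the
reply comes too late under a 1 second limit but comfortably within a
5 second one, which matches what was observed in the test.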

> So, I need clarification on the following questions:
> Is it possible to use exactly this configuration and be sure that it
> will work properly?
> What did I do wrong? Should I correct my configuration somehow?
> Is this the same issue as mentioned here:
> https://www.postgresql.org/message-id/e082a56a-fd95-a250-3bae-0fff93832510@2ndquadrant.com
> ? If so, why do I face this problem again?

It is not the same "problem". What was mentioned there is a fast
network keeping the sender-side loop busy, which prevents the
keepalive packet from being sent.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
