From: | AYahorau(at)ibagroup(dot)eu |
---|---|
To: | Rene Romero Benavides <rene(dot)romero(dot)b(at)gmail(dot)com> |
Cc: | Postgres General <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: terminating walsender process due to replication timeout |
Date: | 2019-05-15 07:04:12 |
Message-ID: | OF99D0D839.6A5BCB70-ON432583FB.0025912E-432583FB.0026D664@iba.by |
Lists: | pgsql-general |
Hello,
Thank you for the response.
Yes, it is possible to monitor replication delay, but my questions were
not about monitoring network issues.
I deliberately use wal_sender_timeout=1s because it lets me detect
replication problems quickly.
So, I need clarification on the following questions:
Is it possible to use exactly this configuration and be sure that it will
work properly?
What did I do wrong? Should I correct my configuration somehow?
Is this the same issue as the one mentioned here:
https://www.postgresql.org/message-id/e082a56a-fd95-a250-3bae-0fff93832510@2ndquadrant.com
? If so, why am I facing this problem again?
Thank you in advance.
Best regards,
Andrei
From: Rene Romero Benavides <rene(dot)romero(dot)b(at)gmail(dot)com>
To: AYahorau(at)ibagroup(dot)eu,
Cc: Postgres General <pgsql-general(at)postgresql(dot)org>
Date: 14/05/2019 20:12
Subject: Re: terminating walsender process due to replication timeout
To detect network issues maybe you could monitor replication delay.
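As a rough illustration of what monitoring replication delay could look like: on PostgreSQL 10 and later, the pg_stat_replication view on the master reports sent_lsn and replay_lsn for each standby, and the difference between the two is the replay lag in bytes. The sketch below only does the LSN arithmetic in Python. The function names and the threshold are hypothetical and not from this thread, and fetching the values from the view is left out.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN such as '0/3000060' to an absolute byte position.

    An LSN is printed as two hexadecimal numbers separated by a slash:
    the high 32 bits and the low 32 bits of a 64-bit WAL position.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replication_lag_bytes(sent_lsn: str, replay_lsn: str) -> int:
    """Bytes of WAL that were sent to the standby but not yet replayed."""
    return lsn_to_int(sent_lsn) - lsn_to_int(replay_lsn)


def lag_exceeds(sent_lsn: str, replay_lsn: str, threshold_bytes: int) -> bool:
    """Hypothetical alert check: True when the lag is above the threshold."""
    return replication_lag_bytes(sent_lsn, replay_lsn) > threshold_bytes


if __name__ == "__main__":
    # Example values only; real ones would come from pg_stat_replication.
    print(replication_lag_bytes("1/0", "0/FFFFFFF0"))
    print(lag_exceeds("0/3000060", "0/3000000", threshold_bytes=1024))
```

A check like this reports how far the standby is behind; it does not replace wal_sender_timeout, which only controls when the master gives up on a silent connection.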
On Mon, May 13, 2019 at 6:42 AM <AYahorau(at)ibagroup(dot)eu> wrote:
Hello PostgreSQL Community!
I ran into an issue on my Linux machine running PostgreSQL 11.3.
I have a two-node database cluster: master and standby.
I ran a large number of long-running queries, which led to the
databases becoming desynchronized, with this message in the log:
terminating walsender process due to replication timeout
Here is the output in debug mode:
2019-05-13 13:21:33 FET 00000 DEBUG: sending replication keepalive
2019-05-13 13:21:34 FET 00000 DEBUG: StartTransaction(1) name: unnamed;
blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
2019-05-13 13:21:34 FET 00000 DEBUG: CommitTransaction(1) name: unnamed;
blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
2019-05-13 13:21:34 FET 00000 DEBUG: StartTransaction(1) name: unnamed;
blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
2019-05-13 13:21:34 FET 00000 DEBUG: CommitTransaction(1) name: unnamed;
blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
2019-05-13 13:21:34 FET 00000 DEBUG: StartTransaction(1) name: unnamed;
blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
2019-05-13 13:21:34 FET 00000 DEBUG: CommitTransaction(1) name: unnamed;
blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
2019-05-13 13:21:34 FET 00000 DEBUG: StartTransaction(1) name: unnamed;
blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
2019-05-13 13:21:34 FET 00000 DEBUG: CommitTransaction(1) name: unnamed;
blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
2019-05-13 13:21:34 FET 00000 DEBUG: StartTransaction(1) name: unnamed;
blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
2019-05-13 13:21:34 FET 00000 DEBUG: CommitTransaction(1) name: unnamed;
blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
2019-05-13 13:21:34 FET 00000 DEBUG: StartTransaction(1) name: unnamed;
blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
2019-05-13 13:21:34 FET 00000 DEBUG: CommitTransaction(1) name: unnamed;
blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
2019-05-13 13:21:34 FET 00000 LOG: terminating walsender process due to
replication timeout
The issue is reproducible. I configure a two-node cluster, download
demo_small.zip from https://edu.postgrespro.ru/ and run the following
command:
psql -U user1 -f demo_small.sql db1
and I get the behaviour shown above.
I know that I can increase the wal_sender_timeout value to avoid this
behaviour (wal_sender_timeout is currently set to 1 second).
To be honest, I don't want to increase wal_sender_timeout because I would
like to detect network issues quickly.
After some searching, I found that someone had faced a similar issue
https://www.postgresql.org/message-id/e082a56a-fd95-a250-3bae-0fff93832510@2ndquadrant.com
which was fixed in PostgreSQL 9.4.16.
Is my issue the same as the one described here
https://www.postgresql.org/message-id/e082a56a-fd95-a250-3bae-0fff93832510@2ndquadrant.com
?
Is there any other way to avoid it without increasing
wal_sender_timeout?
Thank you in advance.
Regards,
Andrei
--
Genius is 1% inspiration and 99% perspiration.
Thomas Alva Edison
http://pglearn.blogspot.mx/