Re: Logical wal receiver (background worker) not detecting when publisher node has died

From: Rui DeSousa <rui(at)crazybean(dot)net>
To: Achilleas Mantzios <achill(at)matrix(dot)gatewaynet(dot)com>
Cc: pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Re: Logical wal receiver (background worker) not detecting when publisher node has died
Date: 2018-11-24 01:00:13
Message-ID: 849F7F83-090C-41F8-91B6-FF20E736B060@crazybean.net
Lists: pgsql-admin

> On Nov 23, 2018, at 12:20 AM, Achilleas Mantzios <achill(at)matrix(dot)gatewaynet(dot)com> wrote:
>
> Dear List,
> Coming back from: https://www.postgresql.org/message-id/ae8812c3-d138-73b7-537a-a273e15ef6e1%40matrix.gatewaynet.com
>
> and having got absolutely no helpful answer from our infrastructure people, I would like to ask:
>
> Is it on earth possible that the primary (publisher node) has crashed while, on the subscriber node, the logical WAL receiver goes on happily as if there were no problem at all: no messages in the log, no timeouts, acting as if nothing happened?
>
> (In the meantime, the second standby instantly detected the crash of the primary and immediately started reconnection attempts.)
>

Not that I can think of without it being a bug.

If it happens again, you can try killing the WAL receiver session via Postgres; if that fails, use tcpkill to terminate the TCP connection.

It would be good to know the actual cause, though, so collect as much information as possible before terminating the session.
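Something along these lines could capture the worker's state on the subscriber and then terminate it from within Postgres. This is only a rough sketch, not something I have run against your setup: it assumes a psycopg2-style DB-API cursor connected to the subscriber, and the function names and the subscription name are made up for illustration. The columns are those of the pg_stat_subscription view.

```python
# Sketch only: snapshot a subscription's worker state, then terminate the worker.
# Assumes `cur` is a DB-API cursor (e.g. from psycopg2) connected to the
# subscriber. Function names here are hypothetical, chosen for illustration.

SNAPSHOT_SQL = """
SELECT subname, pid, relid, received_lsn, last_msg_send_time,
       last_msg_receipt_time, latest_end_lsn, latest_end_time
FROM pg_stat_subscription
WHERE subname = %s AND pid IS NOT NULL
"""

TERMINATE_SQL = "SELECT pg_terminate_backend(%s)"


def snapshot_workers(cur, subname):
    """Return one dict per running worker of the given subscription."""
    cur.execute(SNAPSHOT_SQL, (subname,))
    columns = [col[0] for col in cur.description]
    return [dict(zip(columns, row)) for row in cur.fetchall()]


def terminate_worker(cur, pid):
    """Ask Postgres to signal the worker; True if the signal was sent."""
    cur.execute(TERMINATE_SQL, (pid,))
    return bool(cur.fetchone()[0])


def snapshot_then_terminate(cur, subname):
    """Record the state of every worker first, then terminate each one."""
    workers = snapshot_workers(cur, subname)
    results = {w["pid"]: terminate_worker(cur, w["pid"]) for w in workers}
    return workers, results
```

Save the returned snapshot somewhere before anything else; a stale last_msg_receipt_time there would at least show how long the worker had been hearing nothing from the publisher.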

Interesting, but not sure if it’s related: https://www.evanjones.ca/tcp-stuck-connection-mystery.html
