Re: streaming replication master can fail to shut down

From: Andres Freund <andres(at)anarazel(dot)de>
To: Nick Cleaton <nick(at)cleaton(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: streaming replication master can fail to shut down
Date: 2016-04-28 18:14:24
Message-ID: 20160428181424.u5bzgbhk77va7j24@alap3.anarazel.de
Lists: pgsql-bugs

Hi,

On 2016-03-11 14:12:37 +0000, Nick Cleaton wrote:
> This patch is enough to eliminate the problem on my hardware

> diff -Nurd postgresql-9.5.1.orig/src/backend/replication/walsender.c postgresql-9.5.1/src/backend/replication/walsender.c
> --- postgresql-9.5.1.orig/src/backend/replication/walsender.c 2016-02-08 21:12:28.000000000 +0000
> +++ postgresql-9.5.1/src/backend/replication/walsender.c 2016-03-11 11:56:41.121361222 +0000
> @@ -2502,8 +2502,10 @@
>
> proc_exit(0);
> }
> - if (!waiting_for_ping_response)
> + if (!waiting_for_ping_response) {
> WalSndKeepalive(true);
> + waiting_for_ping_response = true;
> + }
> }

That looks reasonable (apart from the non-Postgres brace placement). Will
commit & backpatch (to 9.4, where it looks like the bug was
introduced).
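
To illustrate why setting the flag matters, here is a minimal standalone sketch. It is not PostgreSQL code: SenderState, send_keepalive() and simulate_shutdown() are hypothetical names, and the loop only models the idea from the patch above, namely that a keepalive should not be re-sent while a reply to the previous one is still outstanding.

```c
#include <stdbool.h>

/* Hypothetical stand-in for walsender state; not the real PostgreSQL struct. */
typedef struct
{
	bool		waiting_for_ping_response;
	int			keepalives_sent;
} SenderState;

/* Stand-in for WalSndKeepalive(true): just counts the requests sent. */
static void
send_keepalive(SenderState *s)
{
	s->keepalives_sent++;
}

/*
 * Run `iterations` passes of a simplified shutdown loop and return how many
 * keepalives were sent.  A reply from the standby is assumed to arrive every
 * `reply_every` iterations (0 means no reply ever arrives).  `set_flag`
 * selects the patched behaviour (true) or the unpatched one (false).
 */
int
simulate_shutdown(int iterations, int reply_every, bool set_flag)
{
	SenderState s = {false, 0};

	for (int i = 1; i <= iterations; i++)
	{
		if (!s.waiting_for_ping_response)
		{
			send_keepalive(&s);
			/* The patched code remembers that a reply is outstanding. */
			if (set_flag)
				s.waiting_for_ping_response = true;
		}

		/* A reply clears the flag, allowing the next keepalive request. */
		if (reply_every > 0 && i % reply_every == 0)
			s.waiting_for_ping_response = false;
	}
	return s.keepalives_sent;
}
```

In this model, simulate_shutdown(100, 10, false) sends a keepalive on every pass, while simulate_shutdown(100, 10, true) sends one per reply received, which is the effect the patch is after.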

> in this test the server sent only 29 keepalives during the shutdown:
> http://nick.cleaton.net/protodump-100k-nossl-patched.xz (47k)

I guess you have a fair amount of WAL traffic, and the receiver was
behind a good bit?

Greetings,

Andres Freund
