From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Nick Cleaton <nick(at)cleaton(dot)net> |
Cc: | Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-bugs <pgsql-bugs(at)postgresql(dot)org> |
Subject: | Re: streaming replication master can fail to shut down |
Date: | 2016-04-29 18:33:32 |
Message-ID: | 20160429183332.5tiaz2ccu36uqjee@alap3.anarazel.de |
Lists: | pgsql-bugs |
Hi,
I pushed a fix for this to 9.4, 9.5 and master yesterday. I'm not
convinced it's all that needs to be fixed, particularly for Magnus'
report.
On 2016-04-29 08:05:51 +0100, Nick Cleaton wrote:
> On 29 April 2016 at 04:38, Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> >> > I guess you have a fair amount of WAL traffic, and the receiver was
> >> > behind a good bit?
> >>
> >> No, IIRC this was on the test cluster that I installed for the purpose
> >> of replicating the problem under 9.5; it was essentially idle.
> >
> > The reason I'm asking is that I can't really replicate the issue
> > so far. It's pretty clear that waiting_for_ping_response = true; is
> > needed, but I'm suspicious that that's not all.
> >
> > Was your standby on a separate machine?
>
> Yes, I've only seen it happen when the standby was on a machine with
> slower CPU cores than the primary. All my attempts to replicate it on
> a single machine by trying to slow down the wal receiver have failed.
> I'm fairly convinced it's some sort of race that depends on wal sender
> + network being faster than wal receiver.
Yes, that's kind of what I'm expecting. You'll only hit that branch if
there's outstanding data to be replicated, but the message has already been
handed to the OS (!pq_is_send_pending()). Locally that's just a small
data volume, but over an actual network on a longer-lived connection it
can be a lot more.
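To make the condition concrete, here is a toy model of the state I mean. It
is only a sketch, not the walsender source: SenderModel, ShutdownStep and
shutdown_step() are hypothetical names, and send_pending merely stands in
for what pq_is_send_pending() would report. The interesting case is the last
one, where nothing is queued locally any more but the receiver hasn't
confirmed everything yet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of the sender's state during shutdown. */
typedef struct
{
    uint64_t sent_ptr;          /* last WAL position handed to the socket layer */
    uint64_t replicated_ptr;    /* last WAL position confirmed by the receiver */
    bool     send_pending;      /* stand-in for pq_is_send_pending() */
    bool     waiting_for_ping_response;
    int      keepalives_sent;   /* counter, for illustration only */
} SenderModel;

typedef enum
{
    SHUTDOWN_FLUSHING,          /* data still queued in our own send buffer */
    SHUTDOWN_DONE,              /* receiver confirmed everything we sent */
    SHUTDOWN_AWAIT_ACK          /* handed to the OS, but not yet confirmed */
} ShutdownStep;

/* Classify one pass of the (modelled) shutdown loop. */
static ShutdownStep
shutdown_step(SenderModel *s)
{
    if (s->send_pending)
        return SHUTDOWN_FLUSHING;

    if (s->sent_ptr == s->replicated_ptr)
        return SHUTDOWN_DONE;

    /*
     * The branch in question: outstanding data, nothing pending locally.
     * Request a reply once; remembering that we did so keeps this model
     * from sending another keepalive on every pass.
     */
    if (!s->waiting_for_ping_response)
    {
        s->keepalives_sent++;
        s->waiting_for_ping_response = true;
    }
    return SHUTDOWN_AWAIT_ACK;
}
```

With sent_ptr = 100, replicated_ptr = 40 and send_pending = false, repeated
calls keep returning SHUTDOWN_AWAIT_ACK while keepalives_sent stays at 1.
The larger the gap between the two pointers, the longer the model sits in
that branch, which is why a slow receiver on a real network would matter.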
Andres