Quick Links

Re: PATCH: Keep one postmaster monitoring pipe per process

From:	Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	Marco Pfatschbacher <Marco_Pfatschbacher(at)genua(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: PATCH: Keep one postmaster monitoring pipe per process
Date:	2016-09-19 23:07:03
Message-ID:	CAEepm=3FW33PeRxt0jE4N0truJqOepp72R6W-zyM5mu1bxnZRw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Sat, Sep 17, 2016 at 6:23 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2016-09-16 09:55:48 +0200, Marco Pfatschbacher wrote:
>> On Thu, Sep 15, 2016 at 12:26:16PM -0700, Andres Freund wrote:
>> > Yikes, that's a pretty absurd implementation.
>>
>> Not when you take into account that it's been written over 20 years ago ;)
>
> Well, that doesn't mean it can't be fixed ;)
>
>> > I'm not quite sure I understand why this an issue here - there shouldn't
>> > ever be events on this fd, so why is the kernel waking up all processes?
>> > It'd kinda makes sense it'd wake up all processes if there's one
>> > waiting, but ... ?
>>
>> Every read is an event, and that's what PostmasterIsAlive does.
>
> But in most places we only do a PostmasterIsAlive if WaitLatch returns
> WL_POSTMASTER_DEATH. The only walreceiver related place that doesn't is
> WalRcvWaitForStartPosition(). If that's indeed the cause of your issues
> this quite possibly could be fixed by doing the
> if (!PostmasterIsAlive())
> exit(1);
> check not unconditionally, but only after the WaitLatch at the end of
> the loop, and only if WL_POSTMASTER_DEATH is returned by WaitLatch()?
> That'll be a minor behaviour change for the WALRCV_RESTARTING, but that
> seems fine, we'll just loop once more outside (after a quick glance at
> least).

Yeah, I wondered why that was different than the pattern established
elsewhere when I was hacking on replication code. There are actually
several places where we call PostmasterIsAlive() unconditionally in a
loop that waits for WL_POSTMASTER_DEATH but ignores the return code:
in pgarch.c, syncrep.c, walsender.c and walreceiver.c. Should we just
change them all to check the return code and exit/break/ereport/etc as
appropriate? That would match the code from autovacuum.c,
checkpointer.c, pgstat.c, be-secure.c and bgworker.c. Something like
the attached.

The code in basebackup.c also waits for WL_POSTMASTER_DEATH but
doesn't check for it in the return value *or* call
PostmasterIsAlive(). I'm not sure what to make of that. I didn't
test it but it looks like maybe it would continue running after
postmaster death but not honour the throttling rate limit because
WaitLatch would keep returning immediately.

--
Thomas Munro
http://www.enterprisedb.com

Attachment	Content-Type	Size
check-return-code-for-postmaster-death.patch	application/octet-stream	5.6 KB

In response to

Re: PATCH: Keep one postmaster monitoring pipe per process at 2016-09-16 18:23:39 from Andres Freund

Responses

Re: PATCH: Keep one postmaster monitoring pipe per process at 2016-09-19 23:26:06 from Andres Freund

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Andres Freund	2016-09-19 23:26:06	Re: PATCH: Keep one postmaster monitoring pipe per process
Previous Message	MauMau	2016-09-19 20:30:55	Re: [bug fix] pg_recvlogical is missing in the Windows installation