Re: BF animal malleefowl reported an failure in 001_password.pl

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: BF animal malleefowl reported an failure in 001_password.pl
Date: 2023-01-14 09:29:42
Message-ID: CA+hUKG+_hTYQRD9w-01-dgQO+cw6zLARGRtr1ziXTgbX=RxEjw@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jan 14, 2023 at 8:55 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com> writes:
> > I noticed one BF failure[1] when monitoring the BF for some other commit.
> > [1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=malleefowl&dt=2023-01-13%2009%3A54%3A51
> > ...
> > So it seems the connection happens before pg_ident.conf is actually reloaded?
> > Not sure if we need to do something to make sure the reload happens, because it
> > looks like a very rare failure which hasn't happened in the last 90 days.
>
> That does look like a race condition between config reloading and
> new-backend launching. However, I can't help being suspicious about
> the fact that we haven't seen this symptom before and now here it is
> barely a day after 7389aad63 (Use WaitEventSet API for postmaster's
> event loop). It seems fairly plausible that that did something that
> causes the postmaster to preferentially process connection-accept ahead
> of SIGHUP. I took a quick look through the code and did not see a
> smoking gun, but I'm way too tired to be sure I didn't miss something.

Yeah, I guess the scheduling might go something like this:

1. kill() runs and sets SIGHUP as pending in the postmaster process;
the postmaster is now runnable but not yet running.
2. Meanwhile connect() starts.
3. postmaster starts running, sees the pending signal and immediately
runs the handler, which previously did the actual reload (before doing
anything else) but now just sets our reload-pending flag and does
kill(self, SIGURG), and then returns, so epoll_wait() is unblocked.
4. epoll_wait() returns, reporting two events: signalfd ready to read
(or self-pipe, or EVFILT_SIGNAL), AND connection ready to accept.
5. The connection happens to be reported first, so we accept/fork the
connection and only process the reload afterwards.
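
To make the ordering concrete, here is a rough sketch of the shape of
loop I have in mind.  The names (pending_reload, accept_and_fork and
friends) are made up for illustration, not the actual postmaster
identifiers:

#include <signal.h>
#include <sys/epoll.h>

static volatile sig_atomic_t pending_reload = 0;

extern int  is_listen_socket(int fd);   /* stand-ins for the real  */
extern void accept_and_fork(int fd);    /* postmaster machinery    */
extern void reload_config(void);
extern void wake_self(void);            /* kill(MyProcPid, SIGURG) */

/* installed with sigaction() at startup (not shown) */
static void
handle_sighup(int sig)
{
    /* used to do the reload right here; now just sets a flag and
     * pokes the latch so epoll_wait() returns (step 3 above) */
    pending_reload = 1;
    wake_self();
}

static void
server_loop(int epfd)
{
    struct epoll_event events[64];

    for (;;)
    {
        int nevents = epoll_wait(epfd, events, 64, -1);

        for (int i = 0; i < nevents; i++)
        {
            /*
             * Step 5: events[] is walked in order, so if the listen
             * socket sorted ahead of the signalfd we fork the child
             * before ever looking at pending_reload, and the child
             * starts up with the stale pg_ident.conf.
             */
            if (is_listen_socket(events[i].data.fd))
                accept_and_fork(events[i].data.fd);
            else if (pending_reload)
            {
                pending_reload = 0;
                reload_config();
            }
        }
    }
}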

I think epoll will report fds in the order they became ready
(undocumented, but apparently well known to be a kind of FIFO linked
list), but that ordering is itself indeterminate here, since steps 2
and 3 race.  It looks like malleefowl is slow/overloaded (often ~3
hours to run the tests, sometimes ~half an hour and sometimes ~4
hours).  Now that I think about it, though, it's surprising I haven't
seen this before, which would imply that step 3 almost always beats
step 2, so maybe I'm missing something else...
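
For what it's worth, that FIFO claim is easy to poke at with a
throwaway Linux-only test: make two fds ready in a known order and see
which one epoll_wait() reports first.  On the kernels I've tried this
prints b first, i.e. readiness order rather than registration order --
but since it's undocumented, that's an observation, not a guarantee:

#include <stdio.h>
#include <unistd.h>
#include <sys/epoll.h>

int
main(void)
{
    int a[2], b[2];
    struct epoll_event ev, out[2];
    int epfd = epoll_create1(0);

    pipe(a);
    pipe(b);

    /* register a's read end first, then b's... */
    ev.events = EPOLLIN;
    ev.data.fd = a[0];
    epoll_ctl(epfd, EPOLL_CTL_ADD, a[0], &ev);
    ev.data.fd = b[0];
    epoll_ctl(epfd, EPOLL_CTL_ADD, b[0], &ev);

    /* ... but make b readable first, then a */
    write(b[1], "x", 1);
    write(a[1], "x", 1);

    int n = epoll_wait(epfd, out, 2, -1);
    for (int i = 0; i < n; i++)
        printf("reported #%d: %s\n", i + 1,
               out[i].data.fd == b[0] ? "b (ready first)" : "a");
    return 0;
}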

But if that's the general idea, I suppose there would be two ways to
give higher priority to signals/latches that arrive in the same set of
events: (1) scan the events array twice (first for latches, then for
connections), or (2) check our pending flags every time through the
output events loop, at the top, even for connection events (i.e. just
move some lines up a bit).  Option (2) is probably the way to go (see
also the discussion about whether we should do that anyway, to give
priority to a shutdown request that arrives while the server is
looping over 64 server sockets that are all ready to accept).
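
In terms of the sketch above, option (2) would just mean hoisting the
flag checks to the top of the event loop, so they run before each
event is handled, whatever kind it is (same made-up names as before,
plus a hypothetical pending_shutdown flag):

        for (int i = 0; i < nevents; i++)
        {
            /*
             * Check the pending flags before handling *every* event,
             * not only latch/signalfd ones: a SIGHUP (or shutdown
             * request) that arrived in the same batch as a connection
             * now wins, and we also notice a shutdown request while
             * grinding through up to 64 ready server sockets.
             */
            if (pending_reload)
            {
                pending_reload = 0;
                reload_config();
            }
            if (pending_shutdown)
                handle_shutdown();

            if (is_listen_socket(events[i].data.fd))
                accept_and_fork(events[i].data.fd);
        }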
