From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: BF animal malleefowl reported an failure in 001_password.pl |
Date: | 2023-01-16 22:24:23 |
Message-ID: | CA+hUKGKykFAoj3Ydyi84aXyQc-mFgPKPadQ2ppsGMqhzcAxDNA@mail.gmail.com |
Lists: | pgsql-hackers |
On Sun, Jan 15, 2023 at 12:35 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> Here's a sketch of the first idea.
To hit the problem case, the signal needs to arrive in between the
latch->is_set check and the epoll_wait() call, and the handler needs
to take a while to get started. (If it arrives before the
latch->is_set check we report WL_LATCH_SET immediately, and if it
arrives after the epoll_wait() call begins, we get EINTR and go back
around to the latch->is_set check.) With some carefully placed sleeps
to simulate a CPU-starved system (see attached) I managed to get a
kill-then-connect sequence to produce:
2023-01-17 10:48:32.508 NZDT [555849] LOG: nevents = 2
2023-01-17 10:48:32.508 NZDT [555849] LOG: events[0] = WL_SOCKET_ACCEPT
2023-01-17 10:48:32.508 NZDT [555849] LOG: events[1] = WL_LATCH_SET
2023-01-17 10:48:32.508 NZDT [555849] LOG: received SIGHUP, reloading configuration files
With the patch I posted, we process that in the order we want:
2023-01-17 11:06:31.340 NZDT [562262] LOG: nevents = 2
2023-01-17 11:06:31.340 NZDT [562262] LOG: events[1] = WL_LATCH_SET
2023-01-17 11:06:31.340 NZDT [562262] LOG: received SIGHUP, reloading configuration files
2023-01-17 11:06:31.344 NZDT [562262] LOG: events[0] = WL_SOCKET_ACCEPT
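To make the ordering above concrete, here is a minimal standalone sketch of one way to get it: walk the returned events twice, handling latch events in the first pass and everything else in the second. This is not the attached patch. DemoEvent, the DEMO_* flags and demo_priority_order() are hypothetical stand-ins for the real wait event structures, and the flag values are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the wait event flags; values are illustrative. */
#define DEMO_LATCH_SET     (1u << 0)
#define DEMO_SOCKET_ACCEPT (1u << 1)

/* Hypothetical, simplified event record: pos is the index the kernel gave it. */
typedef struct DemoEvent
{
    int         pos;
    unsigned    events;
} DemoEvent;

/*
 * Fill order[] with the positions of the events in the order we want to
 * handle them: latch events first, then all remaining events in their
 * original relative order.  order[] must have room for nevents entries.
 * Returns the number of entries written, which is always nevents.
 */
int
demo_priority_order(const DemoEvent *ev, int nevents, int *order)
{
    int         n = 0;

    assert(ev != NULL && order != NULL && nevents >= 0);

    /* Pass 1: anything that carries the latch flag. */
    for (int i = 0; i < nevents; i++)
        if (ev[i].events & DEMO_LATCH_SET)
            order[n++] = ev[i].pos;

    /* Pass 2: everything else, e.g. pending connections. */
    for (int i = 0; i < nevents; i++)
        if (!(ev[i].events & DEMO_LATCH_SET))
            order[n++] = ev[i].pos;

    return n;
}
```

Fed the two events from the first log excerpt (socket at position 0, latch at position 1), this yields positions 1 then 0, which is the order shown in the second excerpt.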
Other thoughts:
Another idea would be to teach the latch infrastructure itself to
magically swap latch events to position 0. Latches are usually
prioritised; it's only in this rare race case that they are not.
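A rough sketch of what that swap could look like, assuming the wait code could post-process the event array before returning it to the caller. SketchEvent, the SKETCH_* flags and sketch_promote_latch() are made-up names for illustration, not existing latch API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the wait event flags; values are illustrative. */
#define SKETCH_LATCH_SET     (1u << 0)
#define SKETCH_SOCKET_ACCEPT (1u << 1)

/* Hypothetical, simplified event record; id only exists to track identity. */
typedef struct SketchEvent
{
    int         id;
    unsigned    events;
} SketchEvent;

/*
 * If any returned event carries the latch flag, swap the first such event
 * into position 0 so that callers iterating from the front see it first.
 * Returns the position where the latch event was found, or -1 if none.
 */
int
sketch_promote_latch(SketchEvent *ev, int nevents)
{
    assert(ev != NULL && nevents >= 0);

    for (int i = 0; i < nevents; i++)
    {
        if (ev[i].events & SKETCH_LATCH_SET)
        {
            if (i != 0)
            {
                SketchEvent tmp = ev[0];

                ev[0] = ev[i];
                ev[i] = tmp;
            }
            return i;
        }
    }
    return -1;
}
```

A plain swap moves the event that was at position 0 to wherever the latch event was found, so the relative order of the non-latch events can change. If that order mattered, rotating the preceding events up by one slot would preserve it at the cost of a few more copies.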
Or, going the other way, I realise that we're lacking a "wait for
reload" mechanism, as discussed in other threads. Usually people want
this when they care about a reload's effects on backends other than
the postmaster, where all bets are off and Andres once suggested the
ProcSignalBarrier hammer. If we ever got something like that, it
might be another solution to this particular problem.
Attachment | Content-Type | Size |
---|---|---|
x.patch | text/x-patch | 1.5 KB |