Re: Interrupts vs signals

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Interrupts vs signals
Date: 2024-08-30 22:17:40
Message-ID: 391abe21-413e-4d91-a650-b663af49500c@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 26/08/2024 02:05, Thomas Munro wrote:
> On Sun, Aug 25, 2024 at 5:17 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>> On 07/08/2024 17:59, Heikki Linnakangas wrote:
>>> I'm also wondering about the relationship between interrupts and
>>> latches. Currently, SendInterrupt sets a latch to wake up the target
>>> process. I wonder if it should be the other way 'round? Move all the
>>> wakeup code, with the signalfd, the self-pipe etc to interrupt.c, and in
>>> SetLatch, call SendInterrupt to wake up the target process? Somehow that
>>> feels more natural to me, I think.
>>
>> I explored that a little, see attached patch set. It's going towards the
>> same end state as your patches, I think, but it starts from different
>> angle. In a nutshell:
>>
>> Remove Latch as an abstraction, and replace all use of Latches with
>> Interrupts. When I originally created the Latch abstraction, I imagined
>> that we would have different latches for different purposes, but in
>> reality, almost all code just used the general-purpose "process latch".
>> this patch accepts that reality and replaces the Latch struct directly
>> with the interrupt mask in PGPROC.
>
> Some very initial reactions:
>
> * I like it!

Ok, here's a new version, with a bunch of bugs and FIXMEs fixed,
comments and other polish. A few things remain that I'm not sure about:

- Terminology. We now have storage/interrupt.h and
postmaster/interrupt.h. They're not the same thing, but there's also
some overlap, especially after your patch set is applied.

Aside from filenames, "interrupt" now means at least two things: a
wakeup that you can send to a backend with SendInterrupt, and the thing
that you check with CHECK_FOR_INTERRUPTS(). They're related, but
different. To show case that confusion, the patch now contains this gem,
which was a result of mechanically replacing "latch" with "interrupt":

* We process interrupts whenever the interrupt has been set, so
* cancel/die interrupts are processed quickly.

- Fujii: this replaces the "recoveryWakeupLatch" with a separate
interrupt type. There was an earlier attempt at replacing
recoveryWakeupLatch with the process's regular latch which was reverted,
commits ac22929a26, and 00f690a239. I think this solution doesn't suffer
from the same problems as that earlier attempt, but if you have a chance
to review this, I would appreciate that. In a nutshell,
INTERRUPT_RECOVERY_CONTINUE interrupt is now used instead of
recoveryWakeupLatch, but the places that previously waited just on
recoveryWakeupLatch, now wait on both INTERRUPT_GENERAL_WAKEUP, which is
equivalent to waiting on the process latch, and
INTERRUPT_RECOVERY_CONTINUE. So those loops should now react more
quickly to signals like SIGHUP.

- Backwards-compatibility and extensions. This will break any extensions
that use Latches. Extensions that just used WaitLatch(MyLatch) or
similar are easy to convert to use latches instead. Or we could keep
around some backwards-compatibility macros like 0008 here does, to avoid
the code churn. However, if an extension is creating its own latches or
doing more complicated stuff with them, it gets harder to maintain
source-code compatibility for them.

Aside from the backwards-compatibility aspect, should we reserve a few
INTERRUPT_* values for extensions?

> * This direction seems to fit quite nicely with future ideas about
> asynchronous network I/O. That may sound unrelated, but imagine that
> a future version of WaitEventSet is built on Linux io_uring (or
> Windows iorings, or Windows IOCP, or kqueue), and waits for the kernel
> to tell you that network data has been transferred directly into a
> user space buffer. You could wait for the interrupt word to change at
> the same time by treating it as a futex[1]. Then all that other stuff
> -- signalfd, is_set, maybe_sleeping -- just goes away, and all we have
> left is one single word in memory. (That it is possible to do that is
> not really a coincidence, as our own Mr Freund asked Mr Axboe to add
> it[2]. The existing latch implementation techniques could be used as
> fallbacks, but when looked at from the right angle, once you squish
> all the wakeup reasons into a single word, it's all just an
> implementation of a multiplexable futex with extra steps.)

Cool

> * Speaking of other problems in other threads that might be solved by
> this redesign, I think I see the outline of some solutions to the
> problem of different classes of wakeup which you can handle at
> different times, using masks. There is a tension in a few places
> where we want to handle some kind of interrupts but not others in
> localised wait points, which we sort of try to address by holding
> interrupts or holding cancel interrupts, but it's not satisfying and
> there are some places where it doesn't work well. Needs a lot more
> thought, but a basic step would be: after old_interrupt_vector =
> pg_atomic_fetch_or_u32(interrupt_vector, new_bits), if
> (old_interrupt_vector & new_bits) == new_bits, then you didn't
> actually change any bits, so you probably don't really need to wake
> the other backend. If someone is currently unable to handle that type
> of interrupt (has ignored, ie not cleared, those bits) or is already
> in the process of handling it (is currently being rescheduled but
> hasn't cleared those bits yet), then you don't bother to wake it up.
> Concretely, it could mean that we avoid some of the useless wakeup
> storm problems we see in vacuum delays or while executing a query and
> not in a good place to handle sinval wakeups, etc. These are just
> some raw thoughts, I am not sure about the bigger picture of that
> topic yet.

Yeah, I expect this work to help with those issues, but also not sure of
the details yet.

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachment Content-Type Size
v2-0001-Remove-unused-latch.patch text/x-patch 2.0 KB
v2-0002-Remove-unneeded-include.patch text/x-patch 772 bytes
v2-0003-Rename-SetWalSummarizerLatch-function.patch text/x-patch 2.3 KB
v2-0004-Address-walwriter-and-checkpointer-by-proc-number.patch text/x-patch 4.5 KB
v2-0005-Address-walreceiver-by-procno-not-direct-pointer-.patch text/x-patch 6.3 KB
v2-0006-Use-proc-number-for-wakeups-in-waitlsn.c.patch text/x-patch 4.6 KB
v2-0007-Clean-up-existing-WaitLatch-calls-that-passed-MyL.patch text/x-patch 2.4 KB
v2-0008-Replace-Latches-with-Interrupts.patch text/x-patch 134.0 KB
v2-0009-Replace-ProcSendSignal-ProcWaitForSignal-with-new.patch text/x-patch 5.8 KB
v2-0010-Replace-all-remaining-Latch-calls-with-new-Interr.patch text/x-patch 70.2 KB
v2-0011-Fix-lost-wakeup-issue-in-logical-replication-laun.patch text/x-patch 4.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2024-08-30 23:51:06 Re: Use read streams in pg_visibility
Previous Message Jelte Fennema-Nio 2024-08-30 21:24:58 Re: [EXTERNAL] Re: Add non-blocking version of PQcancel