Re: Interrupts vs signals

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Interrupts vs signals
Date: 2025-03-06 00:47:38
Message-ID: 08382e5c-a820-4aca-90ac-fab50cd4599c@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here's a new patch set. It includes the previous work, and also goes the
whole hog and replaces procsignals and many other signalling with
interrupts. It's based on Thomas's
v3-0002-Redesign-interrupts-remove-ProcSignals.patch much earlier in
this thread.

That's included as a separate patch in the patch set, on top of the
previous ones. I like the end state, but I'm not sure if that's the most
sensible way to commit these. It might make more sense to commit the
procsignal->interrupt changes first, and the commit to remove latches
second. Or just commit all in one giant commit. Not sure.

In any case, I hope this split is easier to review at least than just
squashing them all tgoether...

I haven't addressed all your comments yet, and there are some TODO/FIXME
comments. More details below.

On 28/01/2025 19:01, Andres Freund wrote:
> On 2024-12-02 16:39:28 +0200, Heikki Linnakangas wrote:
>> From eff8de11fbfea4e2aadce9c1d71452b0f5a1b80b Mon Sep 17 00:00:00 2001
>> From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
>> Date: Mon, 2 Dec 2024 15:51:54 +0200
>> Subject: [PATCH v5 1/4] Replace Latches with Interrupts
>
> Your email has only three attachements - I assume the fourth is something
> local that didn't need to be shared?

Yeah, it was a work-in-progress patch to convert config-reload requests
to the interrupts, to see how it works.

>> The Latch was a flag, with an inter-process signalling mechanism that
>> allowed a process to wait for the Latch to be set. My original vision
>> for Latches was that you could have different latches for different
>> things that you need to wait for, but in practice, almost all code
>> used one "process latch" that was shared for all wakeups. The only
>> exception was the "recoveryWakeupLatch". I think the reason that we
>> ended up just sharing the same latch for all wakeups is that it was
>> cumbersome in practice to deal with multiple latches. For starters,
>> there was no machinery for waiting for multiple latches at the same
>> time. Secondly, changing the "ownership" of a latch needed extra
>> locking or other coordination. Thirdly, each inter-process latch
>> needed to be initialized at postmaster startup time.
>>
>> This patch embraces the reality of how Latches were used, and replaces
>> the Latches with a per-process interrupt mask. You can no longer
>> allocate Latches in arbitrary shared memory structs and an interrupt
>> is always directed at a particular process, addressed by its
>> ProcNumber. Each process has a bitmask of pending interrupts in
>> PGPROC.
>
> That doesn't really remove the need for inter-process coordination to know
> what ProcNumber corresponds to what...

Right. Was there a comment that claims it does or something, or why do
you point that out?

>> This commit introduces two interrupt bits. INTERRUPT_GENERAL replaces
>> the general-purpose per-process latch. All code that previously set a
>> process's process latch now sets its INTERRUPT_GENERAL interrupt bit
>> instead.
>
> Half-formed-at-best thought: I wonder if we should split the "interrupt
> yourself in response to a signal" cases from "another process wakes you up",
> even in the initial commit. They seem rather different to me.

Hmm. Some interrupts are indeed clearly only sent within the process,
while others are sent across processes. Not sure where that distinction
would help though. Just group them together and have comments? Or what
did you have in mind?

There are other ways to group the interrupts. Some are harmless if
they're sent to a wrong process. Others are not. Some are handled by
CHECK_FOR_INTERRUTPS, others are not.

>> diff --git a/src/include/storage/interrupt.h b/src/include/storage/interrupt.h
>> new file mode 100644
>> index 0000000000..1da05369c3
>> --- /dev/null
>> +++ b/src/include/storage/interrupt.h
>
> So we now have postmaster/interrupt.[ch] and storage/interrupt.h /
> storage/ipc/interrupt.c.

In the attached version, I renamed the existing
postmaster/interrupt.[ch] to postmaster/interrupt_handlers.[ch]. I think
that helps.

Now that I look at the big picture again, though, I'm thinking that they
should perhaps be merged, so that there would be only
postmaster/interrupt.[ch]. That's how Thomas had them in the original
procsignal->interrupt patches. There's not much left of the old
postmaster/interrupt.[ch], and it's all related to interrupts anyway.

>> + * The "standard" set of interrupts is handled by CHECK_FOR_INTERRUPTS(), and
>> + * consists of tasks that are safe to perform at most times. They can be
>> + * suppressed by HOLD_INTERRUPTS()/RESUME_INTERRUPTS().
>> + *
>> + *
>> + * The correct pattern to wait for event(s) using INTERRUPT_GENERAL is:
>> + *
>> + * for (;;)
>> + * {
>> + * ClearInterrupt(INTERRUPT_GENERAL);
>> + * if (work to do)
>> + * Do Stuff();
>> + * WaitInterrupt(1 << INTERRUPT_GENERAL, ...);
>> + * }
>
> I don't particularly like that there's now dozens of places that need to do
> bit-shifting.
>
> One reason I don't like it is that, for example, this should actually be 1U <<
> INTERRUPT_GENERAL, and at some later point we might need it to be a 64bit
> integer due to a larger number of interrupts, which will require going to all
> extensions and updating them.
>
> Perhaps WaitInterrupt should accept just a single interrupt type and
> WaitEventSets should allow registering/reporting different interrupts to wait
> for as different events?
>
> Or perhaps we could introduce an interrupt mask type?

I changed the values of INTERRUPT_* to be bitmasks, so that you can do
e.g. "INTERRUPT_GENERAL | INTERRUPT_CONFIG_RELOAD" without any
bit-shifting. All the functions like ClearInterrupt, RaiseInterrupt,
SendInterrupt now work with a bitmask, it doesn't have to be just a
single flag anymore.

>> +extern PGDLLIMPORT pg_atomic_uint32 *MyPendingInterrupts;
>> +
>> +/*
>> + * Flags in the pending interrupts bitmask. Each value represents one bit in
>> + * the bitmask.
>> + */
>> +typedef enum
>
> Personally I prefer if we name the objects underlying typedefs, as otherwise
> some tools will label them "anonymous". And, while it doesn't matter for
> enums, which we can't forward declare in our version of C, it also is required
> to make forward declarations work.

Ok

>> +/*
>> + * Test an interrupt flag.
>> + */
>> +static inline bool
>> +InterruptIsPending(InterruptType reason)
>> +{
>> + return (pg_atomic_read_u32(MyPendingInterrupts) & (1 << reason)) != 0;
>> +}
>
> This has no memory ordering guarantees implied - is that correct? I think
> that could lead to missing interrupts in some cases.

I added pg_read_barrier() and pg_write_barrier() calls to these. I'm not
100% if that's really needed, or if I got them right, though.

> Given functions like ClearInterrupt() are <Verb><Object>, why differ here?
>
>
>> +/*
>> + * Test an interrupt flag.
>> + */
>> +static inline bool
>> +InterruptsPending(uint32 mask)
>> +{
>> + return (pg_atomic_read_u32(MyPendingInterrupts) & (mask)) != 0;
>> +}
>
> It seems somewhat confusing to have two naming patterns for this and the
> closely related InterruptIsPending().
>
> Maybe IsInterruptPending() and AreInterruptsPending()? Or
> IsInterruptPendingMask()?
>
> I think I like IsInterruptPendingMask() better, because we're not actually
> waiting for multiple interrupts, we're waiting for one out of multiple
> possible interrupts to arrive

There's now just a single function, InterruptPending(mask), which works
with a single bit or mask of several bits.

I agree it's not a great name, I just didn't get around to think about
that yet.

>> @@ -4476,7 +4458,10 @@ CheckPromoteSignal(void)
>> void
>> WakeupRecovery(void)
>> {
>> - SetLatch(&XLogRecoveryCtl->recoveryWakeupLatch);
>> + ProcNumber procno = ((volatile PROC_HDR *) ProcGlobal)->startupProc;
>> +
>> + if (procno != INVALID_PROC_NUMBER)
>> + SendInterrupt(INTERRUPT_RECOVERY_CONTINUE, procno);
>> }
>
> Not a new problem, and not really a problem for the startup process, that we
> don't expect to die, but: Code like this could obviously lead to interrupts
> being sent to the "former" owner of a procno.
>
> I think either the file header of SendInterrupt() should mention that risk and
> that code has to be aware of it.

Yeah, that's fair. As the patch stands, the rules are different for
different interrupts. Some interrupts are OK to send to random backends;
no harm done. But others, like INTERRUPT_DIE or INTERRUPT_QUERY_CANCEL,
are not. I think we need a mechanism to reliably send an interrupt to
the right backend, without race conditions if the backend exits
concurrently. Holding ProcArrayLock works, but that's a pretty big hammer.

> Should we explicitly define the interrupt state a process has at startup?

Makes sense. (Not done yet, it's a TODO)

>> +/*
>> + * Set an interrupt flag in this backend.
>> + */
>> +void
>> +RaiseInterrupt(InterruptType reason)
>
>> +{
>> + uint32 old_pending;
>> +
>> + old_pending = pg_atomic_fetch_or_u32(MyPendingInterrupts, 1 << reason);
>> +
>> + /*
>> + * If the process is currently blocked waiting for an interrupt to arrive,
>> + * and the interrupt wasn't already pending, wake it up.
>> + */
>> + if ((old_pending & (1 << reason | 1 << SLEEPING_ON_INTERRUPTS)) == 1 << SLEEPING_ON_INTERRUPTS)
>
> This is a somewhat hard to read line. Maybe it's worth breaking it out into
> two conditions? Or use a local variable?

It's a little better now that the bitshifts are gone, but still a fair
point..

> I was chatting with Heikki about this patch and he mentioned that he recalls a
> patch that did some work to unify the signal replacement, procsignal.h and
> CHECK_FOR_INTERRUPTS(). Thomas, that was probably from you? Do you have a
> pointer, if so?
>
> It does seem like we're going to have to do some unification here. We have too
> many different partially overlapping, partially collaborating systems here.
>
>
> - procsignals - kinda like the interrupts here, but not quite. Probably can't
> just merge them 1:1 into the proposed mechanism, or we'll run out of bits
> rather soon. I don't know if we want the relevant multiplexing to be built
> into interrupt.h or not.
>
> Or we ought to redesign this mechanism to deal with more than 32 types of
> interrupt.

In this patch, 29 out of 32 bits are used. Yeah, that doesn't leave much
headroom.

Some ideas for addressing that:

1. Use 64-bit atomics. Are there any architectures left where that would
not be acceptable?

2. Use 64-bit integers, but for the actual signaling part, split it into
two 32-bit atomic words. Waiting on the interrupts would need to check
both, but that seems fine from a performance point of view.

3. If we need > 64 bits, things get a bit awkward as you can no longer
have a single integer to represent a bitmask that includes all the bits.
That makes the function signatures more ugly, you need to have two
arguments like interruptMask1 and interruptMask2 or somehting. But
should basically work otherwise.

4. Multiplex the interrupts so that we need fewer of them. I think all
the recovery conflict interrupts could be merged into one, for example,
if we had a separate bitmask somewhere else in PGPROC to indicate which
ones are set. The limitation would be that if interrupts are multiplexed
together into one bit, you could only wait for all or none of them.

I'd like to not have to treat these bits as a scarce resource. It'd be
nice to use even more distinct interrupts than what's in the patch now
for different things, just for clarity. And I'd like to make it possible
for extensions to allocate interrupt bits too. (Not included in these
patches yet)

I'm leaning towards 1. or 2. at the moment. 64 bits should be enough for
a long time.

> - pmsignal.h - basically the same thing as here, except for signals sent *too*
> postmaster.
>
> It might or might not be required to keep this separate. There are different
> "reliability" requirements...

I think pmsignal.c could be rewritten to use interrupts fairly easily. I
didn't do that yet though. For that, postmaster needs to have a PGPROC
slot, or at least a special reserved ProcNumber, but it doesn't seem hard.

> - CHECK_FOR_INTERRUPTS(), which uses the mechanism introduced here to react to
> signals while blocked, using RaiseInterrupt() (whereas it did SetLatch()
> before).
>
> A fairly simple improvement would be to use different InterruptType for
> e.g. termination and query cancellation.

That's done now.

> But even with this infrastructure in place, we can't *quite* get away from
> signal handlers, because we need some way to set at the very least
> InterruptPending (and perhaps ProcDiePending, QueryCancelPending etc,
> although those could be replaced with InterruptIsPending()). The
> infrastructure introduced with these patches provides neither a way to set
> those variables in response to delivery of an interrupt, nor can we
> currently set them from another backend, as they are global variables.
>
> We could of course do InterruptsPending(~0) in in
> INTERRUPTS_PENDING_CONDITION(), but it would require analyzing the increased
> indirection overhead as well as the "timeliness" guarantees.
>
> Today we don't need a memory barrier around checking InterruptPending,
> because delivery of a signal delivery (via SetLatch()) ensures that. But I
> think we would need one if we were to not send signals for
> cancellation/terminations etc. I.e. right now the overhead of delivery is
> bigger, but checking if there pending signals is cheaper.

That's changed heavily in this latest patch. All the *Pending global
variables are gone, they are replaced with InterruptPending() calls.

> Another aspect of this is that we're cramming more and more code into
> ProcessInterrupts(), in a long weave of ifs. I wonder if somewhere along
> ipc/interrupt.h there should be a facility to register callbacks to react to
> delivered interrupts. It'd need to be a bit more than just a mapping of
> InterruptType to callback, because of things like InterruptHoldoffCount,
> CritSectionCount, QueryCancelHoldoffCount.

+1 for the general idea, but I didn't pursue it yet.

There are some changes to CHECK_FOR_INTERRUPTS() and the holdoff counts
here already though. The pattern for waiting for something now looks
like this:

for (;;)
{
CHECK_FOR_INTERRUPTS();

/*
* The baker will wake us up with INTERRUPT_GENERAL when the cake
* is ready.
*/
ClearInterrupt(INTERRUPT_GENERAL);
if (IsTheCakeBakedYet())
break;

WaitInterrupt(CheckForInterruptsMask | INTERRUPT_GENERAL, ...);
}

CheckForInterruptsMask includes INTERRUPT_BARRIER, INTERRUPT_DIE,
INTERRUPT_QUERY_CANCEL, and all the other interrupts that are handled by
CHECK_FOR_INTERRUPTS(). However, when you call HOLD_INTERRUPTS(),
INTERRUPT_DIE is removed from CheckForInterruptsMask.

(I'm not wedded to the name CheckForInterruptsMask. Thomas called them
"regular interrupts" in his patch. I think I used the term "standard
interrupts" somewhere.)

> Other questions:
>
> - Can ipc/interrupts.c style interrupts be sent by postmaster? I think they
> ought to before long.

Yes. The postmaster does that now to notify backends when background
workers are launched or terminated.

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachment Content-Type Size
v6-0001-Rename-postmaster-interrupt.c-to-interrupt_handle.patch text/x-patch 14.1 KB
v6-0002-Replace-Latches-with-Interrupts.patch text/x-patch 221.5 KB
v6-0003-Fix-lost-wakeup-issue-in-logical-replication-laun.patch text/x-patch 4.1 KB
v6-0004-Use-INTERRUPT_GENERAL-for-bgworker-state-change-n.patch text/x-patch 23.9 KB
v6-0005-Replace-ProcSignals-and-other-ad-hoc-signals-with.patch text/x-patch 230.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Sabino Mullane 2025-03-06 00:51:50 Re: JSON_VALUE() behavior when RETURNING bytea (expected base64 decoding)
Previous Message Sami Imseih 2025-03-06 00:46:45 Re: track generic and custom plans in pg_stat_statements