Re: Interrupts vs signals

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Interrupts vs signals
Date: 2024-08-25 23:05:46
Message-ID: CA+hUKG+6rn1FNxvO7Wuz_aAQ4MnGW5G6+UW4nBnzwnGnuBVWyw@mail.gmail.com
Lists: pgsql-hackers

On Sun, Aug 25, 2024 at 5:17 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> On 07/08/2024 17:59, Heikki Linnakangas wrote:
> > I'm also wondering about the relationship between interrupts and
> > latches. Currently, SendInterrupt sets a latch to wake up the target
> > process. I wonder if it should be the other way 'round? Move all the
> > wakeup code, with the signalfd, the self-pipe etc to interrupt.c, and in
> > SetLatch, call SendInterrupt to wake up the target process? Somehow that
> > feels more natural to me, I think.
>
> I explored that a little, see attached patch set. It's going towards the
> same end state as your patches, I think, but it starts from different
> angle. In a nutshell:
>
> Remove Latch as an abstraction, and replace all use of Latches with
> Interrupts. When I originally created the Latch abstraction, I imagined
> that we would have different latches for different purposes, but in
> reality, almost all code just used the general-purpose "process latch".
> This patch accepts that reality and replaces the Latch struct directly
> with the interrupt mask in PGPROC.

Some very initial reactions:

* I like it!

* This direction seems to fit quite nicely with future ideas about
asynchronous network I/O. That may sound unrelated, but imagine that
a future version of WaitEventSet is built on Linux io_uring (or
Windows iorings, or Windows IOCP, or kqueue), and waits for the kernel
to tell you that network data has been transferred directly into a
user space buffer. You could wait for the interrupt word to change at
the same time by treating it as a futex[1]. Then all that other stuff
-- signalfd, is_set, maybe_sleeping -- just goes away, and all we have
left is one single word in memory. (That it is possible to do that is
not really a coincidence, as our own Mr Freund asked Mr Axboe to add
it[2]. The existing latch implementation techniques could be used as
fallbacks, but when looked at from the right angle, once you squish
all the wakeup reasons into a single word, it's all just an
implementation of a multiplexable futex with extra steps.)

* Speaking of other problems in other threads that might be solved by
this redesign, I think I see the outline of some solutions to the
problem of different classes of wakeup which you can handle at
different times, using masks. There is a tension in a few places
where we want to handle some kind of interrupts but not others in
localised wait points, which we sort of try to address by holding
interrupts or holding cancel interrupts, but it's not satisfying and
there are some places where it doesn't work well. Needs a lot more
thought, but a basic step would be: after old_interrupt_vector =
pg_atomic_fetch_or_u32(interrupt_vector, new_bits), if
(old_interrupt_vector & new_bits) == new_bits, then you didn't
actually change any bits, so you probably don't really need to wake
the other backend. If someone is currently unable to handle that type
of interrupt (has ignored, ie not cleared, those bits) or is already
in the process of handling it (is currently being rescheduled but
hasn't cleared those bits yet), then you don't bother to wake it up.
Concretely, it could mean that we avoid some of the useless wakeup
storm problems we see in vacuum delays or while executing a query and
not in a good place to handle sinval wakeups, etc. These are just
some raw thoughts, I am not sure about the bigger picture of that
topic yet.
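
To make the fetch-or test above concrete, here is a minimal sketch. It
uses C11 <stdatomic.h> rather than PostgreSQL's pg_atomic API so that it
compiles on its own, and the bit names and the function name
raise_interrupt_bits are made up for illustration; none of this is taken
from the patch set.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical interrupt bits; the names are invented for this sketch. */
#define INTERRUPT_SINVAL  (UINT32_C(1) << 0)
#define INTERRUPT_CANCEL  (UINT32_C(1) << 1)

/*
 * Set new_bits in the target's interrupt word.
 *
 * Returns true if at least one of new_bits was not already set, meaning
 * the caller should go on to wake the target.  Returns false if every
 * requested bit was already pending, in which case a wakeup would
 * probably be redundant.
 */
static bool
raise_interrupt_bits(_Atomic uint32_t *interrupt_vector, uint32_t new_bits)
{
    uint32_t old_interrupt_vector = atomic_fetch_or(interrupt_vector, new_bits);

    return (old_interrupt_vector & new_bits) != new_bits;
}
```

A sender would call the real wakeup mechanism only when this returns
true. Whether skipping the wakeup is always safe depends on the receiver
reliably rechecking the word before it sleeps, which is part of the
"needs a lot more thought" above.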

* Archeological note on terminology: the reason almost every relational
database and all the literature use the term "latch" for something
like our LWLocks seems to be that latches were/are one of the kinds of
system-provided mutex on IBM System/370 and modern descendants, i.e.
z/OS. Oracle and other systems that started as knock-offs of the IBM
System R (the original SQL system, of which DB2 is the modern heir)
continued that terminology, even though they ran on VMS or Unix or
whatever. I would not be sad if we removed our unusual use of the
term latch.

[1] https://man7.org/linux/man-pages/man3/io_uring_prep_futex_wait.3.html
[2] https://lore.kernel.org/lkml/20230720221858.135240-1-axboe@kernel.dk/
