From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Victor Yegorov <vyegorov(at)gmail(dot)com>, Alexander Kukushkin <cyberdemn(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Autovacuum worker doesn't immediately exit on postmaster death
Date: 2020-12-11 02:02:51
Message-ID: CA+hUKG+pHf0NXxAJAs=wZW5cpMUw++gmdb+e=cAy0vEoN9hB8w@mail.gmail.com
Lists: pgsql-hackers
On Fri, Dec 11, 2020 at 8:34 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Oct 29, 2020 at 5:36 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> > Maybe instead of thinking specifically in terms of vacuum, we could
> > count buffer accesses (read from kernel) and check the latch once every
> > 1000th such, or something like that. Then a very long query doesn't
> > have to wait until it's run to completion. The cost is one integer
> > addition per syscall, which should be bearable.
>
> Interesting idea. One related case is where everything is fine on the
> server side but the client has disconnected and we don't notice that
> the socket has changed state until something makes us try to send a
> message to the client, which might be a really long time if the
> server's doing like a lengthy computation before generating any rows.
> It would be really nice if we could find a cheap way to check for both
> postmaster death and client disconnect every now and then, like if a
> single system call could somehow answer both questions.
For the record, an alternative approach was proposed[1] that
periodically checks for disconnected sockets using a timer; a detected
disconnection then causes the next CFI() to abort.
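(Very roughly, the shape is a repeating timer whose handler probes the
socket and sets a flag for the next CFI()-style check point to notice. A
much simplified sketch, with invented names, plain setitimer() rather than
the backend's timeout machinery, and MSG_DONTWAIT assumed available:)

#include <signal.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>

static volatile sig_atomic_t client_connection_lost = false;
static int probe_fd = -1;

static void
connection_check_handler(int signo)
{
    char c;

    /* recv() is async-signal-safe; 0 bytes on MSG_PEEK means peer closed. */
    if (recv(probe_fd, &c, 1, MSG_PEEK | MSG_DONTWAIT) == 0)
        client_connection_lost = true;
}

static void
start_connection_check_timer(int fd, long interval_ms)
{
    struct sigaction sa;
    struct itimerval it;

    probe_fd = fd;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = connection_check_handler;
    sigaction(SIGALRM, &sa, NULL);

    memset(&it, 0, sizeof(it));
    it.it_interval.tv_sec = interval_ms / 1000;
    it.it_interval.tv_usec = (interval_ms % 1000) * 1000;
    it.it_value = it.it_interval;
    setitimer(ITIMER_REAL, &it, NULL);      /* repeating timer */
}

static void
check_point_sketch(void)
{
    if (client_connection_lost)
    {
        /* abort the query; in the backend this would be an ereport() */
    }
}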
Doing the check (a syscall) based on elapsed time rather than every
nth CFI() or buffer access seems better in some ways,
considering the difficulty of knowing what the frequency will be. One
of the objections was that it added unacceptable setitimer() calls.
We discussed an idea to solve that problem generally, and then later I
prototyped that idea in another thread[2] about idle session timeouts
(not sure about that yet, comments welcome).
I've also wondered about checking postmaster_possibly_dead in CFI() on
platforms where we have it (and working to increase that set of
platforms), instead of just reacting to PM death when sleeping. But
it seems like the real problem in this specific case is the use of
pg_usleep() where WaitLatch() should be used, no?
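(That is, the sleep could look more like the fragment below; a sketch only,
with the millisecond delay and the wait event name as placeholders:)

/* Instead of pg_usleep(msec * 1000L): react to the latch and to PM death. */
(void) WaitLatch(MyLatch,
                 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
                 msec,
                 WAIT_EVENT_PG_SLEEP);      /* placeholder wait event */
ResetLatch(MyLatch);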
The recovery loop is at the opposite end of the spectrum: while vacuum
doesn't check for postmaster death often enough, the recovery loop
checks potentially hundreds of thousands or millions of times per
second, which sucks on systems that don't have parent-death signals
and slows down recovery quite measurably. In the course of the
discussion about fixing that[3] we spotted other places that are using
a pg_usleep() where they ought to be using WaitLatch() (which comes
with exit-on-PM-death behaviour built-in). By the way, the patch in
that thread does almost what Robert described, namely check for PM
death every nth time (which in this case means every nth WAL record),
except it's not in the main CFI(), it's in a special variant used just
for recovery.
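(In other words, something with this shape; the function name and the
constant are invented here, but PostmasterIsAlive() and exit(1) are what
the startup process already uses:)

/* Called once per WAL record during recovery. */
static inline void
recovery_check_for_interrupts(void)
{
    static uint32 pm_check_counter = 0;

    /* Pay for the postmaster-alive check only every Nth record. */
    if (++pm_check_counter >= 1024)
    {
        pm_check_counter = 0;
        if (!PostmasterIsAlive())
            exit(1);
    }
    /* ... other per-record interrupt handling ... */
}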
[1] https://www.postgresql.org/message-id/flat/77def86b27e41f0efcba411460e929ae%40postgrespro.ru
[2] https://www.postgresql.org/message-id/flat/763A0689-F189-459E-946F-F0EC4458980B@hotmail.com
[3] https://www.postgresql.org/message-id/flat/CA+hUKGK1607VmtrDUHQXrsooU=ap4g4R2yaoByWOOA3m8xevUQ@mail.gmail.com