Re: Race conditions with checkpointer and shutdown

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Race conditions with checkpointer and shutdown
Date: 2019-04-17 23:39:16
Message-ID: CA+hUKGJMA=hyQrZciEm-muYzYM-w1mkR9yWGCVq4Hjg2EBbZOQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Apr 17, 2019 at 10:45 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I think what we need to look for is reasons why (1) the postmaster
> never sends SIGUSR2 to the checkpointer, or (2) the checkpointer's
> main loop doesn't get to noticing shutdown_requested.
>
> A rather scary point for (2) is that said main loop seems to be
> assuming that MyLatch a/k/a MyProc->procLatch is not used for any
> other purposes in the checkpointer process. If there were something,
> like say a condition variable wait, that would reset MyLatch at any
> time during a checkpoint, then we could very easily go to sleep at the
> bottom of the loop and not notice that there's a pending shutdown request.

Agreed on the non-composability of that coding, but if there actually
is anything in that loop that can reach ResetLatch(), it's well
hidden...
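
To make the hazard under discussion concrete, here is a minimal sketch of how a shared latch can swallow a wakeup. It is a toy model, not the checkpointer code: ToyLatch, toy_request_shutdown(), toy_nested_wait() and pass_would_sleep_through_shutdown() are all hypothetical names, the "signal" is delivered by a plain function call, and timeouts are ignored. The only thing it assumes about the real loop is the shape described above: reset the latch at the top, check the flags, do the work, wait on the latch at the bottom.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a process latch; not the real Latch struct. */
typedef struct ToyLatch
{
    bool is_set;
} ToyLatch;

static ToyLatch my_latch;               /* plays the role of MyLatch */
static volatile bool shutdown_requested;

static void toy_set_latch(ToyLatch *l)   { l->is_set = true; }
static void toy_reset_latch(ToyLatch *l) { l->is_set = false; }

/* Models the signal handler: record the request, then wake the loop. */
static void
toy_request_shutdown(void)
{
    shutdown_requested = true;
    toy_set_latch(&my_latch);
}

/*
 * Hypothetical nested wait (for example a condition variable wait) that
 * uses the same latch and resets it, discarding any pending wakeup.
 */
static void
toy_nested_wait(void)
{
    toy_reset_latch(&my_latch);
}

/*
 * Runs one pass of a checkpointer-like loop in which the shutdown request
 * arrives while the checkpoint work is in progress.  Returns true if the
 * pass reaches the bottom with the request pending but the latch clear,
 * which is the state in which a latch wait would go to sleep.
 */
static bool
pass_would_sleep_through_shutdown(bool nested_wait_resets_latch)
{
    /* Top of loop: reset the latch, then look at the flags. */
    shutdown_requested = false;
    toy_reset_latch(&my_latch);
    assert(!shutdown_requested);        /* nothing pending yet */

    /* Checkpoint work starts; the shutdown request arrives now. */
    toy_request_shutdown();

    /* Something inside the work waits on the same latch. */
    if (nested_wait_resets_latch)
        toy_nested_wait();

    /* Bottom of loop: a wait on the latch blocks unless it is set. */
    return shutdown_requested && !my_latch.is_set;
}
```

In this model, pass_would_sleep_through_shutdown(true) reports the lost wakeup, while pass_would_sleep_through_shutdown(false) does not, because the latch is still set when the loop reaches its wait. Whether any real code path in the loop behaves like toy_nested_wait() is exactly the open question.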

--
Thomas Munro
https://enterprisedb.com
