Re: Reorder shutdown sequence, to flush pgstats later

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)postgresql(dot)org, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Reorder shutdown sequence, to flush pgstats later
Date: 2025-01-14 13:06:31
Message-ID: Z4ZhV8FBjT64+loW@ip-10-97-1-34.eu-west-3.compute.internal
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Tue, Jan 14, 2025 at 12:58:44AM -0500, Andres Freund wrote:
> Hi,
>
> On 2025-01-13 12:20:39 +0000, Bertrand Drouvot wrote:
> > > We have
> > > multiple copies of code to go to FatalError, that doesn't seem great.
> >
> > + * FIXME: This should probably not have a copy fo the code to
> > + * transition into FatalError state.
> > + */
> > + ereport(LOG,
> > + (errmsg("WAL was shut down unexpectedly")));
> >
> > Indeed, might be worth extracting this into a helper function or such.
>
> That is quite a rats nest of issues :(. There are quite a few oddities around
> the related code. The attached series contains a few commits trying to unify
> the paths transitioning to FatalError = true into the new HandleFatalError().

Thanks!

> The current code has the weird behaviour of going through PM_WAIT_BACKENDS. I
> left it like that for now. In fact more paths do so now, because we did so for
> PM_WAIT_XLOG_SHUTDOWN but *didn't* do so for PM_WAIT_XLOG_ARCHIVAL, which
> seems rather weird.
>
> I suspect it'd be better to switch directly to PM_DEAD_END and have
> HandleFatalError always do ConfigurePostmasterWaitSet(false), but that's left
> for another time.

That would probably make more sense, but would be more invasive and would
need careful testing. Worth a XXX comment in the code?

> After the earlier commit that just moves the code I turned the existing if ()
> into a switch so that whoever adds new states is told to look at that code by
> the compiler.

Good idea!

> Thoughts?
>

======== Patch 0001:

LGTM

======== Patch 0002:

It makes use of TerminateChildren() in HandleChildCrash(). TerminateChildren()
follows the exact same logic as the code being replaced (exclude syslogger,
StartupPMChild special case), so LGTM.

======== Patch 0003:

It replaces the take_action boolean flag with an early return.

One comment:

=== 1

+ TerminateChildren(send_abort_for_crash ? SIGABRT : SIGQUIT);

if (Shutdown != ImmediateShutdown)
FatalError = true;

I think we can avoid this remaining extra check and set FatalError to true
unconditionally (as we already know that Shutdown != ImmediateShutdown as per
the first check in the function).

======== Patch 0004:

One comment:

=== 2

As per comment === 1 above, then we could remove:

if (Shutdown != ImmediateShutdown)
FatalError = true;

from HandleFatalError(). And that would be up to the HandleFatalError() caller
to set FatalError to true (prior calling HandleFatalError()).

HandleFatalError becomes a pure "transition to error state" function then.
Does that make sense to you?

======== Patch 0005:

Comments:

=== 3

If so, then in this change:

- FatalError = true;
- UpdatePMState(PM_WAIT_DEAD_END);
- ConfigurePostmasterWaitSet(false);
-
- /* Kill the walsenders and archiver too */
- SignalChildren(SIGQUIT, btmask_all_except(B_LOGGER));
+ HandleFatalError(PMQUIT_FOR_CRASH, false);

we should keep the "FatalError = true" part.

=== 4

+ * XXX: Is it worth inventing a different PMQUIT value
+ * that signals that the cluster is in a bad state,
+ * without a process having crashed?
*/

That would be more "accurate". Something like PMQUIT_FORK_FAILURE or such.

> I suspect it'd be better to switch directly to PM_DEAD_END and have
> HandleFatalError always do ConfigurePostmasterWaitSet(false), but that's left
> for another time.

Before doing so, what do you think about:

1. keeping FatalError = true; as suggeted above
2. put back the UpdatePMState(PM_WAIT_DEAD_END) (prior to the HandleFatalError(
PMQUIT_FOR_CRASH, false) call
3. put the ConfigurePostmasterWaitSet(false) in the PM_WAIT_DEAD_END switch in
HandleFatalError()?

That's not ideal but that way we would keep the "current" behavior (i.e failure
to fork checkpointer transitions through PM_WAIT_DEAD_END) during the time the
"switch directly to PM_DEAD_END" is not implemented?

But that would also need to call TerminateChildren() (a second time) again after
ConfigurePostmasterWaitSet() which is not ideal though (unless we move the
TerminateChildren() calls in the switch or add a check on PM_WAIT_DEAD_END in
the first call).

======== Patch 0006:

The patch adds archiver and walsender to processes that should exit during
crash/immediate shutdown.

LGTM.

======== Patch 0007:

Comments:

=== 5

commit message says "Change shutdown sequence to terminate checkpointer last",
which is not 100% accurate.

While this one is:

"
Checkpointer now is terminated
after all children other than dead-end ones have been terminated,
"

=== 6 (Nit)

+ * Close down the database.

s/database/database system/?

Was already there before the patch, just suggesting in passing...

=== 7

+ * PM_WAIT_XLOG_ARCHIVAL is in proccess_pm_pmsignal(), in response to

typo: s/proccess_pm_pmsignal/process_pm_pmsignal/

=== 8

+ * Now that everyone important is gone

s/everyone important is/walsenders and archiver are also gone/?

=== 9

+ * in proccess_pm_child_exit().

typo: s/proccess_pm_child_exit/process_pm_child_exit/

=== 10

+ /*
+ * Doesn't seem likely to help to take send_abort_for_crash into
+ * account here.
+ */
+ HandleFatalError(PMQUIT_FOR_CRASH, false);

If comments === 2 and === 3 made sense to you then we should set FatalError
to true here prior the HandleFatalError call.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2025-01-14 13:13:21 Re: Recovering from detoast-related catcache invalidations
Previous Message jian he 2025-01-14 12:54:02 Support --include-analyze in pg_dump, pg_dumpall, pg_restore