Re: Reorder shutdown sequence, to flush pgstats later

From: Andres Freund <andres(at)anarazel(dot)de>
To: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)postgresql(dot)org, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Reorder shutdown sequence, to flush pgstats later
Date: 2025-01-14 05:58:44
Message-ID: 2mxuwhpf4ruxxfw2i57lyaqngx3uod7l53wwem6v6jumy24pum@n2obzjzeny6w
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2025-01-13 12:20:39 +0000, Bertrand Drouvot wrote:
> > We have
> > multiple copies of code to go to FatalError, that doesn't seem great.
>
> + * FIXME: This should probably not have a copy fo the code to
> + * transition into FatalError state.
> + */
> + ereport(LOG,
> + (errmsg("WAL was shut down unexpectedly")));
>
> Indeed, might be worth extracting this into a helper function or such.

That is quite a rats nest of issues :(. There are quite a few oddities around
the related code. The attached series contains a few commits trying to unify
the paths transitioning to FatalError = true into the new HandleFatalError().

The current code has the weird behaviour of going through PM_WAIT_BACKENDS. I
left it like that for now. In fact more paths do so now, because we did so for
PM_WAIT_XLOG_SHUTDOWN but *didn't* do so for PM_WAIT_XLOG_ARCHIVAL, which
seems rather weird.

I suspect it'd be better to switch directly to PM_DEAD_END and have
HandleFatalError always do ConfigurePostmasterWaitSet(false), but that's left
for another time.

After the earlier commit that just moves the code I turned the existing if ()
into a switch so that whoever adds new states is told to look at that code by
the compiler.

Thoughts?

> > Another thing I started worrying about is that the commit added
> > PM_WAIT_CHECKPOINTER *after* PM_WAIT_DEAD_END. However, we have a few places
> > where we directly jump to PM_WAIT_DEAD_END in fatal situations:

Attached patch is implemented that way.

Greetings,

Andres Freund

Attachment Content-Type Size
v4-0001-checkpointer-Request-checkpoint-via-latch-instead.patch text/x-diff 4.6 KB
v4-0002-postmaster-Don-t-open-code-TerminateChildren-in-H.patch text/x-diff 2.9 KB
v4-0003-postmaster-Don-t-repeatedly-transition-to-crashin.patch text/x-diff 2.4 KB
v4-0004-postmaster-Move-code-to-switch-into-FatalError-st.patch text/x-diff 4.2 KB
v4-0005-WIP-postmaster-Commonalize-FatalError-paths.patch text/x-diff 3.4 KB
v4-0006-postmaster-Adjust-which-processes-we-expect-to-ha.patch text/x-diff 2.8 KB
v4-0007-Change-shutdown-sequence-to-terminate-checkpointe.patch text/x-diff 18.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2025-01-14 05:59:04 Re: [PATCH] Hex-coding optimizations using SVE on ARM.
Previous Message jian he 2025-01-14 05:51:20 Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row