Re: Refactoring postmaster's code to cleanup after child exit

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Refactoring postmaster's code to cleanup after child exit
Date: 2024-08-09 21:13:37
Message-ID: e9e58efa-a364-4f4f-b112-a11d62af952a@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 08/08/2024 13:47, Thomas Munro wrote:
> On Windows, if a child process exits with ERROR_WAIT_NO_CHILDREN, it's
> now logged with that exit code, instead of 0. Also, if a bgworker
> exits with ERROR_WAIT_NO_CHILDREN, it's now treated as crashed and is
> restarted. Previously it was treated as a normal exit.
>
> Interesting. So when that error was first specially handled in this thread:
>
> https://www.postgresql.org/message-id/flat/AANLkTimCTkNKKrHCd3Ot6kAsrSS7SeDpOTcaLsEP7i%2BM%40mail.gmail.com#41f60947571b75377f04af67ba6baf40
>
> ... it went from being considered a crash, to being considered like
> exit(0). It's true that shared memory can't be corrupted by a process
> that never enters main(), but it's better not to hide the true reason
> for the failure (if it is still possible -- I don't find many
> references to that phenomenon in recent times). Clobbering exitstatus
> with 0 doesn't seem right at all, now that we have background workers
> whose restart behaviour is affected by that.

I adjusted this ERROR_WAIT_NO_CHILDREN a little more, to avoid logging
the death of the child twice in some cases.

>> * v3-0003-Fix-comment-on-processes-being-kept-over-a-restar.patch
>
> +1

Committed the patches up to and including this one, with tiny comment
changes.

>> * v3-0004-Consolidate-postmaster-code-to-launch-background-.patch
>
> Much of the code in process_pm_child_exit() to launch replacement
> processes when one exits or when progressing to next postmaster state
> was unnecessary, because the ServerLoop will launch any missing
> background processes anyway. Remove the redundant code and let
> ServerLoop handle it.

I'm going to work a little more on the comments on this one before
committing; I had just moved all the "If we have lost the XXX, try to
start a new one" comments as is, but they look pretty repetitive now.

Thanks for the review!

--
Heikki Linnakangas
Neon (https://neon.tech)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2024-08-09 21:42:29 Re: Remaining dependency on setlocale()
Previous Message Peter Geoghegan 2024-08-09 21:13:28 Re: Adding skip scan (including MDAM style range skip scan) to nbtree