pgsql: Fix waitpid() emulation on Windows.

From: Thomas Munro <tmunro(at)postgresql(dot)org>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Fix waitpid() emulation on Windows.
Date: 2023-03-15 00:45:08
Message-ID: E1pcFGO-003a3J-3D@gemulon.postgresql.org
Lists: pgsql-committers

Fix waitpid() emulation on Windows.

Our waitpid() emulation didn't prevent a PID from being recycled by the
OS before the call to waitpid(). The postmaster could finish up
tracking more than one child process with the same PID, and confuse
them.

Fix by moving the guts of pgwin32_deadchild_callback() into waitpid(),
so that resources are released synchronously. The process and PID
continue to exist until we close the process handle, which only happens
once we're ready to adjust our book-keeping of running children.
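
The race can be illustrated with a toy model. This is only a sketch, not
PostgreSQL or Windows code: ToyProc, toy_spawn(), toy_exit(),
toy_close_handle() and pid_collides() are hypothetical names, and the
"OS" here simply hands out the lowest PID that is neither running nor
pinned by an open handle. It assumes only what is stated above: a PID
stays reserved for as long as a handle to the process remains open.

```c
#include <stdbool.h>
#include <string.h>

/* Toy model only: every name below is hypothetical. */
#define MAX_PIDS 4

typedef struct
{
    bool running;       /* process is still executing */
    bool handle_open;   /* parent still holds a handle to it */
} ToyProc;

static ToyProc procs[MAX_PIDS];

/* Hand out the lowest PID that is neither running nor pinned by a handle. */
static int
toy_spawn(void)
{
    for (int pid = 0; pid < MAX_PIDS; pid++)
    {
        if (!procs[pid].running && !procs[pid].handle_open)
        {
            procs[pid].running = true;
            procs[pid].handle_open = true;
            return pid;
        }
    }
    return -1;          /* no free PID in this toy table */
}

static void
toy_exit(int pid)
{
    procs[pid].running = false;
}

static void
toy_close_handle(int pid)
{
    procs[pid].handle_open = false;
}

/*
 * Returns true if a newly started child receives a PID that the parent
 * is still tracking for an exited child it has not yet waited for.
 */
static bool
pid_collides(bool close_handle_early)
{
    memset(procs, 0, sizeof(procs));

    int first = toy_spawn();        /* parent starts tracking 'first' */

    toy_exit(first);
    if (close_handle_early)
        toy_close_handle(first);    /* asynchronous cleanup, before the wait */

    int second = toy_spawn();       /* started before 'first' is reaped */
    bool collision = (second == first);

    if (!close_handle_early)
        toy_close_handle(first);    /* synchronous cleanup, inside the wait */

    return collision;
}
```

In this model pid_collides(true) reports a collision, because the PID is
released while the parent still tracks it, whereas pid_collides(false)
does not, because the open handle pins the PID until the wait completes.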

This seems to explain a couple of failures on CI. It had never been
reported before, despite the code being as old as the Windows port.
Perhaps Windows started recycling PIDs more rapidly, or perhaps timing
changes due to commit 7389aad6 made it more likely to break.

Thanks to Alexander Lakhin for analysis and Andres Freund for tracking
down the root cause.

Back-patch to all supported branches.

Reported-by: Andres Freund <andres(at)anarazel(dot)de>
Discussion: https://postgr.es/m/20230208012852.bvkn2am4h4iqjogq%40awork3.anarazel.de

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/d41a178b3a7ac0c2ca16f70129899ddabc2ce468

Modified Files
--------------
src/backend/postmaster/postmaster.c | 70 +++++++++++++++++++++----------------
1 file changed, 40 insertions(+), 30 deletions(-)
