Re: Refactoring postmaster's code to cleanup after child exit

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Andres Freund <andres(at)anarazel(dot)de>, Noah Misch <noah(at)leadboat(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Tomas Vondra <tomas(at)vondra(dot)me>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Refactoring postmaster's code to cleanup after child exit
Date: 2025-03-06 20:49:20
Message-ID: fd5e9523-78d3-4270-86b2-fd1b1eeb4fc9@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

In short, all the 4 patches look good to me. Thanks for picking this up!

On 06/03/2025 22:16, Andres Freund wrote:
> On 2025-03-05 20:49:33 -0800, Noah Misch wrote:
>>> This behaviour makes it really hard to debug problems. It'd have been a lot
>>> easier to understand the problem if we'd seen psql's stderr before the test
>>> died.
>>>
>>> I guess that mean at the very least we'd need to put an eval {} around the
>>> ->pump() call., print $self->{stdout}, ->{stderr} and reraise an error?
>>
>> That sounds right.
>
> In the attached patch I did that for wait_connect(). I did verify that it
> works by implementing the wait_connect() fix before fixing
> 002_connection_limits.pl, which fails if a sleep(1) is added just before the
> proc_exit(1) for FATAL.

+1. For the archives sake, I just want to clarify that this pump stuff
is all about getting better error messages on a test failure. It doesn't
help with the original issue.

This is all annoyingly complicated, but getting good error messages is
worth it.

> On 2025-03-05 08:23:32 +0900, Michael Paquier wrote:>> Why not adding an injection point with a WARNING or a LOG generated,
then
>> check the server logs for the code path taken based on the elog() generated
>> with the point name?
>
> I think the log_min_messages approach is a lot simpler. If we need something
> like this more widely we can reconsider injection points...

+1. It's a little annoying to depend on a detail like the "client
backend process exited" debug message, but seems like the best fix for now.

> I also attached a patch to improve connect_fails()/connect_ok() test names a
> bit. They weren't symmetric and I felt they were lacking in detail for the
> psql return code check.

+1.

While we're at it, attached are a few more cleanups I noticed.

> Another annoying and also funny problem I saw is this failure:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2025-03-06%2009%3A18%3A21
> 2025-03-06 10:42:02.552 UTC [372451][postmaster][:0] LOG: 1800 s is outside the valid range for parameter "authentication_timeout" (1 s .. 600 s)
>
> I had to increase PG_TEST_TIMEOUT_DEFAULT due to some other test timing out
> when run under valgrind (due to having to insert a lot of rows). But then this
> test runs into the above issue.
>
> The easiest way seems to be to just limit PG_TEST_TIMEOUT_DEFAULT in this
> test.

LGTM

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachment Content-Type Size
0001-Fix-test-name-and-username-used-in-failed-connection.patch text/x-patch 1.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2025-03-06 20:50:46 Re: Statistics Import and Export
Previous Message Peter Eisentraut 2025-03-06 20:33:16 Re: dblink: Add SCRAM pass-through authentication