From: | Tomas Vondra <tomas(at)vondra(dot)me> |
---|---|
To: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
Cc: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de> |
Subject: | Re: Refactoring postmaster's code to cleanup after child exit |
Date: | 2024-12-09 12:47:49 |
Message-ID: | c576eddd-13da-4bad-a760-c98dfe9a314d@vondra.me |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 12/9/24 13:30, Heikki Linnakangas wrote:
> On 09/12/2024 01:12, Tomas Vondra wrote:
>> On 11/14/24 15:13, Heikki Linnakangas wrote:
>>> On 09/10/2024 23:40, Heikki Linnakangas wrote:
>>>> I pushed the first three patches, with the new test and one of the
>>>> small
>>>> refactoring patches. Thanks for all the comments so far! Here is a new
>>>> version of the remaining patches.
>>>>
>> Hi, the TAP test 001_connection_limits.pl introduced by 6a1d0d470e84
>> seems to have problems with valgrind :-( I reliably get this failure:
>
> How exactly do you run the test with valgrind? What platform?
>
It failed for me on both amd64 (Fedora 41) and rpi5 32/64-bit (Debian).
> It works for me, with this:
>
> (cd build && ninja && rm -rf tmp_install && meson test --suite setup &&
> valgrind --leak-check=no --gen-suppressions=all --suppressions=/home/
> heikki/git-sandbox/postgresql/src/tools/valgrind.supp --time-stamp=yes
> --error-markers=VALGRINDERROR-BEGIN,VALGRINDERROR-END --log-file=$HOME/
> pg-valgrind/%p.log --trace-children=yes meson test --suite postmaster )
>
I have a patch that tweaks pg_ctl/pg_regress to execute valgrind, so I
just do
./configure --enable-debug --prefix=/home/user/builds/master
--enable-depend --enable-cassert --enable-tap-tests CPPFLAGS="-O0 -ggdb3
-DUSE_VALGRIND"
and then the usual "make check" or whatever.
The patch has a hardcoded path to the .supp file, and places the
valgrind log into /tmp. It has worked for me fine up until that commit,
and it still seems to be working in every other test directory.
>> t/001_connection_limits.pl .. 3/? # Tests were run but no plan was
>> declared and done_testing() was not seen.
>> # Looks like your test exited with 29 just after 4.
>> t/001_connection_limits.pl .. Dubious, test returned 29 (wstat 7424,
>> 0x1d00)
>> All 4 subtests passed
>>
>>
>> and tmp_check/log/regress_log_001_connection_limits says:
>>
>>
>> [23:48:44.444](1.129s) ok 3 - reserved_connections limit
>> [23:48:44.445](0.001s) ok 4 - reserved_connections limit: matches
>> process ended prematurely at
>> /home/user/work/postgres/src/test/postmaster/../../../src/test/perl/
>> PostgreSQL/Test/BackgroundPsql.pm
>> line 154.
>> # Postmaster PID for node "primary" is 198592
>>
>>
>> That BackgroundPsql.pm line is this in wait_connect()
>>
>> $self->{run}->pump()
>> until $self->{stdout} =~ /$banner/ || $self->{timeout}->is_expired;
>>
>> By trial and error I found that it fails on this line 70:
>>
>> push(@sessions, background_psql_as_user('regress_superuser'));
>>
>> but I have no idea idea why. There are multiple similar calls a couple
>> lines earlier, and those work fine. And various other TAP tests with
>> background_sql() work fine too.
>>
>> So what's so special about this particular line?
>
> Weird. Valgrind makes everything slow; is it a timeout? Any other clues
> in the logs?
>
Yeah, weird.
Timeouts were the first thing I thought about, but it fails even if I
set PGCTLTIMEOUT/PG_TEST_TIMEOUT_DEFAULT to 3600. And it doesn't seem to
be waiting for anything for that long :-(
regards
--
Tomas Vondra
Attachment | Content-Type | Size |
---|---|---|
valgrind-master.patch | text/x-patch | 2.2 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Andrey M. Borodin | 2024-12-09 13:02:19 | Re: Sort functions with specialized comparators |
Previous Message | Heikki Linnakangas | 2024-12-09 12:30:25 | Re: Refactoring postmaster's code to cleanup after child exit |