Re: ssl tests fail due to TCP port conflict

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Alexander Lakhin <exclusion(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ssl tests fail due to TCP port conflict
Date: 2024-07-08 19:40:37
Message-ID: a0a8a9b3-6600-46b7-845c-169a169dea71@dunslane.net
Lists: pgsql-hackers


On 2024-07-08 Mo 8:00 AM, Alexander Lakhin wrote:
> Hello,
>
> 07.06.2024 17:25, Tom Lane wrote:
>> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>>> I still think my patch to force TCP mode for the SSL test makes
>>> sense as well.
>> +1 to both things.  If that doesn't get the failure rate down to an
>> acceptable level, we can look at the retry idea.

I have pushed patches for both of those (i.e. start SSL test nodes in TCP
mode and change the range of ports we allocate server ports from).

I didn't see this email until after I had pushed them.

>
> I'd like to add that the kerberos/001_auth test also suffers from the
> port conflict, but slightly differently. Look for example at [1]:
> krb5kdc.log contains:
> Jul 02 09:29:41 andres-postgres-buildfarm-v5 krb5kdc[471964](info): setting up network...
> Jul 02 09:29:41 andres-postgres-buildfarm-v5 krb5kdc[471964](Error): Address already in use - Cannot bind server socket on 127.0.0.1.55853
> Jul 02 09:29:41 andres-postgres-buildfarm-v5 krb5kdc[471964](Error): Failed setting up a UDP socket (for 127.0.0.1.55853)
> Jul 02 09:29:41 andres-postgres-buildfarm-v5 krb5kdc[471964](Error): Address already in use - Error setting up network
>
> As far as I can see, the port for kdc is chosen by
> PostgreSQL::Test::Kerberos, via
> PostgreSQL::Test::Cluster::get_free_port(), which checks only for TCP
> port availability (with can_bind()), but not for UDP, so this increases
> the probability of the conflict for this test (a similar failure: [2]).
> Although we can also find a failure with TCP: [3]
>
> (It's not clear to me, what processes can use UDP ports while testing,
> but maybe those buildfarm animals are running on the same logical
> machine simultaneously?)
>
> [1]
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rorqual&dt=2024-07-02%2009%3A27%3A15
> [2]
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mylodon&dt=2024-05-15%2001%3A25%3A07
> [3]
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grassquit&dt=2024-07-04%2008%3A28%3A19
>
>

Let's see if this persists now that we are testing for free ports in a
different range, rather than the range usually used for ephemeral ports.
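
To illustrate the two ideas under discussion (picking ports outside the
ephemeral range, and checking UDP as well as TCP availability), here is a
rough Python sketch. It is not the code in PostgreSQL::Test::Cluster, which
is Perl. The names PORT_LOW, PORT_HIGH, can_bind and find_free_port are
made up for this example, and the bounds are an assumption: they simply sit
below the default Linux ephemeral range (32768-60999), not necessarily the
range the pushed patch uses.

```python
import random
import socket

# Assumed bounds, chosen only to stay below the default Linux ephemeral
# range (32768-60999). The range used by the actual patch may differ.
PORT_LOW = 10000
PORT_HIGH = 32767


def can_bind(host, port, sock_type):
    """Return True if a socket of the given type can bind to host:port."""
    with socket.socket(socket.AF_INET, sock_type) as s:
        try:
            s.bind((host, port))
        except OSError:
            return False
    return True


def find_free_port(host="127.0.0.1", attempts=100):
    """Pick a random port in the assumed range that is free for TCP and UDP."""
    for _ in range(attempts):
        port = random.randint(PORT_LOW, PORT_HIGH)
        tcp_ok = can_bind(host, port, socket.SOCK_STREAM)
        udp_ok = can_bind(host, port, socket.SOCK_DGRAM)
        if tcp_ok and udp_ok:
            return port
    raise RuntimeError("no free port found in the configured range")
```

A check like this only narrows the window: another process can still grab
the port between the probe and the moment the server (or krb5kdc) binds it,
so it reduces the failure rate rather than eliminating it.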

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
