Re: IPC::Run accepts bug reports

From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: IPC::Run accepts bug reports
Date: 2024-10-04 11:00:00
Message-ID: fb666566-32bb-9c36-9c2e-3949b7a061bc@gmail.com
Lists: pgsql-hackers

Hello Noah,

16.06.2024 02:48, Noah Misch wrote:
> I don't see in https://github.com/cpan-authors/IPC-Run/issues anything
> affecting PostgreSQL. If you know of IPC::Run defects, please report them.
> If I knew of an IPC::Run defect affecting PostgreSQL, I likely would work on
> it before absurdity like https://github.com/cpan-authors/IPC-Run/issues/175
> NetBSD-10-specific behavior coping.

It looks like a recent indri failure [1] revealed one more IPC::Run
anomaly. The failure log contains:
# Running: pgbench -n -t 1 -Dfoo=bla -Dnull=null -Dtrue=true -Done=1 -Dzero=0.0 -Dbadtrue=trueXXX -Dmaxint=9223372036854775807 -Dminint=-9223372036854775808 -M prepared -f /Users/buildfarm/bf-data/HEAD/pgsql.build/src/bin/pgbench/tmp_check/t_001_pgbench_with_server_main_data/001_pgbench_error_sleep_undefined_variable
[22:38:14.887](0.014s) ok 362 - pgbench script error: sleep undefined variable status (got 2 vs expected 2)
[22:38:14.887](0.000s) ok 363 - pgbench script error: sleep undefined variable stdout /(?^:processed: 0/1)/
[22:38:14.887](0.000s) not ok 364 - pgbench script error: sleep undefined variable stderr /(?^:sleep: undefined variable)/
[22:38:14.887](0.000s)
[22:38:14.887](0.000s) #   Failed test 'pgbench script error: sleep undefined variable stderr /(?^:sleep: undefined variable)/'
#   at t/001_pgbench_with_server.pl line 1242.
[22:38:14.887](0.000s) #                   ''
#     doesn't match '(?^:sleep: undefined variable)'

So the pgbench process exited as expected and stdout contained the
expected string, but stderr turned out to be empty.

Maybe such behavior is specific to macOS, and even on indri it's the
only failure of that ilk out of 2000+ runs (and I couldn't reproduce
this in a Linux VM), but I find this place in IPC::Run suspicious:
sub _read {
...
    my $r = POSIX::read( $_[0], $s, 10_000 );
    croak "$!: read( $_[0] )" if not($r) and !$!{EINTR};

That is, EINTR is recognized as a sort of expected error, but the read is
not retried in that case. Thus, with the following modification, which
simulates read() failing with EINTR:
 sub _read {
     confess 'undef' unless defined $_[0];
     my $s = '';
-    my $r = POSIX::read( $_[0], $s, 10_000 );
+    my $r;
+if (int(rand(100)) == 0)
+{
+   $r = 0;  $! = Errno::EINTR;
+}
+else
+{
+    $r = POSIX::read( $_[0], $s, 10_000 );
+}
     croak "$!: read( $_[0] )" if not($r) and !$!{EINTR};

I can see failures like the one in question when running that test.
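
To illustrate what a retry could look like, here is a minimal sketch. It is not IPC::Run's code and not a proposed patch: read_retrying_eintr and its injectable $read argument are hypothetical names introduced only for this example, and the fake reader simulates a single EINTR in the same spirit as the modification above.

```perl
use strict;
use warnings;
use POSIX ();
use Errno qw(EINTR);

# Hypothetical helper (not part of IPC::Run): repeat the read until it
# either succeeds or fails with an error other than EINTR.
sub read_retrying_eintr {
    my ( $fd, $read ) = @_;
    $read ||= \&POSIX::read;    # assumption: same call that _read uses
    while (1) {
        my $s = '';
        my $r = $read->( $fd, $s, 10_000 );
        return $s if defined $r;    # data, or "0 but true" at EOF
        next if $!{EINTR};          # interrupted by a signal: try again
        die "$!: read( $fd )";
    }
}

# Demonstration: a reader that fails once with EINTR, then reads for real.
pipe( my $rd, my $wr ) or die "pipe: $!";
defined syswrite( $wr, "sleep: undefined variable\n" ) or die "write: $!";
close $wr;

my $calls = 0;
my $flaky = sub {
    if ( $calls++ == 0 ) {
        $! = EINTR;    # simulated interruption, no real signal involved
        return;        # undef, as POSIX::read returns on failure
    }
    return POSIX::read( $_[0], $_[1], $_[2] );
};

my $out = read_retrying_eintr( fileno($rd), $flaky );
print $out;
```

With a retry of this kind the simulated interruption no longer causes the stderr text to be lost; whether the real _read can simply loop like this depends on how its callers treat an empty result, which I have not checked.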

Perhaps I could reproduce the issue with a program that sends signals
to running (pgbench) processes (and thus interrupts read()), if that
makes sense.

What do you think?

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=indri&dt=2024-10-02%2002%3A34%3A16

Best regards,
Alexander
