Re: Recent 027_streaming_regress.pl hangs

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Alexander Lakhin <exclusion(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Subject: Re: Recent 027_streaming_regress.pl hangs
Date: 2024-07-25 21:14:13
Message-ID: 1563098.1721942053@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> I'm confused by crake's buildfarm logs. AFAICS it is not running
> recovery-check at all in most of the runs; at least there is no
> mention of that step, for example here:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2024-07-25%2013%3A27%3A02

Oh, I see it: the log file that is called recovery-check in a
failing run is called misc-check if successful. That seems
mighty bizarre, and it's not how my own animals behave.
Something weird about the meson code path, perhaps?

Anyway, in this successful run:

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=crake&dt=2024-07-25%2018%3A57%3A02&stg=misc-check

here are some salient test timings:

  1/297 postgresql:pg_upgrade / pg_upgrade/001_basic          OK       0.18s  9 subtests passed
  2/297 postgresql:pg_upgrade / pg_upgrade/003_logical_slots  OK      15.95s  12 subtests passed
  3/297 postgresql:pg_upgrade / pg_upgrade/004_subscription   OK      16.29s  14 subtests passed
 17/297 postgresql:isolation / isolation/isolation            OK      71.60s  119 subtests passed
 41/297 postgresql:pg_upgrade / pg_upgrade/002_pg_upgrade     OK     169.13s  18 subtests passed
140/297 postgresql:initdb / initdb/001_initdb                 OK      41.34s  52 subtests passed
170/297 postgresql:recovery / recovery/027_stream_regress     OK     469.49s  9 subtests passed

while in the next, failing run

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=crake&dt=2024-07-25%2020%3A18%3A05&stg=recovery-check

the same tests took:

  1/297 postgresql:pg_upgrade / pg_upgrade/001_basic          OK       0.22s  9 subtests passed
  2/297 postgresql:pg_upgrade / pg_upgrade/003_logical_slots  OK      56.62s  12 subtests passed
  3/297 postgresql:pg_upgrade / pg_upgrade/004_subscription   OK      71.92s  14 subtests passed
 21/297 postgresql:isolation / isolation/isolation            OK     299.12s  119 subtests passed
 31/297 postgresql:pg_upgrade / pg_upgrade/002_pg_upgrade     OK     344.42s  18 subtests passed
159/297 postgresql:initdb / initdb/001_initdb                 OK     344.46s  52 subtests passed
162/297 postgresql:recovery / recovery/027_stream_regress     ERROR  840.84s  exit status 29

Based on this, it seems fairly likely that crake is simply timing out
as a consequence of intermittent heavy background activity.
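
For anyone who wants to repeat this comparison on other runs, here is a rough sketch (not part of the buildfarm tooling) that parses meson test-summary lines like the ones quoted above and prints the per-test slowdown factor. The regular expression and the helper names `parse_timings` and `slowdowns` are my own assumptions about the line layout, not an existing API.

```python
import re

# Meson test-summary lines copied from the two crake runs quoted above.
SUCCESSFUL_RUN = """\
1/297 postgresql:pg_upgrade / pg_upgrade/001_basic OK 0.18s 9 subtests passed
2/297 postgresql:pg_upgrade / pg_upgrade/003_logical_slots OK 15.95s 12 subtests passed
3/297 postgresql:pg_upgrade / pg_upgrade/004_subscription OK 16.29s 14 subtests passed
17/297 postgresql:isolation / isolation/isolation OK 71.60s 119 subtests passed
41/297 postgresql:pg_upgrade / pg_upgrade/002_pg_upgrade OK 169.13s 18 subtests passed
140/297 postgresql:initdb / initdb/001_initdb OK 41.34s 52 subtests passed
170/297 postgresql:recovery / recovery/027_stream_regress OK 469.49s 9 subtests passed
"""

FAILING_RUN = """\
1/297 postgresql:pg_upgrade / pg_upgrade/001_basic OK 0.22s 9 subtests passed
2/297 postgresql:pg_upgrade / pg_upgrade/003_logical_slots OK 56.62s 12 subtests passed
3/297 postgresql:pg_upgrade / pg_upgrade/004_subscription OK 71.92s 14 subtests passed
21/297 postgresql:isolation / isolation/isolation OK 299.12s 119 subtests passed
31/297 postgresql:pg_upgrade / pg_upgrade/002_pg_upgrade OK 344.42s 18 subtests passed
159/297 postgresql:initdb / initdb/001_initdb OK 344.46s 52 subtests passed
162/297 postgresql:recovery / recovery/027_stream_regress ERROR 840.84s exit status 29
"""

# Assumed layout: "<n>/<total> <suite> / <test> <STATUS> <seconds>s <rest>"
LINE_RE = re.compile(r"^\s*\d+/\d+\s+\S+\s+/\s+(\S+)\s+(\w+)\s+([\d.]+)s\b")


def parse_timings(text):
    """Map test name -> (status, elapsed seconds) for each recognizable line."""
    timings = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            name, status, secs = m.groups()
            timings[name] = (status, float(secs))
    return timings


def slowdowns(fast, slow):
    """Ratio of elapsed time (slow / fast) for tests present in both runs."""
    return {name: slow[name][1] / fast[name][1] for name in fast if name in slow}


ok = parse_timings(SUCCESSFUL_RUN)
bad = parse_timings(FAILING_RUN)
ratios = slowdowns(ok, bad)

# Print the worst slowdowns first.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:30s} {ok[name][1]:8.2f}s -> {bad[name][1]:8.2f}s"
          f"  x{ratio:.1f}  ({bad[name][0]})")
```

Every one of the seven tests comes out slower in the failing run, with initdb/001_initdb showing the largest factor. The slowdown is not specific to 027_stream_regress, which fits the background-load theory.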

regards, tom lane
