From: | Noah Misch <noah(at)leadboat(dot)com> |
---|---|
To: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Improving tracking/processing of buildfarm test failures |
Date: | 2024-05-24 20:00:35 |
Message-ID: | 20240524200035.c2@rfd.leadboat.com |
Lists: | pgsql-hackers |
On Thu, May 23, 2024 at 02:00:00PM +0300, Alexander Lakhin wrote:
> I'd like to discuss ways to improve the buildfarm experience for anyone who
> are interested in using information which buildfarm gives to us.
>
> Unless I'm missing something, as of now there are no means to determine
> whether some concrete failure is known/investigated or fixed, how
> frequently it occurs and so on... From my experience, it's not that
> unbelievable that some failure occurred two years ago and lost in time was
> an indication of e. g. a race condition still existing in the code/tests
> and thus worth fixing. But without classifying/marking failures it's hard
> to find such or other interesting failure among many others...
I agree this is an area of difficulty in consuming buildfarm results. I have an
inefficient template for studying a failure, which your proposals would help:
**** grep recent -hackers for animal name
**** search the log for ~10 strings (e.g. "was terminated") to find the real indicator of where it failed
**** search mailing lists for that indicator
**** search buildfarm database for that indicator
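The log-search step of that template can be sketched as below. This is only an illustration, not an existing tool: the function name is hypothetical, and of the indicator strings only "was terminated" comes from the list above; the others are guesses at what such a list might contain.

```python
# Only "was terminated" is taken from the message; the other strings are
# illustrative assumptions about what a ~10-string indicator list might hold.
INDICATORS = ["was terminated", "TRAP:", "PANIC:", "timed out"]


def find_indicators(log_text, indicators=INDICATORS):
    """Return (line_number, indicator, line) for each log line that
    contains one of the indicator strings."""
    hits = []
    for lineno, line in enumerate(log_text.splitlines(), start=1):
        for indicator in indicators:
            if indicator in line:
                hits.append((lineno, indicator, line.strip()))
                break  # report each line once, under the first match
    return hits
```

The matched line can then serve as the search string for the mailing-list and buildfarm-database steps.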
> The first way to improve things I can imagine is to add two fields to the
> buildfarm database: a link to the failure discussion (set when the failure
> is investigated/reproduced and reported in -bugs or -hackers) and a commit
> id/link (set when the failure is fixed). I understand that it requires
I bet the hard part is getting data submissions, so I'd err on the side of
making this as easy as possible for submitters. For example, accept free-form
text for quick notes, not only URLs and commit IDs.
> modifying the buildfarm code, and adding some UI to update these fields,
> but it allows to add filters to see only unknown/non-investigated failures
> in the buildfarm web interface later.
>
> The second way is to create a wiki page, similar to "PostgreSQL 17 Open
> Items", say, "Known buildfarm test failures" and fill it like below:
> <url to failure1>
> <url to failure2>
> ...
> Useful info from the failure logs for reference
> ...
> <link to -hackers thread>
> ---
> This way is less invasive, but it would work well only if most of
> interested people know of it/use it.
> (I could start with the second approach, if you don't mind, and we'll see
> how it works.)
Certainly, your doing (2) can only help, though it may help less than (1).
I recommend considering what the buildfarm server could discover and publish
on its own. Examples:
- N members failed at the same step, in a related commit range. Those members
are now mostly green. Defect probably got fixed quickly.
- Log contains the following lines that are highly correlated with failure.
The following other reports, if any, also contained them.
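
As a rough sketch of how the two examples above might be computed, assume each report is a dict with hypothetical fields `id`, `member`, `step`, `status` and `log_lines`. These field names do not reflect the real buildfarm schema, and the "related commit range" condition is not modeled here.

```python
from collections import Counter, defaultdict


def members_failing_by_step(reports):
    """Map each failing step to the set of members that failed there."""
    by_step = defaultdict(set)
    for report in reports:
        if report["status"] == "fail":
            by_step[report["step"]].add(report["member"])
    return dict(by_step)


def failure_correlated_lines(reports, min_failures=2):
    """Return (line, count) pairs for log lines seen in at least
    min_failures failed reports and in no passing report."""
    fail_counts = Counter()
    seen_in_pass = set()
    for report in reports:
        lines = set(report["log_lines"])  # count each line once per report
        if report["status"] == "fail":
            fail_counts.update(lines)
        else:
            seen_in_pass |= lines
    return [
        (line, count)
        for line, count in fail_counts.most_common()
        if count >= min_failures and line not in seen_in_pass
    ]


def reports_sharing_line(reports, line):
    """Return ids of failed reports whose log contains the given line."""
    return [
        report["id"]
        for report in reports
        if report["status"] == "fail" and line in report["log_lines"]
    ]
```

"Never seen in a passing report" is a crude stand-in for "highly correlated with failure"; a real implementation would likely need a softer threshold and some normalization of PIDs and timestamps before comparing lines.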