Re: Buildfarm feature request: some way to track/classify failures

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Buildfarm feature request: some way to track/classify failures
Date: 2007-03-20 02:14:32
Message-ID: 45FF4388.6060406@dunslane.net
Lists: pgsql-hackers

Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>
>> Tom Lane wrote:
>>
>>> Actually what I *really* want is something closer to "show me all the
>>> unexplained failures", but unless Andrew is willing to support some way
>>> of tagging failures in the master database, I suppose that won't happen.
>>>
>
>
>> Who would do the tagging, and how?
>>
>
> Well, that's the hard part isn't it? I was sort of envisioning a group
> of users who'd be authorized to log in and set tags on database entries
> somehow. I'm not sure about details. One issue is that the majority
> of failures come in batches (when one of us commits a bad patch).
> With the current web interface it would be really tedious to verify which
> of the failures in a particular time interval matched the symptoms of
> a given failure. What I did for my experiment this weekend was to download
> the last-stage-log of each failed build, which required an hour or so
> of setup time; then I could use grep to confirm which logs matched a
> failure that I'd identified. Doing that through the current webpage
> would involve lots of clicking and waiting. If we could expose a
> text-search-style API for grepping the stage logs, it'd be a lot easier
> to collect related failures. Then maybe a few widgets to let authorized
> users apply a tag to the search results ...
>
> I'm not entirely sure that this infrastructure would pay for itself,
> though. Without some users willing to take the time to separate
> explained from unexplained failures, it'd be a waste of effort.
> But we've already had a couple of cases of interesting failures going
> unnoticed because of the noise level. Between duplicate reports about
> busted patches and transient problems on particular build machines
> (out of disk space, misconfiguration, etc.) it's all too easy to miss
> the once-in-a-while failures. Is there some other way we could attack
> that problem?
>
>
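
For concreteness, the download-and-grep pass you describe could be
scripted along these lines. This is only a sketch: the member names are
made up, and the stage-log URL and its parameters are guesses rather
than the real buildfarm interface.

    import re
    import urllib.parse
    import urllib.request

    SYMPTOM = re.compile(r"server closed the connection unexpectedly")

    # (member, snapshot) pairs -- in practice these would come from
    # whatever list-of-failures the buildfarm exposes.
    failed_builds = [
        ("examplebird", "2007-03-17 02:00:02"),
        ("examplecat", "2007-03-17 03:30:01"),
    ]

    for member, snapshot in failed_builds:
        # Hypothetical log URL; the real path and parameters may differ.
        params = urllib.parse.urlencode({"nm": member, "dt": snapshot})
        url = "http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?" + params
        log = urllib.request.urlopen(url).read().decode("utf-8", "replace")
        if SYMPTOM.search(log):
            print(member, snapshot, "matches the symptom")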

I'm not too sanguine about having a team of eager taggers.

I think we probably need to work on a usable API for extracting data in
small or large amounts, and maybe some good text search facilities.
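
If the stage logs were queryable in the database, even a plain regex
match would cover a lot of it. A rough sketch of the kind of query I
have in mind (the table and column names here are invented for
illustration, not the real buildfarm schema):

    import psycopg2

    conn = psycopg2.connect("dbname=buildfarm")
    cur = conn.cursor()
    cur.execute(
        """
        SELECT sysname, snapshot, branch
          FROM build_stage_log               -- hypothetical table
         WHERE log_text ~ %s                 -- regex over the stage log
           AND snapshot BETWEEN %s AND %s
         ORDER BY snapshot
        """,
        (r"FATAL:  terminating connection", "2007-03-16", "2007-03-19"),
    )
    for sysname, snapshot, branch in cur.fetchall():
        print(sysname, snapshot, branch)

Authorized users could then tag everything such a query returns in one
go, rather than clicking through each build.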

The real issue is the one you identify: stuff getting lost in the
noise. But I'm not sure there's any realistic cure for that.

cheers

andrew
