Re: what's going on with lapwing?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andrew Dunstan <adunstan(at)postgresql(dot)org>, pgbuildfarm(at)rjuju(dot)net
Subject: Re: what's going on with lapwing?
Date: 2025-03-04 16:02:51
Message-ID: CA+Tgmob3MyFzbtE0-D4Q6PsffbN_gG1=Cy1iWXncYqJpE4O=iQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 4, 2025 at 10:18 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> At the point where I complained to you about that other problem,
> it was looking like it might cause a quarter or a third of the
> buildfarm to fail intermittently. Maybe I overestimated the
> frequency of the failure, but if that was accurate it would have
> resulted in a lot of fruitless double-checking of failures to
> see if there was anything real underneath the noise. So I find
> that sort of case much more painful.

I think that's actually totally fair. I was not upset that you wanted
it fixed, or even that you wanted it fixed relatively quickly. The
things that I was upset about were:

1. There's no real way for me to avoid this kind of pain. That's not
your fault, but it is something that I think we need to address as a
community. As Jelte said on the other thread, other projects have
infrastructure that allows them to avoid these kinds of problems by
being able to do pre-commit testing. Having modern infrastructure for
stuff like this is an important part of attracting and retaining
developers.

2. Two hours is just not enough time. Never mind that people like to
have evenings and weekends off -- this is supposed to be a community
that operates by consensus. It doesn't seem right to spend months or
years discussing the design before committing, and then after commit,
boom, you have to make a unilateral decision about what to change
within -- not even hours, but minutes. Because you also need time for
BF results to show up, and then you need time to code and test
whatever you decided. I would actually be quite sympathetic to the
time frame here if we were immediately before a feature or release
freeze when there is no tolerance for error, but not in this
situation.

3. You ignored the substantive questions that I asked you and
commented only on the procedural issue of fixing the BF, even though
you were a previous participant in the discussion on that patch.

Maybe my tolerance for reverts is just lower than yours. I think it's
bad when somebody has a problem like this, insta-reverts, then tries
it again later after changing the patch, then maybe the same thing
happens again, gets insta-reverted a second time, then maybe the third
time the commit actually sticks. I think that clutters up the commit
history with a bunch of junk, and that junk is permanent. The
buildfarm being red is bad, but after it's fixed it will be green
again and the time for which it was red will have little enduring
impact. But everybody who tries to find stuff in the commit log is
potentially inconvenienced by reverts, forever. Commands like 'git log
FILENAME' or 'git log -Gstring' will now return extra, spurious hits.
If I absolutely have to choose between the BF being red for a couple
of days and revert ping-pong, I would prefer the former.

--
Robert Haas
EDB: http://www.enterprisedb.com
