Re: Opinion poll: Sending an automated email to a thread when it gets added to the commitfest

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>
Cc: Peter Eisentraut <peter(at)eisentraut(dot)org>, Euler Taveira <euler(at)eulerto(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Opinion poll: Sending an automated email to a thread when it gets added to the commitfest
Date: 2024-08-15 17:25:15
Message-ID: CAEze2Wi8hk2FkXg=CA_ZArpFDVaTs5BBG0FdoxCd8R3BeTAiAg@mail.gmail.com
Lists: pgsql-hackers

(sorry for the formatting, my mobile phone doesn't have the capabilities I
usually get when using my laptop)

On Thu, 15 Aug 2024, 16:02 Jelte Fennema-Nio, <postgres(at)jeltef(dot)nl> wrote:

> On Thu, 15 Aug 2024 at 15:33, Peter Eisentraut <peter(at)eisentraut(dot)org>
> wrote:
> > Maybe this kind of thing should rather be on the linked-to web page, not
> > in every email.
>
> Yeah, I'll first put a code snippet on the page for the commitfest entry.
>
> > But a more serious concern here is that the patches created by the cfbot
> > are not canonical. There are various heuristics when they get applied.
> > I would prefer that people work with the actual patches sent by email,
> > at least unless they know exactly what they are doing. We don't want to
> > create parallel worlds of patches that are like 90% similar but not
> > really identical.
>
> I'm not really sure what kind of heuristics and resulting differences
> you're worried about here. The heuristics it uses are very simple and
> are good enough for our CI. Basically they are:
> 1. Unzip/untar based on file extension
> 2. Apply patches using "patch" in alphabetic order
>
> Also, when I apply patches myself, I use heuristics too. And my
> heuristics are probably different from yours. So I'd expect that many
> people using the exact same heuristic would only make the situation
> better. Especially because if people don't know exactly what they are
> doing, then their heuristics are probably not as good as the one of
> our cfbot. I know I've struggled a lot the first few times when I was
> manually applying patches.

One serious issue with this is that in cases of apply failures, CFBot
delays, or other issues, the CFBot repo won't contain the latest version of
the series' patchsets. E.g. a hacker can accidentally send an incremental
patch, or an unrelated patch to fix an issue mentioned in the thread
without splitting into a different thread, etc. This can easily cause users
(and CFBot) to test and review the wrong patch, esp. when the reviewer
does not look at the mail thread proper, which a CFA+GitHub-centric
workflow would somewhat encourage.

Apart from the above issue, I'm -0.5 on what to me equates with automated
spam to -hackers: the volume of mails would put this around the 16th most
common sender on -hackers, with about 400 mails/year (based on 80 new
patches for next CF, and 5 CFs/year, combined with Robert's 2023 statistics
at [0]).

I also don't quite like the suggested contents of such mail: (1) and (2)
are essentially duplicative information, and because CF entries' IDs are
not shown in the app the "with ID 0000" part of (1) is practically useless
(better use the CFE's title), (3) would best be stored and/or integrated in
the CFA, as would (4). Additionally, (4) isn't canonical/guaranteed to be
up-to-date, see above. As for the "copy-pastable git commands" suggestion,
I'm not sure that's applicable, for the same reasons that (4) won't work
reliably. CFBot's repo to me seems more like an internal implementation
detail of CFBot than an authoritative source of patchset diffs.

Maybe we could instead automate CF mail thread registration by allowing
registration of threadless CF entries (as 'preliminary'), and detecting
(and subsequently linking) new threads containing references to those CF
entries, with e.g. a "CF: https://commitfest.postgresql.org/49/4980/"
directive in the text of the new thread's initial mail. This would have
the benefit of requiring no second mail for CF referencing purposes, be it
automated or manual.
Alternatively, we could allow threads for new entries to be started through
the CF app (which would automatically insert the right form data into the
mail), providing an alternative avenue to registering patches that doesn't
have the chicken-and-egg problem you're trying to address here.
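
To make the "CF:" directive idea a bit more concrete, here is a minimal
sketch of how the detection could work. It is not how the CF app works
today. The function name, the exact directive syntax (one unquoted line of
the form "CF: <entry URL>"), and the choice to ignore quoted lines are all
assumptions of mine; only the example URL shape comes from the paragraph
above.

```python
import re
from typing import Optional, Tuple

# Hypothetical directive syntax: a line that starts with "CF:" followed by a
# commitfest entry URL. Anchoring at line start means quoted copies
# ("> CF: ...") in replies are deliberately not matched.
CF_DIRECTIVE = re.compile(
    r"^CF:[ \t]*https://commitfest\.postgresql\.org/(\d+)/(\d+)/?[ \t]*$",
    re.MULTILINE,
)


def find_cf_entry(mail_body: str) -> Optional[Tuple[int, int]]:
    """Return (commitfest_id, entry_id) for the first CF: directive, or None.

    Hypothetical helper; the real app would additionally have to check that
    the referenced entry exists and is still threadless ('preliminary').
    """
    match = CF_DIRECTIVE.search(mail_body)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

An archive-side hook could call something like find_cf_entry() on the first
mail of each new thread and, on a match, link the thread to that entry.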

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

[0] https://rhaas.blogspot.com/2024/01/who-contributed-to-postgresql.html
