Re: Mangling mail archive "flat" links

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: PostgreSQL WWW <pgsql-www(at)postgresql(dot)org>
Subject: Re: Mangling mail archive "flat" links
Date: 2020-08-31 10:05:59
Message-ID: CABUevExmn8iy+Z3-Q5t_pVtQwL81nmH4YwOv8BjOhv2uZT__QA@mail.gmail.com
Lists: pgsql-www

On Mon, Aug 31, 2020 at 3:49 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:

> Hello,
>
> It would be very nice if the archives didn't corrupt URLs like the one
> at the bottom of this message:
>
>
> https://www.postgresql.org/message-id/CA%2BhUKGJ8NRsqgkZEnsnRc2MFROBV-jCnacbYvtpptK2A9YYp9Q%40mail.gmail.com
>
> I peeked in pgfilters.py and saw that there is a regular expression
> designed to avoid mangling archives URLs, but it apparently doesn't
> match the "flat" ones.
>

Yeah, that's clearly not great. I think this fix to the regex is the right
thing; it won't end up randomly missing other things now, will it?

-_re_mail = re.compile(r'(/m(essage-id)?/)?[^()<>@,;:\/\s"\'&|]+@[^()<>@,;:\/\s"\'&|]+')
+_re_mail = re.compile(r'(/m(essage-id)?/(flat/)?)?[^()<>@,;:\/\s"\'&|]+@[^()<>@,;:\/\s"\'&|]+')

(It does still work for the ones I tested, but just to be on the safe
side.)
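
For anyone who wants to check the change locally, here is a minimal sketch
comparing the two patterns. It assumes the separator in the pattern is a
literal "@" (the archive displays it as "(at)"). The hide_email() helper, the
(at)/(dot) rewriting and the sample message-id are hypothetical stand-ins for
illustration, not the actual pgfilters.py code.

```python
import re

# Patterns as quoted in the diff, assuming a literal "@" between the two halves.
RE_MAIL_OLD = re.compile(
    r'(/m(essage-id)?/)?[^()<>@,;:\/\s"\'&|]+@[^()<>@,;:\/\s"\'&|]+')
RE_MAIL_NEW = re.compile(
    r'(/m(essage-id)?/(flat/)?)?[^()<>@,;:\/\s"\'&|]+@[^()<>@,;:\/\s"\'&|]+')


def hide_email(text, pattern):
    """Hypothetical filter: obfuscate addresses, skip archive links."""
    def rewrite(match):
        # Group 1 is set only when the match starts with an archive path
        # prefix such as /message-id/ (or /message-id/flat/ in the new pattern).
        if match.group(1):
            return match.group(0)
        return match.group(0).replace('@', '(at)').replace('.', '(dot)')
    return pattern.sub(rewrite, text)


# Made-up message-id, used only for illustration.
flat_url = "https://www.postgresql.org/message-id/flat/abc123@example.com"

print(hide_email(flat_url, RE_MAIL_OLD))  # flat link gets mangled
print(hide_email(flat_url, RE_MAIL_NEW))  # flat link is left alone
print(hide_email("mail alice@example.com now", RE_MAIL_NEW))  # still hidden
```

With the old pattern the optional prefix group cannot consume "flat/", so the
match starts at the message-id itself and group 1 is empty; the new pattern
lets the prefix group cover "/message-id/flat/" as well.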

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
