Re: Broken links in mailinglist archive due to percent-encoding

From: Erik Wienhold <ewie(at)ewie(dot)name>
To: "pgsql-www(at)lists(dot)postgresql(dot)org" <pgsql-www(at)lists(dot)postgresql(dot)org>
Subject: Re: Broken links in mailinglist archive due to percent-encoding
Date: 2023-09-14 22:05:17
Message-ID: 1805604733.531229.1694729117448@office.mailbox.org
Lists: pgsql-www

On 29/08/2023 21:38 CEST Erik Wienhold <ewie(at)ewie(dot)name> wrote:

> It looks like the archive percent-encodes subcomponent delimiters in the query
> component. Perhaps the encoding is allowed and it's just git.postgresql.org
> that can't handle it. But I'm pretty sure that links to git.postgresql.org
> from the archive worked in the past.

I've been digging around a bit more because this is an odd bug.

Turns out it's the result of applying Django's urlize filter to the message
body [1]:

>>> from django.template.defaultfilters import urlize
>>> urlize('http://example.net/foo?bar=baz;abc=123')
'<a href="http://example.net/foo?bar=baz%3Babc%3D123" rel="nofollow">http://example.net/foo?bar=baz;abc=123</a>'

Looks like a bug in Django because it does not percent-encode any sub-delimiters
outside the query component:

>>> urlize('http://example.net/foo;bar=baz')
'<a href="http://example.net/foo;bar=baz" rel="nofollow">http://example.net/foo;bar=baz</a>'
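
The same encoding can be reproduced with the standard library alone. This is only a sketch under an assumption: that the query component is parsed into pairs with ampersand as the only separator and then serialized again. The helper name reencode_query is hypothetical and is not Django's code; the explicit separator argument needs a recent Python 3 release.

```python
from urllib.parse import parse_qsl, urlencode


def reencode_query(query: str) -> str:
    """Hypothetical helper: parse a query string into pairs and re-serialize it."""
    # With '&' as the only pair separator, 'bar=baz;abc=123' is read as a
    # single pair: key 'bar' with value 'baz;abc=123'.
    pairs = parse_qsl(query, keep_blank_values=True, separator="&")
    # urlencode() percent-encodes ';' and '=' inside the value.
    return urlencode(pairs)


print(reencode_query("bar=baz;abc=123"))  # bar=baz%3Babc%3D123
```

If that assumption holds, the semicolon is never seen as a delimiter, so it ends up percent-encoded as part of the value.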

And regarding git.postgresql.org: gitweb generates URLs with semicolon as the
separator of query pairs [2] instead of using ampersand, although semicolon is
no longer recommended by the W3C. But gitweb also accepts query components that
use ampersand instead of semicolon, which is why links [1] and [3] work after
I manually replaced all semicolons with ampersands.
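
The manual replacement can be sketched in a few lines. The function name and the example URL are made up for illustration; only the query component is touched, so semicolons in the path or fragment stay as they are.

```python
from urllib.parse import urlsplit, urlunsplit


def semicolons_to_ampersands(url: str) -> str:
    """Hypothetical helper: use '&' instead of ';' between query pairs."""
    parts = urlsplit(url)
    # Rewrite only the query component; path and fragment are left untouched.
    return urlunsplit(parts._replace(query=parts.query.replace(";", "&")))


# Made-up gitweb-style URL, not a real repository.
url = "https://git.example.org/gitweb/?p=repo.git;a=blob;f=README;hb=HEAD#l64"
print(semicolons_to_ampersands(url))
# https://git.example.org/gitweb/?p=repo.git&a=blob&f=README&hb=HEAD#l64
```

This relies on gitweb accepting ampersand-separated pairs, as described above.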

[1] https://git.postgresql.org/gitweb/?p=pgarchives.git&a=blob&f=django/archives/mailarchives/templates/_message.html&h=c90a80afea418fc4800ae81bb517978fa56f7a4d&hb=HEAD#l64
[2] https://git.kernel.org/pub/scm/git/git.git/tree/gitweb/gitweb.perl#n1505
[3] https://git.postgresql.org/gitweb/?p=postgresql.git&a=blob&f=src/bin/psql/describe.c&h=bac94a338cfbc497200f0cf960cbabce2dadaa33&hb=9b581c53418666205938311ef86047aa3c6b741f#l1420

--
Erik
