Re: nested <a> tags in glossary entries in html docs

From: Erik Wienhold <ewie(at)ewie(dot)name>
To: Anton Voloshin <a(dot)voloshin(at)postgrespro(dot)ru>
Cc: pgsql-docs <pgsql-docs(at)lists(dot)postgresql(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: nested <a> tags in glossary entries in html docs
Date: 2024-04-12 19:56:04
Message-ID: chpdkl4yex3fvv7od5w23rrazyuimj63epiywfr2t5vwefhury@4x4ps6dvzatm
Lists: pgsql-docs

On 2024-04-12 18:29 +0200, Anton Voloshin wrote:
> In REL_13_STABLE and above, the generated HTML is broken: it has nested <a
> href="..."> tags for all links to the glossary. Somehow, this results in
> duplicated <a> tags on https://www.postgresql.org/docs/
>
> Found by tab-navigating https://www.postgresql.org/docs/16/rowtypes.html
> where we see (spacing added to avoid line wraps):
> ... create a <a class="glossterm"
> href="glossary.html#GLOSSARY-DOMAIN"></a><a class="glossterm"
> href="glossary.html#GLOSSARY-DOMAIN"
> title="Domain">domain</a> over the composite type ...
>
> So, empty <a>, and then the real <a>. This resulted in stopping twice on the
> "domain" link (right before, and then on the
> "domain" word itself) while tab-navigating.

There's this bug[1] in the DocBook XSLT stylesheets. Looks like the
fix[2] landed in 1.79.2 (latest version on Arch, matching the latest
snapshot on GitHub from 2020-06-03) because I can see the change in
/usr/share/xml/docbook/xsl-stylesheets-1.79.2-nons/html/inline.xsl and
/usr/share/xml/docbook/xsl-stylesheets-1.79.2-nons/xhtml/inline.xsl.
But I still get those nested <a> with a simple test:

<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<book>
<title>Test</title>
<glosslist>
<glossentry id="glossary-a">
<glossterm>A</glossterm>
<glossdef>
<para>
<glossterm linkend="glossary-b">B</glossterm>
</para>
</glossdef>
</glossentry>
<glossentry id="glossary-b">
<glossterm>B</glossterm>
<glossdef>
<para>
Lorem ipsum&hellip;
</para>
</glossdef>
</glossentry>
</glosslist>
</book>

Generating the XHTML with

xsltproc --nonet /usr/share/xml/docbook/xsl-stylesheets-1.79.2-nons/xhtml/docbook.xsl test.sgml | grep '</a></em></a>'

gives me

<a class="glossterm" href="#glossary-b"><em class="glossterm"><a class="glossterm" href="#glossary-b" title="B">B</a></em></a>
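To check generated pages for this pattern without relying on the exact grep string, something like the following might help. It is only a sketch using Python's standard html.parser; the class and function names (NestedAnchorFinder, find_nested_anchors) are made up for illustration and are not part of any PostgreSQL or DocBook tooling.

```python
from html.parser import HTMLParser


class NestedAnchorFinder(HTMLParser):
    """Record every <a> start tag that opens while another <a> is still open."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # number of <a> elements currently open
        self.nested = []  # (line, column, href) for each nested <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            if self.depth > 0:
                line, col = self.getpos()
                self.nested.append((line, col, dict(attrs).get("href")))
            self.depth += 1

    def handle_endtag(self, tag):
        # Ignore stray </a> so the depth never goes negative.
        if tag == "a" and self.depth > 0:
            self.depth -= 1


def find_nested_anchors(markup):
    """Return a list of (line, column, href) for <a> tags nested inside <a>."""
    finder = NestedAnchorFinder()
    finder.feed(markup)
    finder.close()
    return finder.nested


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python find_nested_anchors.py html/*.html
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as f:
            for line, col, href in find_nested_anchors(f.read()):
                print(f"{path}:{line}:{col}: nested <a href={href!r}>")
```

Unlike a browser, this parser does not repair the markup, so it sees the nesting as written in the file. Browsers close the outer <a> as soon as the inner one starts, which would explain the empty <a> followed by the real one that Anton saw.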

> Not sure about how to fix this (don't really know docbook).

My XSLT skills are quite rusty, but maybe it's possible to omit the
outer <a class="glossterm"> and just emit <em class="glossterm"> and its
child <a> in our stylesheets.

[1] https://github.com/docbook/xslt10-stylesheets/issues/24
[2] https://github.com/docbook/xslt10-stylesheets/commit/c242ce2b8c1a5ebfdb2e719f788367bb1ddee8ea

--
Erik
