From: | Josh Kupershmidt <schmiddy(at)gmail(dot)com> |
---|---|
To: | pgsql-docs(at)postgresql(dot)org |
Subject: | Large SGML Cleanup |
Date: | 2010-11-03 02:56:26 |
Message-ID: | AANLkTi=1Sm9N3Khiued9UiMfdd_TKLimMiO9mCfHtL39@mail.gmail.com |
Lists: | pgsql-docs |
[Resending without the large attachment; it looks like the previous
attempt isn't going to make it.]
Hi all,
I've gone through the SGML documentation, trying to push the generated
HTML toward HTML 4.01 compliance. By far the most common problem I
found was block-level elements nested inside <para> nodes, which
results in invalid HTML.
A common idiom I encountered was SGML like this:
  <para>
   ...
   <simplelist>
    ...
   </simplelist>
   ...
  </para>
This SGML would then produce HTML which looked like this:
  <p>
   ...
   <table>
    ...
   </table>
   ...
  </p>
This HTML fails validation, because HTML 4.01 does not allow block-level
elements such as <table> inside <p>. The attached patch fixes all the
instances of this I could find by closing out each <para> node before a
list or table begins.
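For illustration, the restructured form of the idiom above would look
roughly like this. It is only a sketch of the approach described here
(close the <para>, emit the list, then open a new <para> for any
trailing text); the exact markup in the patch may differ from case to
case.

```sgml
<!-- Sketch only: text before the list gets its own paragraph -->
<para>
 ...
</para>
<!-- The list now sits beside the paragraphs instead of inside one -->
<simplelist>
 ...
</simplelist>
<!-- Any text that followed the list goes in a new paragraph -->
<para>
 ...
</para>
```

With the list no longer a child of <para>, the generated <table> should
end up as a sibling of the <p> elements rather than nested inside one.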
I used the w3c-markup-validator package and the web service at
validator.w3.org to test HTML validity. A handy Perl package I found
for this was WebService::Validator, which includes the example script
"validate_files_in_dir.pl" to easily validate a directory full of HTML
files. With this patch, the number of invalid HTML files drops from
many dozens to 16.
Patch at:
http://kupershmidt.org/pg/sgml_fixup.patch.gz
Josh