From: | Peter Eisentraut <peter_e(at)gmx(dot)net> |
---|---|
To: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: XML with invalid chars |
Date: | 2011-04-26 22:02:59 |
Message-ID: | 1303855379.12063.10.camel@vanquo.pezone.net |
Lists: | pgsql-hackers |
On Mon, 2011-04-25 at 19:25 -0400, Andrew Dunstan wrote:
> I came across this today, while helping a customer. The following will
> happily create a piece of XML with an embedded ^A:
>
> select xmlelement(name foo, null, E'abc\x01def');
>
> Now, a ^A is totally forbidden in XML version 1.0, and allowed but only
> as "&#x1;" or equivalent in XML version 1.1, and not as a 0x01 byte
> (see <http://en.wikipedia.org/wiki/XML#Valid_characters>)
The best place to fix this might be in escape_xml() in xml.c. Since we
don't support XML 1.1 yet, just reject all invalid characters there
according to XML 1.0.
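For illustration, a minimal sketch of what such a check could look like, assuming the string has already been converted to Unicode code points. The ranges follow the Char production of XML 1.0; the function names are hypothetical and not taken from xml.c:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * XML 1.0 "Char" production:
 *   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
 */
bool
is_valid_xml10_char(uint32_t cp)
{
    return cp == 0x9 || cp == 0xA || cp == 0xD ||
        (cp >= 0x20 && cp <= 0xD7FF) ||
        (cp >= 0xE000 && cp <= 0xFFFD) ||
        (cp >= 0x10000 && cp <= 0x10FFFF);
}

/*
 * Return the index of the first code point that is not a valid
 * XML 1.0 character, or -1 if every code point is valid.
 * (Hypothetical helper; a real fix would raise an error instead.)
 */
ptrdiff_t
first_invalid_xml10_char(const uint32_t *cps, size_t n)
{
    assert(cps != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        if (!is_valid_xml10_char(cps[i]))
            return (ptrdiff_t) i;
    }
    return -1;
}
```

With the example from the original report, 'abc\x01def', the helper would point at index 3, which is where an error would be raised.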
Relevant bits from the SQL standard:
i)
Let CS be the character set of SQLT. Let XMLVRAW be the result of mapping SQLV to Unicode
using the implementation-defined mapping of character strings of CS to Unicode. If any Unicode
code point in XMLVRAW does not represent a valid XML character, then an exception condition
is raised: SQL/XML mapping error — invalid XML character.
[This is what you'd add.]
ii)
Let XMLV be XMLVRAW, with each instance of “&” (U+0026) replaced by “&amp;”, each
instance of “<” (U+003C) replaced by “&lt;”, each instance of “>” (U+003E) replaced by “&gt;”,
and each instance of Carriage Return (U+000D) replaced by “&#x0d;”.
[This is what it already does.]
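As a rough sketch of the replacement step in ii), assuming the input is a NUL-terminated string in an ASCII-compatible encoding such as UTF-8 (where these four bytes never occur inside a multibyte sequence). This is not the actual escape_xml() code; the function name and the use of malloc are assumptions made to keep the example standalone:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated copy of "in" with '&', '<', '>' and
 * carriage return replaced as described in rule ii).
 * The caller frees the result; NULL is returned if allocation fails.
 */
char *
escape_xml_sketch(const char *in)
{
    assert(in != NULL);

    /* The longest replacement, "&#x0d;", is 6 bytes per input byte. */
    size_t      len = strlen(in);
    char       *out = malloc(len * 6 + 1);
    char       *p = out;

    if (out == NULL)
        return NULL;

    for (const char *s = in; *s != '\0'; s++)
    {
        const char *rep = NULL;

        switch (*s)
        {
            case '&':
                rep = "&amp;";
                break;
            case '<':
                rep = "&lt;";
                break;
            case '>':
                rep = "&gt;";
                break;
            case '\r':
                rep = "&#x0d;";
                break;
        }

        if (rep != NULL)
        {
            size_t      n = strlen(rep);

            memcpy(p, rep, n);
            p += n;
        }
        else
            *p++ = *s;          /* all other bytes are copied unchanged */
    }
    *p = '\0';
    return out;
}
```

The sketch only covers the replacements; rejecting invalid characters per rule i) would have to happen before or during this loop.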