From: | Florian Pflug <fgp(at)phlo(dot)org> |
---|---|
To: | Radosław Smogura <rsmogura(at)softperience(dot)eu> |
Cc: | PG Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Review of patch Bugfix for XPATH() if expression returns a scalar value |
Date: | 2011-06-29 20:26:39 |
Message-ID: | 6A2C047F-7EE4-430D-9813-A1B215D81D56@phlo.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Jun29, 2011, at 19:57 , Radosław Smogura wrote:
> This is review of patch
> https://commitfest.postgresql.org/action/patch_view?id=565
> "Bugfix for XPATH() if expression returns a scalar value"
>
> SELECT XMLELEMENT(name root, XMLATTRIBUTES(foo.namespace AS sth)) FROM (SELECT
> (XPATH('namespace-uri(/*)', x))[1] AS namespace FROM (VALUES (XMLELEMENT(name
> "root", XMLATTRIBUTES('<n' AS xmlns, '<v' AS value),'<t'))) v(x)) as foo;
>
> xmlelement
> -------------------------
> <root sth="&lt;n"/>
>
> It's clearly visible that value from attribute is "<n", not "<". Every
> parser will read this as "<n" which is not-natural and will require form
> consumer/developer to de-escape this on his side - roughly speaking this will
> be reported as serious bug.
There's a problem there, no doubt, but your proposed solution just moves the
problem around.
Here's your query, reformatted to be more legible
SELECT
XMLELEMENT(name root,
XMLATTRIBUTES(foo.namespace AS sth)
)
FROM (
SELECT
(XPATH('namespace-uri(/*)', x))[1] AS namespace
FROM (VALUES (XMLELEMENT(name "root",
XMLATTRIBUTES('<n' AS xmlns,
'<v' AS value),'<t')
)
) v(x)) as foo;
What happens is that the XPATH expression selects the xmlns
attribute with the value '<n'. To be well-formed xml, that value
must be escaped, so what is actually returned by the XPATH
call is '<n'. The XMLATTRIBUTES() function, however, doesn't
distinguish between input of type XML and input of type TEXT,
so it figures it has to represent the *textual* value '<n'
in xml, and produces '&lt;n'.
Not escaping textual values returned by XPATH isn't a solution,
though. For example, assume someone inserts the value returned
by the XPATH() call in your query into a column of type XML.
If XPATH() returned '<n' literally instead of escaping it,
the column would contain non-well-formed XML in a column of type
XML. Not pretty, and pretty fatal during your next dump/reload
cycle.
Or, if you want another example, simply remove the XMLATTRIBUTES
call and use the value returned by XPATH as a child node directly.
SELECT
XMLELEMENT(name root,
foo.namespace
)
FROM (
SELECT
(XPATH('namespace-uri(/*)', x))[1] AS namespace
FROM (VALUES (XMLELEMENT(name "root",
XMLATTRIBUTES('<n' AS xmlns,
'<v' AS value),'<t')
)
) v(x)) as foo;
xmlelement
--------------------
<root><n</root>
Note that this correct! The <root> node contains a text node
representing the textual value '<n'. If XPATH() hadn't return
the value escaped, the query above would have produces
<root><n</root>
which is obviously wrong. It works in this case because XMLELEMENT
is smart enough to distinguish between child nodes gives as TEXT
values (those are escaped) and child nodes provided as XML values
(those are inserted unchanged).
XMLATTRIBUTES, however, doesn't make that distinction, and simply
escaped all attributes values independent of their type. Now, the
reason it does that is probably that *not* all XML values are
well-formed attribute values without additional escaping. For example,
you cannot have literal '<' and '>' in attribute values. So if
XMLATTRIBUTES, like XMLELEMENT never escaped values which are
already of type XML, the following piece of code
XMLELEMENT(name "a", XMLATTRIBUTES('<node/>'::xml as v))
would produce
<a v="<node/>"/>
which, alas, is not well-formed :-(
The correct solution, I believe, is to teach XMLATTRIBUTES to
only escape those characters which are invalid in attribute
values but valid in XML (Probably only '<' and '>' but I'll
check). If there are no objects, I'll prepare a separate patch
for that. I don't want to include it in this patch because it's
really a distinct issue (even modifies a different function).
best regards,
Florian Pflug
From | Date | Subject | |
---|---|---|---|
Next Message | Florian Pflug | 2011-06-29 20:48:40 | Re: Patch Review: Bugfix for XPATH() if text or attribute nodes are selected |
Previous Message | Kevin Grittner | 2011-06-29 20:22:01 | serializable installcheck |