From: | Chapman Flack <chap(at)anastigmatix(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, mostafa_bit0108(at)hotmail(dot)com |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org, Markus Winand <markus(dot)winand(at)winand(dot)at>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
Subject: | Re: BUG #16046: xpath returns CDATA tag along with the value in postgres 12 |
Date: | 2019-10-24 23:14:48 |
Message-ID: | 5DB23068.3080601@anastigmatix.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
On 10/24/19 17:38, Tom Lane wrote:
>> Query: SELECT unnest(xpath('//cname/aname/text()','<cname><aname><![CDATA[select 5]]></aname></cname>'::xml))
>>
>> Output - pg11: select 5
>>
>> Output - pg12: <![CDATA[select 5]]>
>
> ... What's not entirely clear to me is whether it's an intentional
> effect, or a bug. Authors, any comments?
Hmm. I would say the pg12 behavior is "not wrong". But it's unexpected.
xpath's return type is xml (well, array of), so the result must have a
form that can escape any characters mistakable for markup. In this example,
there aren't any, but once tweaked so there are:
SELECT
unnest(xpath('//cname/aname/text()',
'<cname><aname><![CDATA[select 5 & 6 <yahoo!>]]></aname></cname>'::xml));
pg12: <![CDATA[select 5 & 6 <yahoo!>]]>
the necessity is clear.
The other valid option would be to return, not CDATA, but a regular
text node, which would look like straight text if there were no special
characters in it, and would otherwise have every such character individually
escaped. That's what I get from pg11:
SELECT
unnest(xpath('//cname/aname/text()',
'<cname><aname><![CDATA[select 5 & 6 <yahoo!>]]></aname></cname>'::xml));
pg11: select 5 & 6 <yahoo!>
So what pg11 is doing is also "not wrong" (in this respect, anyway).
And looks "more natural", in the case where the value has no characters
that need escaping.
Which may or may not be a good thing. Perhaps it could lead the unwary
in some cases to think such a query is giving a directly usable
text string back, which will be harmless until the one time a value
with escaping comes back. (The no-surprises way to get back a directly
usable text string, if that's what's wanted, would be with XMLTABLE
and an output column of text type.)
Oddly, what pg12 is doing seems to be influenced by the form of escaping
used in the input:
SELECT
unnest(xpath('//cname/aname/text()',
'<cname><aname><![CDATA[select 5 & 6 <yahoo!>]]></aname></cname>'::xml));
unnest
-----------------------------------
<![CDATA[select 5 & 6 <yahoo!>]]>
SELECT
unnest(xpath('//cname/aname/text()',
'<cname><aname>select 5 & 6 <yahoo!></aname></cname>'::xml));
unnest
---------------------------------
select 5 & 6 <yahoo!>
Either form of result is correct, and having it respect the form that was
used in the input might even be delightfully smart.
I haven't looked in the code just now to see if it is intentionally being
delightfully smart, or more simplistic-and-lucky.
Regards,
-Chap
From | Date | Subject | |
---|---|---|---|
Next Message | Chapman Flack | 2019-10-24 23:34:09 | Re: BUG #16046: xpath returns CDATA tag along with the value in postgres 12 |
Previous Message | Tom Lane | 2019-10-24 21:38:11 | Re: BUG #16046: xpath returns CDATA tag along with the value in postgres 12 |