From: | Markus Winand <markus(dot)winand(at)winand(dot)at> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | XML/XPath issues: text/CDATA in XMLTABLE, XPath evaluated with wrong context |
Date: | 2018-03-27 11:52:26 |
Message-ID: | 0684A598-002C-42A2-AE12-F024A324EAE4@winand.at |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi!
I’ve found two issues with XML/XPath integration in PostgreSQL. Two patches are attached. I’ve just separated them because the second one is an incompatible change.
* Patch 0001: Accept TEXT and CDATA nodes in XMLTABLE’s column_expression.
> Column_expressions that match TEXT or CDATA nodes must return
> the contents of the node itself, not the content of the
> non-existing childs (i.e. the empty string).
The following query returns a wrong result in master:
SELECT *
FROM (VALUES ('<xml>text</xml>'::xml)
, ('<xml><![CDATA[<cdata>]]></xml>'::xml)
) d(x)
CROSS JOIN LATERAL
XMLTABLE('/xml'
PASSING x
COLUMNS "node()" TEXT PATH 'node()'
) t
x | node()
--------------------------------+--------
<xml>text</xml> |
<xml><![CDATA[<cdata>]]></xml> |
(the “node()” column is empty)
The correct result can be obtained with patch 0001 applied:
x | node()
--------------------------------+---------
<xml>text</xml> | text
<xml><![CDATA[<cdata>]]></xml> | <cdata>
The patch adopts existing tests to guard against future regressions.
* Patch 0002: Set correct context for XPath evaluation.
> According to the ISO standard, the context of XMLTABLE's XPath
> row_expression is the document node of the XML[1] - not the root node.
>
> The respective code is shared with non-ISO functions such as xpath
> and xpath_exists so that the change also affects these functions.
It’s an incompatible change - even the regression tests need to be adjusted because they
test for the “wrong” behaviour. The documentation, on the other hand, does neither
document the behaviour explicitly, no does it show any example that breaks due to this patch.
The alternatives to this patch are (1) documenting the current standard-breaking behaviour
and (2) fixing the context only for ISO functions. Personally, I don’t like either of them.
I’ve checked the libxml2 code to see what’s the proper way to set the context to the
document node. It’s indeed just “(xmlNodePtr)ctxt->doc”.
See: https://github.com/GNOME/libxml2/blob/7abec671473b837f99181442d59edd0cc2ee01d1/xpath.c#L14306
-markus
--
[1] SQL/XML:2011, 6.18 General Rule 1aii2.
Attachment | Content-Type | Size |
---|---|---|
0001-Accept-TEXT-and-CDATA-nodes-in-XMLTABLE-s-column_exp.patch | application/octet-stream | 9.0 KB |
0002-Set-correct-context-for-XPath-evaluation.patch | application/octet-stream | 4.3 KB |
unknown_filename | text/plain | 7 bytes |
From | Date | Subject | |
---|---|---|---|
Next Message | Masahiko Sawada | 2018-03-27 11:58:11 | pg_class.reltuples of brin indexes |
Previous Message | Eren Başak | 2018-03-27 11:50:35 | Re: [HACKERS] Optional message to user when terminating/cancelling backend |