Re: XMLDocument (SQL/XML X030)

From: Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>
To: Chapman Flack <jcflack(at)acm(dot)org>, Robert Treat <rob(at)xzilla(dot)net>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: XMLDocument (SQL/XML X030)
Date: 2025-01-22 23:55:39
Message-ID: 1c7b828b-a65a-4188-9b4c-2290f6c6f5f0@uni-muenster.de
Lists: pgsql-hackers

Hi Chapman & Robert,

Many thanks for the input.

On 22.01.25 22:35, Chapman Flack wrote:
> On 01/22/25 13:41, Robert Treat wrote:
>> So even if we are following the spec (which I think technically we may
>> not be),
> There are definite ways in which we're not following the SQL/XML spec,
> which we document in an appendix[1]. The one that matters here is that
> we just have a single XML type instead of the hierarchy of them in the
> spec, and ours corresponds to what the spec calls XML(CONTENT(ANY)).
>
> With that divergence from the spec understood, I don't see any new
> divergence in providing an XMLDOCUMENT function that just returns
> its argument. That's the correct result to return for anything that's
> a valid value of our XML type to begin with.
>
>> if no other database implements it the way we are
> There may be other systems that don't implement it at all, for which
> case I don't see any compatibility issue created because we have it
> and they do not.

Yeah, as long as we stick to the specification, I don’t see any issue
with including it.

>
> There may be systems that implement the SQL/XML type hierarchy more
> completely than we do, so that it would be possible for their
> XMLDOCUMENT to be called with an XML(SEQUENCE) argument, or with
> RETURNING SEQUENCE, both of which are things that can't happen
> in PostgreSQL. I don't see a problem in that either, as long as
> theirs produces results matching ours in the RETURNING CONTENT,
> passed an XML(CONTENT...) argument case.
>
> If another system produces results that differ, in that restricted
> domain corresponding to ours, I'd say something's nonconformant in
> that implementation. In my opinion, that would only be a problem
> for us if the system in question is an 800 lb gorilla and has many
> users relying on the differing behavior.
>
> Regarding the patch itself: I wonder if there is already, somewhere
> in the code base, a generic fmgr identity function that the pg_proc
> entry could point to. Or is there a strong project tradition in favor
> of writing a dedicated new one to use here? I'm not sure it's critical
> to have a version that tests USE_LIBXML or reports it's unsupported,
> because without support I doubt there's any way to pass it a non-null
> XML argument, and if declared STRICT it won't be called for a null one
> anyway.
>
> Regards,
> -Chap
>
> [1] https://www.postgresql.org/docs/17/xml-limits-conformance.html

Regarding compatibility, here is an example of how DB2 and PostgreSQL
handle single-rooted XML, non-single-rooted XML, and NULL values.

== DB2 [1] ==

WITH t(x) AS (
  VALUES
    (xmlparse(DOCUMENT '<root><foo>bar</foo></root>')),
    (xmlforest(42 AS foo, 73 AS bar)),
    (NULL)
)
SELECT xmldocument(x) FROM t;

----------------------------
<root><foo>bar</foo></root>
<FOO>42</FOO><BAR>73</BAR>
-

 3 record(s) selected.

== PostgreSQL ==
 
WITH t(x) AS (
  VALUES
    (xmlparse(DOCUMENT '<root><foo>bar</foo></root>')),
    (xmlforest(42 AS foo, 73 AS bar)),
    (NULL)
)
SELECT xmldocument(x) FROM t;

         xmldocument         
-----------------------------
 <root><foo>bar</foo></root>
 <foo>42</foo><bar>73</bar>
 
(3 rows)
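
For reference, PostgreSQL's existing IS DOCUMENT predicate shows which of
these inputs are single-rooted. This is only a minimal sketch on the same
test data, not part of the patch:

```sql
-- Same three inputs as above; IS DOCUMENT is an existing predicate
-- (requires a build with libxml support).
WITH t(x) AS (
  VALUES
    (xmlparse(DOCUMENT '<root><foo>bar</foo></root>')),
    (xmlforest(42 AS foo, 73 AS bar)),
    (NULL)
)
SELECT x, x IS DOCUMENT AS is_document FROM t;
```

Per the documented behavior of IS DOCUMENT, this should report true for the
first row, false for the xmlforest result (a content fragment), and NULL
for the NULL input.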

To be clear: compatibility with other database products is IMHO
beneficial (and is my primary motivation), but it shouldn't come at the
expense of violating the SQL/XML specification. If another database
system implemented XMLDocument in a way that deviates from the standard
in some edge case, I'd argue that matching it is not worth prioritizing
-- assuming, of course, that we are fully following the standard
ourselves.

Best regards, Jim

[1] https://dbfiddle.uk/G9VoHKp7
