Re: XMLDocument (SQL/XML X030)

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>
Cc: Chapman Flack <jcflack(at)acm(dot)org>, Robert Treat <rob(at)xzilla(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: XMLDocument (SQL/XML X030)
Date: 2025-01-23 06:50:23
Message-ID: CAFj8pRCyJLXezdYOAZFvPnFNoR5Z3OOkp_0AzCLxa12R+XE3rw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 23, 2025 at 0:55, Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>
wrote:

> Hi Chapman & Robert
>
> Many thanks for the input
>
> On 22.01.25 22:35, Chapman Flack wrote:
> > On 01/22/25 13:41, Robert Treat wrote:
> >> So even if we are following the spec (which I think technically we may
> >> not be),
> > There are definite ways in which we're not following the SQL/XML spec,
> > which we document in an appendix[1]. The one that matters here is that
> > we just have a single XML type instead of the hierarchy of them in the
> > spec, and ours corresponds to what the spec calls XML(CONTENT(ANY)).
> >
> > With that divergence from the spec understood, I don't see any new
> > divergence in providing an XMLDOCUMENT function that just returns
> > its argument. That's the correct result to return for anything that's
> > a valid value of our XML type to begin with.
> >
> >> if no other database implements it the way we are
> > There may be other systems that don't implement it at all, for which
> > case I don't see any compatibility issue created because we have it
> > and they do not.
>
>
> Yeah, as long as we stick to the specification, I don’t see any issue
> with including it.
>
>
> >
> > There may be systems that implement the SQL/XML type hierarchy more
> > completely than we do, so that it would be possible for their
> > XMLDOCUMENT to be called with an XML(SEQUENCE) argument, or with
> > RETURNING SEQUENCE, both of which are things that can't happen
> > in PostgreSQL. I don't see a problem in that either, as long as
> > theirs produces results matching ours in the RETURNING CONTENT,
> > passed an XML(CONTENT...) argument case.
> >
> > If another system produces results that differ, in that restricted
> > domain corresponding to ours, I'd say something's nonconformant in
> > that implementation. In my opinion, that would only be a problem
> > for us if the system in question is an 800 lb gorilla and has many
> > users relying on the differing behavior.
> >
> > Regarding the patch itself: I wonder if there is already, somewhere
> > in the code base, a generic fmgr identity function that the pg_proc
> > entry could point to. Or is there a strong project tradition in favor
> > of writing a dedicated new one to use here? I'm not sure it's critical
> > to have a version that tests USE_LIBXML or reports it's unsupported,
> > because without support I doubt there's any way to pass it a non-null
> > XML argument, and if declared STRICT it won't be called for a null one
> > anyway.
> >
> > Regards,
> > -Chap
> >
> > [1] https://www.postgresql.org/docs/17/xml-limits-conformance.html
>
>
> Regarding compatibility, here is an example of how both implementations
> handle single-rooted XML, non-single-rooted XML, and NULL values.
>
>
> == DB2 [1] ==
>
> WITH t(x) AS (
> VALUES
> (xmlparse(DOCUMENT '<root><foo>bar</foo></root>')),
> (xmlforest(42 AS foo, 73 AS bar)),
> (NULL)
> )
> SELECT xmldocument(x) FROM t;
>
> ----------------------------
> <root><foo>bar</foo></root>
> <FOO>42</FOO><BAR>73</BAR>
>

In this form (just forwarding the input to the output), I have no
objection.

The benefits are a) potentially zero work when migrating from DB2, and
b) nobody has to redo the work of writing a correct implementation of
the XMLDOCUMENT function.

One open question may be whether to implement it as a classic scalar
function or via XmlExpr.

At the moment I prefer XmlExpr, for consistency reasons.
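
For illustration only, the "classic scalar function" variant could be
sketched at the SQL level roughly as follows. The name
xmldocument_sketch is hypothetical, and this is not the patch's
implementation (which would be a C function behind a pg_proc entry, or
an XmlExpr node); it only shows the "return the argument unchanged"
behavior discussed above, assuming a build with XML support.

```sql
-- Hypothetical stand-in for XMLDOCUMENT as a plain scalar function.
-- STRICT means it is not called for a NULL argument; NULL is returned.
CREATE FUNCTION xmldocument_sketch(xml) RETURNS xml
LANGUAGE sql IMMUTABLE STRICT
AS 'SELECT $1';

-- Same input as in the quoted PostgreSQL example.
WITH t(x) AS (
  VALUES
    (xmlparse(DOCUMENT '<root><foo>bar</foo></root>')),
    (xmlforest(42 AS foo, 73 AS bar)),
    (NULL)
)
SELECT xmldocument_sketch(x) FROM t;
```

Each non-null row should come back unchanged and the NULL row should
stay NULL, which is the behavior the quoted examples show for
xmldocument.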

Regards

Pavel

> -
>
> 3 record(s) selected.
>
>
>
> == PostgreSQL ==
>
> WITH t(x) AS (
> VALUES
> (xmlparse(DOCUMENT '<root><foo>bar</foo></root>')),
> (xmlforest(42 AS foo, 73 AS bar)),
> (NULL)
> )
> SELECT xmldocument(x) FROM t;
>
> xmldocument
> -----------------------------
> <root><foo>bar</foo></root>
> <foo>42</foo><bar>73</bar>
>
> (3 rows)
>
>
> To make it clear: ensuring this function is compatible with other
> database products is IMHO beneficial (and is my primary motivation), but
> it shouldn't come at the expense of violating the SQL/XML specification.
> I mean, if in some edge case, another database system implemented
> XMLDocument in a way that deviates from the standard, I'd argue it’s not
> worth prioritizing compatibility -- assuming, of course, that we are
> fully following the standard.
>
>
> Best regards, Jim
>
>
> 1 - https://dbfiddle.uk/G9VoHKp7
>
>
>
>
