Re: [PoC] XMLCast (SQL/XML X025)

From: Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Vik Fearing <vik(at)postgresfriends(dot)org>
Subject: Re: [PoC] XMLCast (SQL/XML X025)
Date: 2024-11-11 19:43:17
Message-ID: 800e2b29-7cf9-42e0-9d95-12b3007d2f7a@uni-muenster.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Robert
Thanks for taking a look at it.

On 11.11.24 19:15, Robert Haas wrote:
> Hmm, this patch has gotten no responses for 4 months. That's kind of
> unfortunate. Sadly, there's not a whole lot that I can do to better
> the situation, because I know very little either about XML-related
> standards or about how people make use of XML in practice. It's not
> that much code, so if it does a useful thing that we actually want, we
> can probably figure out how to verify that the code is correct, or fix
> it. But I don't know whether it's a useful thing that we actually
> want. Syntactically, XMLCAST() looks a lot like CAST(), so one might
> ask whether the things that it does can already be accomplished using
> CAST(); or whether, perhaps, we have some other existing method for
> performing such conversions.
It indeed has a huge overlap with CAST(), except for a few handy SQL <->
XML mappings, such as

SELECT xmlcast('foo &amp; &lt;&quot;bar&quot;&gt;'::xml AS text);

    xmlcast    
---------------
 foo & <"bar">
(1 row)

--

SELECT
  xmlcast('2024-05-29 12:04:10.703585+02'::timestamp without time zone
AS xml),
  xmlcast('2024-05-29T12:04:10.703585'::xml AS timestamp without time zone);
 
          xmlcast           |          xmlcast           
----------------------------+----------------------------
 2024-05-29T12:04:10.703585 | 2024-05-29 12:04:10.703585
(1 row)

--

SELECT
  xmlcast('P1Y2M3DT4H5M6S'::xml AS interval),
  xmlcast('1 year 2 mons 3 days 4 hours 5 minutes 6 seconds'::interval
AS xml);
 
            xmlcast            |    xmlcast     
-------------------------------+----------------
 1 year 2 mons 3 days 04:05:06 | P1Y2M3DT4H5M6S
(1 row)

--

SELECT CAST('42'::xml AS int);

ERROR:  cannot cast type xml to integer
LINE 1: SELECT CAST('42'::xml AS int);
               ^
--

SELECT XMLCAST('42'::xml AS int);
 xmlcast
---------
      42
(1 row)

> The only thing I found during a quick perusal of the documentation was
> XMLTABLE(), which seems a bit baroque if you just want to convert one
> value. Is this intended to plug that gap? Is there any other current
> way of doing it?
>
> Do we need to ensure some kind of consistency between XMLTABLE() and
> XMLCAST() in terms of how they behave?

I haven't considered any compatibility to XMLTABLE(), as it has a
different spec (X300-X305), but I can take a look at it! To implement
this function I just followed the SQL/XML spec "ISO/IEC IWD 9075-14" -
and from time to time I also took a look on how other databases
implemented it.[1]

> The documentation at
> https://www.postgresql.org/docs/current/xml-limits-conformance.html#FUNCTIONS-XML-LIMITS-CASTS
> says that "When PostgreSQL maps SQL data values to XML (as in
> xmlelement), or XML to SQL (as in the output columns of xmltable),
> except for a few cases treated specially, PostgreSQL simply assumes
> that the XML data type's XPath 1.0 string form will be valid as the
> text-input form of the SQL datatype, and conversely." Unfortunately,
> it does not specify what those cases treated specially are, and the
> commit that added that documentation text is not the one that added
> the underlying code, so I don't actually know where that code is, but
> one would expect this function to conform to that general rule.

I agree. It would be nice to know which cases those are.
However, invalid inputs should normally return an error, e.g.

SELECT xmlcast('foo&bar'::xml AS text);

ERROR:  invalid XML content
LINE 1: SELECT xmlcast('foo&bar'::xml AS text);
                       ^
DETAIL:  line 1: EntityRef: expecting ';'
foo&bar
       ^
--

SELECT xmlcast('foo'::xml AS date);
ERROR:  invalid input syntax for type date: "foo"

--

.. but perhaps the text means something else?

Thanks!

Best, Jim

1 - https://dbfiddle.uk/ZSpsyIal

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joe Conway 2024-11-11 19:52:17 Re: Update Unicode data to Unicode 16.0.0
Previous Message Daniel Gustafsson 2024-11-11 19:43:00 Re: Offsets of `struct Port` are no longer constant