Re: [BUG?] XMLSERIALIZE( ... INDENT) won't work with blank nodes

From: Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Subject: Re: [BUG?] XMLSERIALIZE( ... INDENT) won't work with blank nodes
Date: 2024-09-06 22:46:06
Message-ID: 78805495-ebb1-4951-b8d9-0099d27b0c3e@uni-muenster.de
Lists: pgsql-hackers

Hi Tom

On 06.09.24 18:34, Tom Lane wrote:
> I think it'd be quite foolish to assume that every extant and future
> version of libxml2 will share this glitch. Probably should use
> logic more like pg_strip_crlf(), although we can't use that directly.
Makes sense. I introduced this logic at the end of
xmltotext_with_options(), for the case where it is called with INDENT
on a DOCUMENT-type xml string.

SELECT xmlserialize(DOCUMENT '<foo><bar>42</bar></foo>' AS text INDENT);
  xmlserialize   
-----------------
 <foo>          +
   <bar>42</bar>+
 </foo>
(1 row)

The regression tests were updated accordingly - see patch v2-0002.
> Would it ever be the case that trailing whitespace would be valid
> data? In a bit of testing, it seems like that could be true in
> CONTENT mode but not DOCUMENT mode.
Yes, in CONTENT mode it is valid data and will be preserved, since
CONTENT can be pretty much anything.

SELECT xmlserialize(CONTENT E'<foo><bar>42</bar></foo>\n\n\t\t\t' AS text INDENT);
       xmlserialize       
--------------------------
 <foo>                   +
   <bar>42</bar>         +
 </foo>                  +
                         +
                         
(1 row)

With DOCUMENT it is superfluous and should be removed after indentation.
IIRC there's an xmlSaveToBuffer option called XML_SAVE_WSNONSIG that can
be used to preserve it.
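
For illustration only, the pg_strip_crlf()-style trimming discussed above
could look roughly like the standalone sketch below. This is not the code
from the attached patches: trimmed_length() and serialized_length() are
hypothetical names, and the sketch assumes the serialized output is available
as a plain byte buffer with a known length. It checks for trailing newline
characters explicitly instead of assuming libxml2 always emits exactly one,
and it only trims when both DOCUMENT and INDENT apply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from the patch: return the length of buf
 * once trailing '\n' and '\r' bytes are ignored. The buffer itself is not
 * modified, so it does not need to be NUL-terminated.
 */
size_t
trimmed_length(const char *buf, size_t len)
{
	assert(buf != NULL || len == 0);

	while (len > 0 && (buf[len - 1] == '\n' || buf[len - 1] == '\r'))
		len--;

	return len;
}

/*
 * Hypothetical caller: only DOCUMENT output produced with INDENT is trimmed.
 * In CONTENT mode trailing whitespace may be data, so it is kept as is.
 */
size_t
serialized_length(const char *buf, size_t len, bool is_document, bool indent)
{
	if (is_document && indent)
		return trimmed_length(buf, len);

	return len;
}
```

Like pg_strip_crlf(), the loop removes any number of trailing CR/LF bytes
and leaves other whitespace (tabs, spaces) alone.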

Thanks

Best, Jim

Attachment                                                        Content-Type  Size
v2-0002-Bug-fix-remove-default-trailing-newline-from-XMLS.patch   text/x-patch  5.8 KB
v2-0001-Bug-fix-for-XMLSERIALIZE-.INDENT-for-xml-containi.patch   text/x-patch  5.4 KB
