Re: [PATCH] Add pretty-printed XML output option

From:	Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Peter Smith <smithpb2250(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andrey Borodin <amborodin86(at)gmail(dot)com>
Subject:	Re: [PATCH] Add pretty-printed XML output option
Date:	2023-03-13 12:08:11
Message-ID:	efe0b19b-d41c-f8ab-f3b8-afb0108f3706@uni-muenster.de
Lists:	pgsql-hackers

On 10.03.23 15:32, Tom Lane wrote:
> Jim Jones<jim(dot)jones(at)uni-muenster(dot)de> writes:
>> On 09.03.23 21:21, Tom Lane wrote:
>>> I've looked through this now, and have some minor complaints and a major
>>> one. The major one is that it doesn't work for XML that doesn't satisfy
>>> IS DOCUMENT. For example,
>> How do you suggest the output should look like?
> I'd say a series of node trees, each starting on a separate line.

The attached v22 enables INDENT for XML that is not singly rooted.

postgres=# SELECT xmlserialize (CONTENT '<bar><val x="y">42</val></bar><foo>73</foo>' AS text INDENT);
     xmlserialize
-----------------------
 <bar>                +
   <val x="y">42</val>+
 </bar>               +
 <foo>73</foo>
(1 row)

I tried several libxml2 dump functions, and none of them coped well
with an XML string that has no root node. So I added the parsed nodes to
a temporary root node, which lets me iterate over its children and append
them one by one (formatted) to the output buffer.
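
The temporary-root idea can be illustrated outside of libxml2. The sketch below is not the patch's code: it uses Python's standard xml.etree.ElementTree instead of libxml2, and the names indent_fragment and tmp_root are hypothetical, chosen only for this illustration. It assumes the fragment contains only elements at the top level.

```python
import xml.etree.ElementTree as ET


def indent_fragment(fragment: str) -> str:
    """Pretty-print XML content that may have several top-level elements."""
    # Wrap the fragment in a throwaway root so that a document parser
    # accepts it. "tmp_root" is a hypothetical name used only here.
    root = ET.fromstring(f"<tmp_root>{fragment}</tmp_root>")

    pieces = []
    for child in root:
        # The tail is text that followed this element inside the wrapper;
        # clear it so that only the element itself is serialized.
        child.tail = None
        ET.indent(child, space="  ")  # ET.indent requires Python 3.9+
        pieces.append(ET.tostring(child, encoding="unicode"))

    # Each top-level node tree starts on its own line.
    return "\n".join(pieces)


if __name__ == "__main__":
    print(indent_fragment('<bar><val x="y">42</val></bar><foo>73</foo>'))
```

Note that this simplified version drops any text that sits between the top-level elements and does not handle an XML declaration inside the fragment; a real implementation would have to deal with both.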

I slightly modified the existing xml_parse() function to return the list
of nodes parsed by xmlParseBalancedChunkMemory:

xml_parse(text *data, XmlOptionType xmloption_arg, bool preserve_whitespace,
          int encoding, Node *escontext, xmlNodePtr *parsed_nodes)

res_code = xmlParseBalancedChunkMemory(doc, NULL, NULL, 0,
                                       utf8string + count, parsed_nodes);

>> I was mistakenly calling xml_parse with GetDatabaseEncoding(). It now
>> uses the encoding of the given doc and UTF8 if not provided.
> Mmmm .... doing this differently from what we do elsewhere does not
> sound like the right path forward. The input *is* (or had better be)
> in the database encoding.
I changed that behavior. It now uses GetDatabaseEncoding().

Thanks!

Best, Jim

Attachment	Content-Type	Size
v22-0001-Add-pretty-printed-XML-output-option.patch	text/x-patch	35.3 KB

In response to

Re: [PATCH] Add pretty-printed XML output option at 2023-03-10 14:32:56 from Tom Lane

Responses

Re: [PATCH] Add pretty-printed XML output option at 2023-03-14 17:40:25 from Tom Lane
