From: | Jim Jones <jim(dot)jones(at)uni-muenster(dot)de> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Peter Smith <smithpb2250(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andrey Borodin <amborodin86(at)gmail(dot)com> |
Subject: | Re: [PATCH] Add pretty-printed XML output option |
Date: | 2023-03-13 12:08:11 |
Message-ID: | efe0b19b-d41c-f8ab-f3b8-afb0108f3706@uni-muenster.de |
Lists: | pgsql-hackers |
On 10.03.23 15:32, Tom Lane wrote:
> Jim Jones<jim(dot)jones(at)uni-muenster(dot)de> writes:
>> On 09.03.23 21:21, Tom Lane wrote:
>>> I've looked through this now, and have some minor complaints and a major
>>> one. The major one is that it doesn't work for XML that doesn't satisfy
>>> IS DOCUMENT. For example,
>> How do you suggest the output should look like?
> I'd say a series of node trees, each starting on a separate line.
The attached v22 enables the use of INDENT with XML that is not singly rooted.
postgres=# SELECT xmlserialize (CONTENT '<bar><val x="y">42</val></bar><foo>73</foo>' AS text INDENT);
     xmlserialize
-----------------------
 <bar>                +
   <val x="y">42</val>+
 </bar>               +
 <foo>73</foo>
(1 row)
I tried several libxml2 dump functions, and none of them could cope very
well with an XML string without a root node. So I added the parsed nodes
to a temporary root node, so that I could iterate over its children and
add them one by one (formatted) to the output buffer.
I slightly modified the existing xml_parse() function to return the list
of nodes parsed by xmlParseBalancedChunkMemory through a new
parsed_nodes argument:
    xml_parse(text *data, XmlOptionType xmloption_arg, bool preserve_whitespace,
              int encoding, Node *escontext, xmlNodePtr *parsed_nodes)
    res_code = xmlParseBalancedChunkMemory(doc, NULL, NULL, 0,
                                           utf8string + count, parsed_nodes);
>> I was mistakenly calling xml_parse with GetDatabaseEncoding(). It now
>> uses the encoding of the given doc and UTF8 if not provided.
> Mmmm .... doing this differently from what we do elsewhere does not
> sound like the right path forward. The input *is* (or had better be)
> in the database encoding.
I changed that behavior. It now uses GetDatabaseEncoding().
Thanks!
Best, Jim
Attachment | Content-Type | Size |
---|---|---|
v22-0001-Add-pretty-printed-XML-output-option.patch | text/x-patch | 35.3 KB |