| From: | Jim Jones <jim(dot)jones(at)uni-muenster(dot)de> | 
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
| Cc: | Peter Smith <smithpb2250(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andrey Borodin <amborodin86(at)gmail(dot)com> | 
| Subject: | Re: [PATCH] Add pretty-printed XML output option | 
| Date: | 2023-03-13 12:08:11 | 
| Message-ID: | efe0b19b-d41c-f8ab-f3b8-afb0108f3706@uni-muenster.de | 
| Lists: | pgsql-hackers | 
On 10.03.23 15:32, Tom Lane wrote:
> Jim Jones <jim(dot)jones(at)uni-muenster(dot)de> writes:
>> On 09.03.23 21:21, Tom Lane wrote:
>>> I've looked through this now, and have some minor complaints and a major
>>> one.  The major one is that it doesn't work for XML that doesn't satisfy
>>> IS DOCUMENT.  For example,
>> What do you suggest the output should look like?
> I'd say a series of node trees, each starting on a separate line.
The attached v22 enables the use of INDENT with XML that is not singly rooted.
postgres=# SELECT xmlserialize (CONTENT '<bar><val x="y">42</val></bar><foo>73</foo>' AS text INDENT);
      xmlserialize
-----------------------
  <bar>                +
    <val x="y">42</val>+
  </bar>               +
  <foo>73</foo>
(1 row)
I tried several libxml2 dump functions, and none of them coped well with
an XML string that has no root node. So I now attach the parsed nodes to a
temporary root node, iterate over its children, and append them one by one
(formatted) to the output buffer.
I slightly modified the existing xml_parse() function to return the list 
of nodes parsed by xmlParseBalancedChunkMemory:
xml_parse(text *data, XmlOptionType xmloption_arg, bool preserve_whitespace,
          int encoding, Node *escontext, xmlNodePtr *parsed_nodes)
res_code = xmlParseBalancedChunkMemory(doc, NULL, NULL, 0,
                                       utf8string + count, parsed_nodes);
>> I was mistakenly calling xml_parse with GetDatabaseEncoding(). It now
>> uses the encoding of the given doc and UTF8 if not provided.
> Mmmm .... doing this differently from what we do elsewhere does not
> sound like the right path forward.  The input *is* (or had better be)
> in the database encoding.
I changed that behavior; it now uses GetDatabaseEncoding().
Thanks!
Best, Jim
| Attachment | Content-Type | Size | 
|---|---|---|
| v22-0001-Add-pretty-printed-XML-output-option.patch | text/x-patch | 35.3 KB | 