Re: [PATCH] Add CANONICAL option to xmlserialize

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>
Cc: Chapman Flack <chap(at)anastigmatix(dot)net>, vignesh C <vignesh21(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Vik Fearing <vik(at)postgresfriends(dot)org>
Subject: Re: [PATCH] Add CANONICAL option to xmlserialize
Date: 2024-08-30 04:46:12
Message-ID: CAFj8pRCXU67xb3w0OESaHgqWDVcbfKNy=w_6ZVWGucZbfhxe_Q@mail.gmail.com
Lists: pgsql-hackers

On Thu, 29 Aug 2024 at 23:54, Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>
wrote:

>
>
> On 29.08.24 20:50, Pavel Stehule wrote:
> >
> > I know, but theoretically there could be some benefit for CANONICAL if
> > pg supported bytea there. A lot of databases still use non-UTF8
> > encodings.
> >
> > It is more of a theoretical question - if pg supports different types
> > there in the future (because of SQL/XML or Oracle), then CANONICAL can
> > be used without limit,
> I like the idea of extending the feature to support bytea. I can
> definitely take a look at it, but perhaps in another patch? This change
> would most likely involve transformXmlSerialize in parse_expr.c, and I'm
> not sure of the impact on other uses of XMLSERIALIZE.
> > or can CANONICAL be used just for text? And are you sure that you can
> > compare text X text, instead of xml X xml?
> Yes, currently it only supports varchar or text - and their cousins. The
> idea is to format the xml and serialize it as text in such a way that two
> documents can be compared based on their content, independently of how
> they were written, e.g. '<foo a="1" b="2"/>' is equal to
> '<foo b="2" a="1"/>'.
>
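
To make the comparison described above concrete, here is a minimal sketch
outside PostgreSQL. It assumes Python 3.8+ and uses the standard library's
xml.etree.ElementTree.canonicalize (C14N 2.0), not the patch's own code;
canonical_equal is a hypothetical helper name. Note that Python's default
is to drop comments, the opposite of the default proposed in the patch, so
the flag is always passed explicitly here.

```python
from xml.etree.ElementTree import canonicalize


def canonical_equal(left: str, right: str, with_comments: bool = True) -> bool:
    """Compare two XML strings by their canonical (C14N 2.0) serialization.

    Hypothetical helper for illustration only; it is not part of the patch.
    """
    return (canonicalize(left, with_comments=with_comments)
            == canonicalize(right, with_comments=with_comments))


# Same content, different attribute order: equal once canonicalized.
same = canonical_equal('<foo a="1" b="2"/>', '<foo b="2" a="1"/>')

# Two documents that differ only by a comment.
plain = '<foo a="1"/>'
commented = '<foo a="1"><!-- note --></foo>'

# Keeping comments makes the documents differ; dropping them does not.
equal_keeping_comments = canonical_equal(plain, commented, with_comments=True)
equal_dropping_comments = canonical_equal(plain, commented, with_comments=False)

print(same, equal_keeping_comments, equal_dropping_comments)
```

The second pair is the kind of "really different documents" case where
the WITH COMMENTS / WITH NO COMMENTS choice changes the result of the
comparison.
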
> >
> > +SELECT xmlserialize(CONTENT doc AS text CANONICAL) =
> > xmlserialize(CONTENT doc AS text CANONICAL WITH COMMENTS) FROM
> > xmltest_serialize;
> > + ?column?
> > +----------
> > + t
> > + t
> > +(2 rows)
> >
> > Maybe I am a little confused by these regression tests, because in
> > the end this one is not very useful - you compare two identical XML
> > documents, and WITH COMMENTS and WITH NO COMMENTS are tested elsewhere.
> > I tried to find the purpose of this test. It would be better to use
> > genuinely different documents (columns) instead.
>
> Yeah, I can see that it's confusing. In this example I actually just
> wanted to test that the default option of CANONICAL is CANONICAL WITH
> COMMENTS, even if you don't mention it. In the docs I mentioned it like
> this:
>
> "The optional parameters WITH COMMENTS (which is the default) or WITH NO
> COMMENTS, respectively, keep or remove XML comments from the given
> document."
>
> Perhaps I should rephrase it? Or maybe a comment in the regression tests
> would suffice?
>

A comment will be enough.

>
> Thanks a lot for the input!
>
> --
> Jim
>
>
