Re: Native XML

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Anton <antonin(dot)houska(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Native XML
Date: 2011-02-28 15:30:18
Message-ID: 7155.1298907018@sss.pgh.pa.us
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> On 02/28/2011 04:25 AM, Anton wrote:
>> A question is of course, if potential new implementation must
>> necessarily replace the existing one, immediately or at all. What I
>> published is implemented as a new data type and thus pg_type.h and
>> pg_proc.h are the only files where something needs to be merged. From
>> technical point of view, the new type can co-exist with the existing easily.
>>
>> This however implies a question if such co-existence (whether temporary
>> or permanent) would be acceptable for users, i.e. if it wouldn't bring
>> some/significant confusion. That's something I'm not able to answer.

> The only reason we need the XML stuff in core at all and not in a
> separate module is because of the odd syntax requirements of SQL/XML.
> But those operators work on the xml type, and not on any new type you
> might invent.

Well, in principle we could allow them to work on both, just the same
way that (for instance) "+" is a standardized operator but works on more
than one datatype. But I agree that the prospect of two parallel types
with essentially duplicate functionality isn't pleasing at all.

I think a reasonable path forwards for this work would be to develop and
extend the non-libxml-based type as an extension, outside of core, with
the idea that it might replace the core implementation if it ever gets
complete enough. The main extra requirement this would imply, one you
might not bother with otherwise, is the ability to deal with existing
plain-text-style stored values. This doesn't seem terribly hard to do
IMO --- one easy way would be to insert an initial zero byte in all
new-style values as a flag to distinguish them from old-style. The
forced parsing that would occur to deal with an old-style value would be
akin to detoasting and could be hidden in the same access macros.
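
[Editorial illustration, not part of the original message.] The
flag-byte idea above can be sketched in C roughly as follows. This is a
minimal sketch under stated assumptions, not PostgreSQL code: every name
here (XML_NEW_STYLE_FLAG, xml_is_new_style, xml_parse_old_style,
xml_get_new_style) is hypothetical, values are treated as plain byte
buffers rather than varlena datums, and the "parser" is a stand-in that
only copies the text behind the flag byte. It relies on the observation
that a textual XML value never begins with a zero byte.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical marker: textual XML never starts with a zero byte, so a
 * leading 0x00 can flag a value stored in the new format. */
#define XML_NEW_STYLE_FLAG 0x00

/* Returns true if the stored bytes carry the new-style flag byte. */
static bool
xml_is_new_style(const unsigned char *data, size_t len)
{
    return len > 0 && data[0] == XML_NEW_STYLE_FLAG;
}

/* Stand-in for the real parser (assumption): a real implementation would
 * build the parsed representation; this one just copies the text behind
 * the flag byte. Returns a malloc'd buffer, or NULL on allocation failure. */
static unsigned char *
xml_parse_old_style(const unsigned char *text, size_t len, size_t *out_len)
{
    unsigned char *result = malloc(len + 1);

    if (result == NULL)
        return NULL;
    result[0] = XML_NEW_STYLE_FLAG;
    memcpy(result + 1, text, len);
    *out_len = len + 1;
    return result;
}

/* Access function playing the role of the "access macro": callers always
 * get a new-style value back, and old-style input is converted on the fly.
 * The caller owns (and must free) the returned buffer. */
static unsigned char *
xml_get_new_style(const unsigned char *data, size_t len, size_t *out_len)
{
    assert(data != NULL);

    if (xml_is_new_style(data, len))
    {
        unsigned char *copy = malloc(len);

        if (copy == NULL)
            return NULL;
        memcpy(copy, data, len);
        *out_len = len;
        return copy;
    }
    /* Old-style plain text: force a parse, much as detoasting is forced. */
    return xml_parse_old_style(data, len, out_len);
}
```

Code that consumes values would call only xml_get_new_style(), so the
old-versus-new distinction stays hidden in one place; the cost is a
forced conversion each time an old-style value is read.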

> We really can't just consider XSLT, and more importantly XPath, as
> separate topics. Any alternative XML implementation that doesn't include
> XPath is going to be unacceptably incomplete, IMNSHO.

Agreed. The single most pressing problem we've got with XML right now
is the poor state of the XPath extensions in contrib/xml2. If we don't
see a meaningful step forward in that area, a new implementation of the
xml datatype isn't likely to win acceptance.

regards, tom lane
