From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Anton <antonin(dot)houska(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: Native XML |
Date: | 2011-03-01 13:43:58 |
Message-ID: | 4D6CF81E.8020100@dunslane.net |
Lists: | pgsql-hackers |
On 03/01/2011 08:16 AM, Robert Haas wrote:
> On Mon, Feb 28, 2011 at 6:54 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>> There seems to be an almost universal assumption that storing XML in its
>> native form (i.e. a text stream) is going to produce inefficient results.
>> Maybe it will, but I think it needs to be fairly convincingly demonstrated.
>> And then we would have to consider the costs. For example, unless we
>> implemented our own XPath processor to work with our own XML format (do we
>> really want to do that?), to evaluate an XPath expression for a piece of XML
>> we'd actually need to produce the text format from our internal format
>> before passing it to some external library to parse into its internal format
>> and then process the XPath expression. That means we'd actually be making
>> things worse, not better. But this is clearly the sort of processing people
>> want to do - see today's discussion upthread about xpath_table.
> Well, obviously the only point of having our own internal format is if
> we have our own xpath processor, etc., to match. One would think that
> this would be a lot faster than parsing the string with libxml2 every
> time we want to xpath it, especially for large documents. But then
> again, I haven't seen any benchmarks.
That would be a huge body of code we'd need to maintain, complex and
full of subtleties which, if we weren't deeply invested in the XML
standards, would bite us, I have no doubt.
Now, if someone wanted to start a project that added efficient
serialization/de-serialization of libxml2 (or other library) objects so
we could avoid constant parsing overhead, that would make lots more
sense to me.
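To make the two code paths concrete, here is a minimal sketch in Python. It is an illustration only, not libxml2 and not anything in PostgreSQL: it assumes the standard library's xml.etree.ElementTree (which supports only a limited XPath subset) as the parser and pickle as a stand-in for a serialized parsed form. The function names and the sample document are hypothetical.

```python
import pickle
import xml.etree.ElementTree as ET

# Hypothetical sample document, stored as a plain text stream.
DOC = "<library><book><title>A</title></book><book><title>B</title></book></library>"

def titles_from_text(xml_text):
    """Path 1: parse the text form on every call, then evaluate the path."""
    root = ET.fromstring(xml_text)
    return [t.text for t in root.findall("./book/title")]

def store_parsed(xml_text):
    """Parse once and serialize the parsed tree (a stand-in stored form)."""
    return pickle.dumps(ET.fromstring(xml_text))

def titles_from_stored(blob):
    """Path 2: deserialize the parsed tree; no XML parsing happens here."""
    root = pickle.loads(blob)
    return [t.text for t in root.findall("./book/title")]

stored = store_parsed(DOC)
```

Both paths return the same titles; the only difference is whether the XML text is parsed on each query. Whether deserializing is actually cheaper than reparsing is exactly the kind of thing that would need benchmarks, since deserialization has its own cost.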
cheers
andrew