From: | Scott Bailey <artacus(at)comcast(dot)net> |
---|---|
To: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
Cc: | hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Proposal: XML helper functions |
Date: | 2010-01-05 19:07:28 |
Message-ID: | 4B438DF0.8060108@comcast.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Merlin Moncure wrote:
> On Tue, Jan 5, 2010 at 1:14 PM, Scott Bailey <artacus(at)comcast(dot)net> wrote:
>> One of the problem with shredding XML is that it is very kludgy to get a
>> scalar value back from xpath. The xpath function always returns an array of
>> XML. So for example, to extract a numeric value you need to:
>> 1) use xpath to get the node
>> 2) get the first element of the XML array
>> 3) cast that to varchar
>> 4) cast that to numeric
>
> I just happen to be dealing with XML right now as well and my initial
> thought is that your suggestion doesn't buy you a whole lot: the root
> problem IMO is not dealing with what xpath gives you but that there is
> no DOMish representation of the xml document for you to query. You
> have to continually feed the entire document to xpath which is
> absolutely not scalable (if it works the way I think it does --
> haven't looked at the code).
No typically you'll only be passing the xml for a single "row" so what
we end up doing in Postgres typically looks something like this:
SELECT xmlvalue('/row/@id', bitesizexml)::int AS id,
xmlvalue('/row/@lat', bitesizexml)::numeric AS lat,
xmlvalue('/row/@lon', bitesizexml)::numeric,
xmlvalue('/row/comment', bitesizexml) AS cmt
FROM (
SELECT unnest(xpath('/foo/row', mybigxmldoc)) AS bitesizexml
) sub
So only the one call has to work with the entire document. All the calls
to xmlvalue are passed a much smaller node to work with.
> xpath is great for simple things but it's too texty and you need a
> more robust API to handle documents for serious parsing on the
> backend. In the short term i'd advise doing work in another pl like
> perl.
This is basically the method used for Oracle too until they provided
XMLTable functionality. They had a function xmlsequence that basically
did the unnest(xpath()) part. Hopefully we'll get xmltable support soon.
Scott
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Davis | 2010-01-05 19:14:03 | true serializability and predicate locking |
Previous Message | Markus Wanner | 2010-01-05 19:06:37 | Re: Testing with concurrent sessions |