From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> |
Cc: | "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Anton" <antonin(dot)houska(at)gmail(dot)com>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Native XML |
Date: | 2011-03-01 19:24:16 |
Message-ID: | 24401.1299007456@sss.pgh.pa.us |
Lists: | pgsql-hackers |

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> I apparently didn't express myself very well, since you seem to have
> *completely* missed my point. I know we can do tsearch2 searches
> against XML, or JSON, or YAML, or (insert next week's new favorite
> format here). What we can't currently do efficiently is search for
> particular values in some particular place in the hierarchy of a
> document. I've had loads of fun approximating it with regular
> expressions, but some days I'd like life to be easier.

Check.

> What I was arguing for is a new type which would represent the
> structure in a fashion which was independent of the particular text
> format and was efficient to traverse hierarchically. Done right,
> that would map well to GiST. Although, thinking about that some
> more, perhaps there would be a way to create a GiST index suitable
> for that straight from the XML text, and avoid the sharded column.
> A GiST index actually seems pretty close to what such a structure
> would look like anyway....

FWIW, GIN might be a more natural match, at least for the cases where
"place in the document" has a scalar value. If you need to search for
"place" with something other than equality or prefix match semantics,
maybe not.
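
[Illustrative note, not part of the original message.] The GIN suggestion can be pictured as an inverted index whose keys are (place, value) pairs. The sketch below is a toy model under that assumption, not PostgreSQL code: `flatten`, `PathIndex`, the slash-separated path encoding, and the sample documents are all hypothetical names chosen for the example. It handles only the two cases named above, equality and prefix match on "place".

```python
from bisect import bisect_left
from collections import defaultdict


def flatten(node, path=()):
    """Yield (path, value) keys for every scalar leaf of a parsed document."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from flatten(child, path + (key,))
    elif isinstance(node, list):
        for child in node:
            # Assumption: position within a list is not part of the "place".
            yield from flatten(child, path)
    else:
        # Values are stored as strings so that keys sort consistently.
        yield "/".join(path), str(node)


class PathIndex:
    """Toy inverted index: (path, value) key -> set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, doc):
        for key in flatten(doc):
            self.postings[key].add(doc_id)

    def equal(self, path, value):
        """Documents holding exactly `value` at exactly `path`."""
        return set(self.postings.get((path, str(value)), ()))

    def prefix(self, path_prefix):
        """Documents holding any value at a path starting with `path_prefix`."""
        keys = sorted(self.postings)  # re-sorted per call; fine for a sketch
        start = bisect_left(keys, (path_prefix, ""))
        hits = set()
        for path, value in keys[start:]:
            if not path.startswith(path_prefix):
                break  # matching keys are contiguous in sorted order
            hits |= self.postings[(path, value)]
        return hits


index = PathIndex()
index.add(1, {"order": {"customer": {"name": "Ann"},
                        "items": [{"sku": "A1"}, {"sku": "B2"}]}})
index.add(2, {"order": {"customer": {"name": "Bob"},
                        "items": [{"sku": "A1"}]}})
index.add(3, {"invoice": {"total": 10}})
```

Here `index.equal("order/items/sku", "A1")` looks up one key, and `index.prefix("order/customer")` scans a contiguous run of sorted keys. A search on "place" with other semantics, such as an arbitrary pattern in the middle of the path, has no such contiguous run, which is the "maybe not" case.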

But in any case I think your point is that this is an indexing problem,
and whether the full document in the table column is pre-parsed or not
isn't all that relevant for performance. I agree. tsearch2 is really a
precedent for your argument, not a distinct approach, because it doesn't
expect pre-parsed text columns either.

regards, tom lane