From: | Tomas Vondra <tv(at)fuzzy(dot)cz> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: jsonb and nested hstore |
Date: | 2014-03-15 01:44:59 |
Message-ID: | 5323B09B.5020308@fuzzy.cz |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 15.3.2014 02:03, Greg Stark wrote:
> On Fri, Mar 14, 2014 at 9:21 PM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
>> I'm not awfully familiar with the GIN code, but based on Alexander's
>> feedback I presume fixing the GIN length limit (or rather removing it,
>> as it's a feature, not a bug) is quite straightforward. Why not to at
>> least consider that for 9.4, unless it turns more complex than expected?
>>
>> Don't get me wrong - I'm aware it's quite late in the last commitfest,
>> and if it's deemed unacceptable / endandering 9.4 release, I'm not going
>> to say a word. But if it's a simple patch ...
>
> Well I think the bigger picture is that the cases were we're getting
> this error it's because we're expecting too much from the GIN
> opclass. It's trying to index entire json objects as individual
> values which isn't really very useful. We're unlikely to go querying
> for rows where the value of a given key is a specific json object.
Stupid question - so if I have a json like this:
{ "a" : { "b" : "c"}}
the GIN code indexes {"b" : "c"} as a single value? And then takes "c"
and indexes it as a single value too?
Because otherwise I don't understand how the index could be used for
queries with @> '{"a" : {"b" : "c"}}' conditions (i.e. path "[a,b]" with
value "c").
Hmmmm, if that's how it works, removing the size limit would be
certainly more difficult than I thought.
> As I understand it Peter's right that in its current form the GIN
> opclass is only useful if you use it on an expression index on
> specific pieces of your json which are traditional non-nested hash
> tables. Or I suppose if you're really only concerned with the ?
> operator which looks for keys, which is pretty common too.
Well, depends on how you define useful. With the sample dataset
'delicious' (see Peter's post) I can do this:
SELECT doc FROM delicious
WHERE doc @> '{"title_detail" : {"value" : "TheaterMania"}}';
with arbitrary paths, and I may create a GIN index to support such
queries. And yes, it's much faster than GiST for example (by a factor of
1000).
Yes, the GIN index is quite large (~560MB for a ~1.2GB table).
> I had in mind that the GIN opclass would do something clever like
> decompose the json into all the path->value tuples so I could do
> arbitrary path lookups for values. That might be possible in the
> future but it's not what we have today and what we have today is
> already better than hstore. I think we're better off committing this
> and moving forward with the contrib hstore2 wrapper which uses this
> infrastructure so people have a migration path.
Yes, it's better than hstore - no doubt about that. The hierarchy and
data types are great, and hstore has the same size limitation.
> I don't think Josh is right to say it'll be "fixed" in 9.5. It'll be
> "better" in 9.5 because we have ambitious plans to continue
> improving in this direction. But it'll be even better in 9.6 and
> better again in 9.7. It'll never be "fixed".
I don't dare to say what will be in 9.5 (not even thinking about the
following versions).
Assuming the GIN will remain for 9.4 as it is now (both opclasses), it
would be nice if we could improve this in 9.5. I can live with custom
opclasses in an extension, if there are some ...
regards
Tomas
From | Date | Subject | |
---|---|---|---|
Next Message | Josh Berkus | 2014-03-15 04:17:35 | Re: jsonb and nested hstore |
Previous Message | Peter Geoghegan | 2014-03-15 01:15:42 | Re: jsonb and nested hstore |