From: | Josh Berkus <josh(at)agliodbs(dot)com> |
---|---|
To: | Peter Geoghegan <peter(dot)geoghegan86(at)gmail(dot)com> |
Cc: | "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: working around JSONB's lack of stats? |
Date: | 2015-01-30 20:26:42 |
Message-ID: | 54CBE902.6000309@agliodbs.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On 01/28/2015 03:50 PM, Peter Geoghegan wrote:
> On Wed, Jan 28, 2015 at 3:42 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>> jsonb_col @> '[ "key1" ]'
>> or jsonb_col ? 'key1'
>> if in MCE, assign % from MCE
>> otherwise assign 1% of non-MCE %
>>
>> jsonb_col @> '{ "key1": "value1" }'
>> if in MCE, assign MCE% * 0.1
>> otherwise assign 0.01 of non-MCE %
>>
>> Does that make sense?
>
> I suspect it makes a lot less sense. The way people seem to want to
> use jsonb is as a document store with a bit of flexibility. Individual
> JSON documents tend to be fairly homogeneous in structure within a
> table, just like with systems like MongoDB. Strings within arrays are
> keys for our purposes, and these are often used for tags and so on.
> But Strings that are the key of an object/pair are much less useful to
> index, in my estimation.
Yeah, I see your point; except for arrays, people are usually searching
for a key:value pair, and the existence of the key is not in doubt.
That would make the "element" the key:value pair, no? But
realistically, we would only want to do that for simple keys and values.
Although: if you "flatten" a nested JSON structure into just keys with
scalar values (and array items as their own thing), then you could have
a series of expanded key:value pairs to put into MCE.
For example:
{ house : { city : San Francisco,
sqft: 1200,
color: blue,
occupants: [ mom, dad, child1 ]
}
occupation: programmer
}
... would get flattened out into the following pairs:
city: san francisco
sqft: 1200
color: blue
occupants: [ mom ]
occupants: [ dad ]
occupants: [ child1 ]
occupation: programmer
This would probably work because there aren't a lot of data structures
where people would have the same key:value pair in different locations
in the JSON, and care about it stats-wise. Alternatetly, if the same
key-value pair appears multiple times in the same sample row, we could
cut the MC% by that multiple.
No?
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
From | Date | Subject | |
---|---|---|---|
Next Message | Slava Mudry | 2015-01-30 23:44:00 | why pg_class.relfrozenxid needs to be updated for frozen tables (where all rows have xmin=2)? |
Previous Message | Tomas Vondra | 2015-01-28 23:55:25 | Re: working around JSONB's lack of stats? |