From: | Andres Freund <andres(at)2ndquadrant(dot)com> |
---|---|
To: | Peter Geoghegan <pg(at)heroku(dot)com> |
Cc: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Wanted: jsonb on-disk representation documentation |
Date: | 2014-05-07 18:13:44 |
Message-ID: | 20140507181344.GK13397@awork2.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2014-05-07 10:55:28 -0700, Peter Geoghegan wrote:
> On Wed, May 7, 2014 at 4:40 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > * The jentry representation should be changed so it's possible to get the type
> > of a entry without checking individual types. That'll make code like
> > findJsonbValueFromSuperHeader() (well, whatever you've renamed it to)
> > much easier to read. Preferrably so it an have the same values (after
> > shifting/masking) ask the JBE variants. And it needs space for futher
> > types (or representations thereof).
>
> I don't know what you mean. Every place that macros like
> JBE_ISNUMERIC() are used subsequently involves access to (say) numeric
> union fields. At best, you could have those functions check that the
> types match, and then handle each case in a switch that only looked at
> (say) the "candidate", but that doesn't really save much at all. It
> wouldn't take much to have the macros give enum constant values back
> as you suggest, though.
Compare
for (i = 0; i < count; i++)
{
JEntry *e = array + i;
if (JBE_ISNULL(*e) && key->type == jbvNull)
{
result->type = jbvNull;
result->estSize = sizeof(JEntry);
}
else if (JBE_ISSTRING(*e) && key->type == jbvString)
{
result->type = jbvString;
result->val.string.val = data + JBE_OFF(*e);
result->val.string.len = JBE_LEN(*e);
result->estSize = sizeof(JEntry) + result->val.string.len;
}
else if (JBE_ISNUMERIC(*e) && key->type == jbvNumeric)
{
result->type = jbvNumeric;
result->val.numeric = (Numeric) (data + INTALIGN(JBE_OFF(*e)));
result->estSize = 2 * sizeof(JEntry) +
VARSIZE_ANY(result->val.numeric);
}
else if (JBE_ISBOOL(*e) && key->type == jbvBool)
{
result->type = jbvBool;
result->val.boolean = JBE_ISBOOL_TRUE(*e) != 0;
result->estSize = sizeof(JEntry);
}
else
continue;
if (compareJsonbScalarValue(key, result) == 0)
return result;
}
with
for (i = 0; i < count; i++)
{
JEntry *e = array + i;
if (!JBE_TYPE_IS_SCALAR(*e))
continue;
if (JBE_TYPE(*e) != key->type)
continue;
result = getJsonbValue(e);
if (compareJsonbScalarValue(key, result) == 0)
return result;
}
Yes, it's not a fair comparison because I made up getJsonbValue(). But
it's *much* more readable regardless. And there's more places that could
use it. Like the second half of findJsonbValueFromSuperHeader(). FWIW,
that's one of the requests I definitely made before.
> > * I wonder if the hash/object pair representation is wise and if it
> > shouldn't be keys combined with offsets first, and then the
> > values. That will make access much faster. And that's what jsonb
> > essentially is about.
>
> I like that the physical layout reflects the layout of the original
> JSON document.
Don't see much value in that. This is a binary representation and it'd
be bijective.
> Besides, I don't think this is obviously the case. Are
> you sure that it won't be more useful to make children as close to
> their parents as possible? Particularly given idiomatic usage.
Because - if done right - it would allow elementwise access without
scanning previous values.
> > * I think both arrays and hashes should grow individual structs. With
> > space for additional flags.
> Why?
Because a) it will make the code more readable b) it'll allow for
adding different representations of hashes/arrays.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Janes | 2014-05-07 18:20:34 | Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers |
Previous Message | Peter Geoghegan | 2014-05-07 18:13:14 | Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers |