From: | Peter Geoghegan <pg(at)heroku(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Wanted: jsonb on-disk representation documentation |
Date: | 2014-05-07 19:50:24 |
Message-ID: | CAM3SWZTkeN1KXD+63uoN7J0do-WNDOU0grEWkPqhUUL=AToi0w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, May 7, 2014 at 12:27 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> The jsonb_ops storage format for values is inherently lossy, because it
> cannot distinguish the string values "n", "t", "f" from JSON null or
> boolean true, false respectively; nor does it know the difference between
> a number and a string containing digits. This appears to not quite be a
> bug because the consistent functions force recheck for all queries that
> care about values (as opposed to keys). But if it's documented anywhere
> I don't see where.
The fact that we *don't* set the reset flag for
JsonbExistsStrategyNumber, and why that's okay is prominently
documented. So I'd say that it is.
> And in any case, is it a good idea? We could fairly
> easily change things so that these cases are guaranteed distinguishable.
> We're using an entire byte to convey one bit of information (key or
> value); I'm inclined to redefine the flag byte so that it tells not just
> that but which JSON datatype is involved.
It seemed simpler to do it that way. As I've said before, jsonb_ops is
mostly concerned with hstore-style indexing. It could also be
particularly useful for expressional indexes on "tags" arrays of
strings, which is a common use-case.
jsonb_hash_ops on the other hand is for testing containment, which is
useful for querying heavily nested documents, where typically there is
a very low selectivity for keys. It's not the default largely because
I was concerned about not supporting all indexable operators by
default, and because I suspected that it would be preferred to have a
default closer to hstore.
Anyway, doing things that way for values won't obviate the need to set
the reset flag, unless you come up with a much more sophisticated
scheme. Existence (of keys) is only tested in respect of the
top-level. Containment (where values are tested) is more complicated.
--
Peter Geoghegan
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2014-05-07 19:52:50 | Re: Wanted: jsonb on-disk representation documentation |
Previous Message | Robert Haas | 2014-05-07 19:35:56 | Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers |