Re: WIP Patch: Add a function that returns binary JSONB as a bytea

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kevin Van <kevinvan(at)shift(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP Patch: Add a function that returns binary JSONB as a bytea
Date: 2018-10-31 15:18:46
Message-ID: 20181031151846.GW4184@tamriel.snowman.net
Lists: pgsql-hackers

Greetings,

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Stephen Frost <sfrost(at)snowman(dot)net> writes:
> > * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> >> I dunno, I do not think it's a great idea to expose jsonb's internal
> >> format to the world. We intentionally did not do that when the type
> >> was first defined --- that's why its binary I/O format isn't already
> >> like this --- and I don't see that the tradeoffs have changed since then.
>
> > I disagree- it's awfully expensive to go back and forth between string
> > and a proper representation.
>
> Has anyone put any effort into making jsonb_out() faster? I think that
> that would be way more productive. Nobody is going to want to write
> code to convert jsonb's internal form into whatever their application
> uses; particularly not dealing with numeric fields.

I'm all for making jsonb_out() faster, but even a faster jsonb_out()
isn't going to be faster than shoveling the jsonb across.

The concern over the application question seems like a complete red
herring to me. People will bake into the various libraries the ability
to convert from our jsonb format to the language's preferred json data
structure. Even if that doesn't happen, we clearly have someone here
who is saying they'd write code to convert from our jsonb format to
whatever their application or language uses, and we've heard that
multiple times on this list.

> In any case, the approach proposed in this patch seems like the worst
> of all possible worlds: it's inconsistent and we get zero benefit from
> having thrown away our information-hiding. If we're going to expose the
> internal format, let's just change the definition of the type's binary
> I/O format, thereby getting a win for purposes like COPY BINARY as well.

I hadn't looked at the patch, so I'm not surprised it has issues. I
recall there was some prior discussion about what a good approach to
implementing this would be, and that changing the binary I/O format was
the way to go, but whoever wants to implement this or review the patch
should certainly review those threads first.

> We'd need to ensure that jsonb_recv could tell whether it was seeing the
> old or new format, but at worst that'd require prepending a header of
> some sort. (In practice, I suspect we'd end up with a wire-format
> definition that isn't exactly the bits-on-disk, but something easily
> convertible to/from that and more easily verifiable by jsonb_recv.
> Numeric subfields, for instance, ought to match the numeric wire
> format, which IIRC isn't exactly the bits-on-disk either.)

Agreed, that'd certainly be a good idea.
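
To make the header idea concrete, here is a rough sketch of a receive
function that dispatches on a leading version byte. It is illustrative
Python, not backend code and not anything from the patch. IIRC the
existing binary format already leads with a version byte (1) ahead of
the text form; version 2, its length-prefixed layout, and all function
names below are assumptions invented for this example.

```python
import json
import struct

# Version 1 mirrors (as far as I recall) today's "version byte + text" layout.
# Version 2 is hypothetical: a stand-in for some future binary wire format.
TEXT_VERSION = 1
BINARY_VERSION = 2


def send_text(value):
    """Encode as a version byte followed by the JSON text."""
    return struct.pack("!B", TEXT_VERSION) + json.dumps(value).encode("utf-8")


def send_binary(value):
    """Encode as version byte + 4-byte length + payload (made-up layout)."""
    payload = json.dumps(value, separators=(",", ":")).encode("utf-8")
    return struct.pack("!BI", BINARY_VERSION, len(payload)) + payload


def recv(buf):
    """Decode either format, choosing the parser from the header byte."""
    if not buf:
        raise ValueError("empty message")
    version = buf[0]
    if version == TEXT_VERSION:
        return json.loads(buf[1:].decode("utf-8"))
    if version == BINARY_VERSION:
        (length,) = struct.unpack_from("!I", buf, 1)
        payload = buf[5:]
        # The explicit length makes the message cheap to verify on receipt.
        if len(payload) != length:
            raise ValueError("length header does not match payload")
        return json.loads(payload.decode("utf-8"))
    raise ValueError(f"unsupported jsonb version number {version}")
```

An old client that only sends version 1 keeps working unchanged, and
anything with an unknown header is rejected up front rather than
misparsed.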

Thanks!

Stephen
