From: Christian Ohler <ohler(at)shift(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Using JSONB directly from application
Date: 2018-06-22 21:52:36
Message-ID: CAOsiKEL7+KkV0C_hAJWxqwTg+PYVfiGPQ0yjFww7ECtqwBjb+Q@mail.gmail.com
Lists: pgsql-hackers
(continuing an old thread from
https://www.postgresql.org/message-id/CAMsr%2BYEtamQYZ5EocsuthQCvyvmRnQrucDP6GZynPtf0gsMbuw%40mail.gmail.com)
Craig Ringer <craig(at)2ndquadrant(dot)com>, 2018-02-26:
> On 26 February 2018 at 04:05, Anthony Communier
> <anthony(dot)communier(at)gmail(dot)com> wrote:
>
> > It would be nice if an application connected to a PostgreSQL database
> > could send and receive JSONB in binary. It could save some useless text
> > conversion. All the work could be done on the application side, which is
> > often more scalable than the database itself.
>
> To support this, you'd need to extract PostgreSQL's jsonb support into a C
> library that could be used independently of backend server infrastructure
> like 'palloc' and memory contexts, ereport(), etc. Or write a parallel
> implementation.
At Shift, we also have use cases that would likely be sped up quite a bit
if we could avoid the conversion from JSONB to JSON, and instead pass
binary JSONB to the application side and parse it there (in Go). I doubt
we'd want to reuse any of Postgres's C code, and would instead go with your
"parallel implementation" idea; I can't imagine it being particularly
difficult to implement a JSONB parser from scratch.
All we need, I think, is a Postgres function raw_jsonb(jsonb) that returns
bytea but is the identity function at the byte level. (Or allow a cast
from jsonb to bytea.)
Our Go code would then send queries like SELECT col1, col2, raw_jsonb(col3)
FROM table1 WHERE ...; I haven't thought in depth about how we'd parse the
JSONB in Go, but perhaps we can synthesize a stream of JSON tokens from the
binary JSONB (one token at a time, to avoid copies and allocations) and
adapt the streaming parser https://github.com/json-iterator/go to turn it
into Go values.
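To make the token-synthesis idea a bit more concrete, here is a minimal sketch in Go of decoding the root container header of a raw JSONB datum. It is a sketch under assumptions, not an existing API: the flag constants are copied from Postgres's src/include/utils/jsonb.h as of the current JSONB version, the server is assumed to be little-endian (JSONB is stored in native byte order), and the function name decodeRootHeader is made up for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Flag bits for the JsonbContainer header word, as defined in
// src/include/utils/jsonb.h (assumed stable for JSONB version 1;
// a real parser should pin these against the server version).
const (
	jbCMask   = 0x0FFFFFFF // low 28 bits: number of elements or pairs
	jbFScalar = 0x10000000 // pseudo-array wrapping a single scalar value
	jbFObject = 0x20000000
	jbFArray  = 0x40000000
)

// decodeRootHeader reads the 4-byte header of the root container of a
// raw JSONB datum. Little-endian native byte order is assumed here.
func decodeRootHeader(raw []byte) (count int, kind string, err error) {
	if len(raw) < 4 {
		return 0, "", fmt.Errorf("jsonb datum too short: %d bytes", len(raw))
	}
	h := binary.LittleEndian.Uint32(raw[:4])
	count = int(h & jbCMask)
	switch {
	case h&jbFObject != 0:
		kind = "object"
	case h&jbFArray != 0 && h&jbFScalar != 0:
		// Scalars are stored as a one-element pseudo-array.
		kind = "scalar"
	case h&jbFArray != 0:
		kind = "array"
	default:
		return 0, "", fmt.Errorf("unrecognized container header %#x", h)
	}
	return count, kind, nil
}

func main() {
	// Synthetic header: an object with 2 key/value pairs.
	var buf [4]byte
	binary.LittleEndian.PutUint32(buf[:], jbFObject|2)
	count, kind, err := decodeRootHeader(buf[:])
	if err != nil {
		panic(err)
	}
	fmt.Println(kind, count) // object 2
}
```

A full parser would continue past this header into the array of per-child JEntry words to locate keys and values, which is where the JSON-token stream for a library like json-iterator would come from.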
Sending raw JSONB to Postgres might also be interesting, but I'd start with
receiving.
Would implementing raw_jsonb be as trivial as it sounds? What about uses
like SELECT raw_jsonb(col3->'foo'); does the subobject returned by '->'
share structure with the containing object, making the conversion to a
self-contained JSONB value less direct?
Can these conversions be implemented without copying the bytes?
An open question about the API contract would be how raw_jsonb would be
affected if Postgres introduces a version 2 of JSONB encoding. My
intuition is to punt that problem to the application, and define that
raw_jsonb returns whatever version of JSONB is most convenient for Postgres
for that particular datum; this minimizes conversion work on the Postgres
side, which is the purpose of the mechanism. Applications that want a
stable format can use the conventional textual JSON format. But I could
see a case for making the function raw_jsonb(int, jsonb) and allowing the
caller to specify what (maximum?) version of JSONB they want.