Re: Automatically parsing in-line composite types

From: Fabio Ugo Venchiarutti <f(dot)venchiarutti(at)ocado(dot)com>
To: Mitar <mmitar(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Dave Cramer <pg(at)fastcrypt(dot)com>, "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Automatically parsing in-line composite types
Date: 2019-10-30 16:51:13
Message-ID: 25e4fa14-28c9-c2b3-404f-2574d111cdeb@ocado.com
Lists: pgsql-general

On 30/10/2019 16:15, Mitar wrote:
> Hi!
>
> On Wed, Oct 30, 2019 at 8:37 AM Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>> Check out libpqtypes: https://github.com/pgagarinov/libpqtypes
>
> Interesting. I have looked at the code a bit but I do not find how it
> determines the type for inline compound types, like the ones that
> appear in my original SQL query example. Could you maybe point me to
> the piece of code there handling that? Because to my
> understanding/exploration that information is simply not exposed to
> the client in any way. :-(
>
>> it does exactly what you want. It's a wrapper for libpq that provides
>> client side parsing for the binary protocol with array and composite
>> type parsing.
>
> It looks to me that it does parsing of composite types only if they
> are registered composite types, but not, for example, the ones you get
> if you project a subset of fields from a table in a subquery. That has
> no registered composite type?
>
> Also, how are you handling discovery of registered types? Do you read
> that on-demand from the database? They are not provided over the wire?
>
>> Virtually any
>> non-C client application really ought to be using json rather than the
> custom binary structures libpqtypes would provide.
>
> I thought that initially, too, but then found out that JSON has some
> heavy limitations because the implementation in PostgreSQL is
> standards-based. There is also no hook to do custom encoding of
> non-JSON values, so binary blobs are converted in an ugly way (base64
> would be better). You also lose a lot of meta-information, because
> everything non-JSON gets converted to strings automatically, like
> knowing what is a date. I think MongoDB with BSON made much more sense
> here. It looks like a perfect balance between the simplicity of JSON
> structure and adding a few more useful data types.
>
> But yes, JSON is great also because clients often have optimized JSON
> readers, which can beat any other binary serialization format. In
> node.js, it is simply the fastest way there is to transfer data:
>
> https://mitar.tnode.com/post/in-nodejs-always-query-in-json-from-postgresql/
>
>
> Mitar
>

Then perhaps, as opposed to wedging this into the tabular paradigm, a
transition to more targeted support for hierarchical result
representation would be preferable, done directly by the backend and
rendered by libpq... (perhaps still encapsulated as a DataRow field so as
not to break the traditional model, or perhaps a special
RowDescription-like message in the backend protocol? Not my place to
strongly push proposals there).
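
For context, a made-up minimal example of the gap as it stands: the
backend describes the column below to the client merely as an anonymous
"record" in RowDescription, so no member names or types come over the
wire and the client is left to parse the row's text representation
itself.

    SELECT ROW(1, now()) AS inline_rec;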

There's a lot of room for optimisation if done natively (think label
deduplication at the source. Not sure if BSON works this way too).

There's also the problem of independent implementations of the
protocol...AFAIK the JDBC client is not a wrapper to libpq and they'd
also have to break their result surfacing paradigms to make it work...

Sounds like an enormous risk & undertaking for the hackers TBH, and I
currently see another limiting factor to the idea's popularity: as it
stands, advanced SQL is daunting for much of the industry, and IMHO the
queries needed to generate arbitrarily structured & lightweight inline
types/relations are relatively verbose and deeply nested (eg: last time
I checked, stripping/renaming some attributes from a relation required
subselecting them - see the sketch below - and, perhaps due to PEBCAK, I
can't think of a way to create results as associative arrays indexed by
attributes).
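
To illustrate the verbosity point (made-up table and columns, just a
sketch): the only way I know to strip & rename a couple of attributes
into an inline composite is the nested subselect dance below.

    SELECT o.id,
           (SELECT c
            FROM (SELECT o.customer_name  AS name,
                         o.customer_email AS email) c) AS customer
    FROM orders o;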

For this to gain traction, more streamlined syntax/functions/operators
for this kind of precision work may be necessary; otherwise the result
would only satisfy a narrow set of users who are already intimate with
the state of affairs.

Can't help thinking that the current JSON-over-field pinhole may already
be at the sweet spot between usefulness and interoperability with
existing systems. Just the SQL side of it could be less noisy and, yes,
the data type pigeonhole problem could benefit from something like a GUC
setting (or function arguments, or the like) to electively break
standard JSON compatibility.
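
E.g. (untested sketch, names made up) something like

    SELECT to_jsonb(t) AS result
    FROM (SELECT e.id, e.created_at, e.payload  -- payload is a bytea column
          FROM events e) t;

already hands the client one nested document per row, but created_at
comes back as a plain JSON string and the bytea payload as its "\x..."
hex text, which is exactly the data type loss discussed above - hence
the appeal of an opt-in switch for richer, non-standard encodings.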

--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology
