Re: Automatically parsing in-line composite types

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Mitar <mmitar(at)gmail(dot)com>
Cc: f(dot)venchiarutti(at)ocado(dot)com, Dave Cramer <pg(at)fastcrypt(dot)com>, "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Automatically parsing in-line composite types
Date: 2019-10-30 22:07:32
Message-ID: CAHyXU0y0aN6ML40i1preHYVL4xYKdNSnQ7GoEv9FTPykUUwZVw@mail.gmail.com
Lists: pgsql-general

On Wed, Oct 30, 2019 at 11:15 AM Mitar <mmitar(at)gmail(dot)com> wrote:
>
> Hi!
>
> On Wed, Oct 30, 2019 at 8:37 AM Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> > Check out libpqtypes: https://github.com/pgagarinov/libpqtypes
>
> Interesting. I have looked at the code a bit but I do not find how it
> determines the type for inline compound types, like the ones they
> appear in my original SQL query example. Could you maybe point me to
> the piece of code there handling that? Because to my
> understanding/exploration that information is simply not exposed to
> the client in any way. :-(

It looks it up from the database. See implementation in
https://github.com/pgagarinov/libpqtypes/blob/master/source/src/handler.c
(look for macro LOOKUP_TYPES) and usage in
https://github.com/pgagarinov/libpqtypes/blob/master/source/src/regression-test.c.

> > it does exactly what you want. It's a wrapper for libpq that provides
> > client side parsing for the binary protocol with array and composite
> > type parsing.
>
> It looks to me that it does parsing of composite types only if they
> are registered composite types. But not for example ones you get if
> you project a subset of fields from a table in a subquery. That has no
> registered composite type?

Correct. Only declared (via CREATE TYPE) composite types will work, due
to protocol limitations. For custom C programming this is fine, but it
is limiting if you are interested in writing a driver that can handle
any type of object the database can throw at you (and don't forget
custom types at the C level!).
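To make the limitation concrete: for an anonymous row type the client only ever sees the composite's text serialization, and all it can do is split it into string fields; per-column typing needs catalog metadata the wire protocol does not carry. A minimal sketch of such a splitter (illustrative only, not libpqtypes code; backslash escapes inside quoted fields are not handled):

```python
def parse_composite(text):
    """Split a composite literal like '(1,"a b",)' into string fields.

    Returns a list where NULL fields (empty and unquoted) are None.
    Values come back as strings; converting them to real types still
    requires knowing each column's type, which is exactly what the
    protocol omits for anonymous row types.
    """
    assert text.startswith("(") and text.endswith(")")
    body = text[1:-1]
    fields, buf, in_quotes, i = [], [], False, 0
    while i < len(body):
        c = body[i]
        if in_quotes:
            if c == '"':
                # A doubled quote inside a quoted field is an escaped quote.
                if i + 1 < len(body) and body[i + 1] == '"':
                    buf.append('"')
                    i += 1
                else:
                    in_quotes = False
            else:
                buf.append(c)
        elif c == '"':
            in_quotes = True
            buf.append("")  # marks the field as quoted, so '""' is '' not NULL
        elif c == ",":
            fields.append("".join(buf) if buf else None)
            buf = []
        else:
            buf.append(c)
        i += 1
    fields.append("".join(buf) if buf else None)
    return fields
```

So `SELECT ROW(1, 'a b', NULL)` comes back as `(1,"a b",)` and parses to `['1', 'a b', None]` — strings and NULLs, nothing more.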

> Also, how you are handling discovery of registered types, do you read
> that on-demand from the database? They are not provided over the wire?

The client application has to declare at connection time which types
it is interested in; they are then looked up at the SQL level. This
would be fairly typical of C applications, I think, but mostly it
works around the limitations of the binary protocol.
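That declare-then-lookup flow can be sketched roughly as follows. This is a hedged illustration, not the libpqtypes API: the catalog here is a stand-in dict, where a real client would resolve names with a query along the lines of `SELECT oid, typname FROM pg_type WHERE typname = ANY($1)` and cache the results.

```python
# Stand-in for the pg_type catalog; the names and OIDs are invented.
FAKE_CATALOG = {"complex": 70001, "inventory_item": 70002}

class TypeRegistry:
    """Connect-time type registration: name -> OID -> decode handler."""

    def __init__(self):
        self._handlers = {}  # oid -> decode function

    def register(self, typname, decoder, catalog=FAKE_CATALOG):
        # In a real client this lookup is a catalog query at connect time.
        oid = catalog.get(typname)
        if oid is None:
            raise KeyError(f"type {typname!r} is not declared in the database")
        self._handlers[oid] = decoder
        return oid

    def decode(self, oid, wire_value):
        # For unregistered OIDs, hand back the raw value -- which is all
        # a client can do for anonymous row types anyway.
        handler = self._handlers.get(oid)
        return handler(wire_value) if handler else wire_value
```

The point of the sketch is the shape of the workaround: everything hinges on the client knowing the type name up front, which is precisely what an inline composite from a subquery does not give you.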

> > Virtually any
> > non-C client application really ought to be using json rather than the
> > custom binary structures libpqtyps would provide.
>
> I thought that initially, too, but then found out that JSON has some
> heavy limitations because the implementation in PostgreSQL is standard
> based. There is also no hook to do custom encoding of non-JSON values.
> So binary blobs are converted in an ugly way (base64 would be better).
> You also lose a lot of meta-information, because everything non-JSON
> gets converted to strings automatically. Like knowing what is a date.
> I think MongoDB with BSON made much more sense here. It looks like
> perfect balance between simplicity of JSON structure and adding few
> more useful data types.
>
> But yes, JSON is great also because clients often have optimized JSON
> readers. Which can beat any other binary serialization format. In
> node.js, it is simply the fastest there is to transfer data:

Sure, JSON has only very, very basic type support. In a practical
sense this means type safety has to be built above the JSON layer
(e.g. {"field": "foo", "type": "shape", "data": "(1,1),(2,2)"}),
exactly as we do with the textual SQL protocol.
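A minimal sketch of that "type safety above JSON" pattern: each value travels as a tagged object and the client dispatches on the tag. The tag names here are made up for illustration; base64 for bytea is the encoding suggested earlier in the thread, not what PostgreSQL's json type does by default.

```python
import base64
import datetime
import json

# Client-side dispatch table: tag -> decoder. Tags are a convention
# agreed between client and server, not anything PostgreSQL enforces.
DECODERS = {
    "date": datetime.date.fromisoformat,
    "bytea": base64.b64decode,
    "text": lambda s: s,
}

def decode_tagged(payload):
    """Decode one {"type": ..., "data": ...} envelope into a typed value."""
    obj = json.loads(payload)
    return DECODERS[obj["type"]](obj["data"])
```

This recovers dates as dates and blobs as bytes, at the cost of inventing and maintaining your own tagging convention on both ends, which is the trade-off being described above.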

The postgres hacker community tends to target major standards as a
matter of culture and prudence. This used to annoy me; now I support
it strongly. Since the type system is extensible, it's theoretically
possible to implement BSON support or some other funky type-safe
format. I would personally argue (perhaps with good company) against
such a type being incorporated in core, or even in contrib. In fact, I
argued (win some, lose some, heh!) that jsonb should be in contrib,
not core; we should be moving stuff OUT of the regular namespace and
into extensions, not the other way around.

Aside: now that stored procedures are good to go, there is one feature
left that IMNSHO postgres desperately needs, and that is external
package repository management (on the order of CPAN, npm, etc.) so
that 3rd-party repositories handle grooming, precompiling, packaging,
downloading, and deploying (via SQL) extensions that interact with the
database at the C level. Enterprise environments and managed postgres
providers will never allow custom-compiled C extensions, which is the
kiss of death; even if I wanted to use those extensions, I can't. So
if you decided to scratch an itch and create a postgres BSON type, no
one would likely use it, since the chances of adoption in core are
slim to none.

I had seen your article, and liked it. During the development of what
was to become the jsonb type, I had argued quite strenuously not to
have it completely displace the old json type variant on performance
and other grounds.

merlin
