From: | Andres Freund <andres(at)2ndquadrant(dot)com> |
---|---|
To: | Noah Misch <noah(at)leadboat(dot)com> |
Cc: | Kevin Grittner <kgrittn(at)ymail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: record identical operator |
Date: | 2013-09-16 14:28:23 |
Message-ID: | 20130916142823.GD5249@awork2.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2013-09-15 19:49:26 -0400, Noah Misch wrote:
> On Sat, Sep 14, 2013 at 08:58:32PM +0200, Andres Freund wrote:
> > On 2013-09-14 11:25:52 -0700, Kevin Grittner wrote:
> > > Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > > > But both arrays don't have the same binary representation since
> > > > the former has a null bitmap, the latter not. So, if you had a
> > > > composite type like (int4[]) and would compare that without
> > > > invoking operators you'd return something false in some cases
> > > > because of the null bitmaps.
> > >
> > > Not for the = operator. The new "identical" operator would find
> > > them to not be identical, though.
> >
> > Yep. And I think that's a problem if exposed to SQL. People won't
> > understand the hazards and end up using it because its faster or
> > somesuch.
>
> The important question is whether to document the new operator and/or provide
> it under a guessable name. If we give the operator a weird name, don't
> document it, and put an "internal use only" comment in the catalogs, that is
> essentially as good as hiding this feature at the SQL level.
Doesn't match my experience.
> Type-specific identity operators seem like overkill, anyway. If we find that
> meaningless variations in a particular data type are causing too many false
> non-matches for the generic identity operator, the answer is to make the
> functions generating datums of that type settle on a canonical form. That
> would be the solution for your example involving array null bitmaps.
I think that's pretty much unrealistic. I am pretty sure that if either
of us starts looking we will find at about a dozen of such cases and
miss the other dozen. Not to speak about external code which is damn
likely to contain such cases.
And I think that efficiency will often make such normalization expensive
(consider postgis where Datums afaik can exist with an internal bounding
box or without).
I think it's far more realistic to implement an identity operator that
will fall back to a type specific operator iff equals has "strange"
properties.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2013-09-16 14:38:46 | Re: Support for REINDEX CONCURRENTLY |
Previous Message | Merlin Moncure | 2013-09-16 14:18:04 | Re: Proposal: json_populate_record and nested json objects |