From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Kevin Grittner <kgrittn(at)ymail(dot)com> |
Cc: | Hannu Krosing <hannu(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: record identical operator |
Date: | 2013-09-18 15:06:50 |
Message-ID: | CA+TgmoYOqY61pONPMJK6DpRWS3Cnm0DEUx4j_i53DK_Ca7sHMw@mail.gmail.com |
Lists: | pgsql-hackers |

On Tue, Sep 17, 2013 at 8:23 AM, Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:
> To have clean semantics, I think the operator should mean that the
> stored format of the row is the same. Regarding the array null
> bitmap example, I think it would be truly weird if the operator
> said that the stored format was the same, but this happened:
>
> test=# select pg_column_size(ARRAY[1,2,3]);
>  pg_column_size
> ----------------
>              36
> (1 row)
>
> test=# select pg_column_size((ARRAY[1,2,3,NULL])::int4[3]);
>  pg_column_size
> ----------------
>              44
> (1 row)
>
> They have the same stored format, but a different number of
> bytes?!?

Hmm. For most of this thread, I was leaning toward the view that
comparing the binary representations was the wrong concept, and that
we actually needed to have type-specific operators that understand the
semantics of the data type.

But I think this example convinces me otherwise. What we really want
to do here is test whether two values are the same, and if you can
feed two values that are supposedly the same to some function and get
two different answers, well then they're not really the same, are
they?

Now, I grant that the array case is pretty weird. An array with an
all-zeroes null bitmap is basically semantically identical to one with
no null bitmap at all. But there are other such cases as well. You
can have two floats that print the same way except when
extra_float_digits=3, for example, and I think that's probably a
difference that we *wouldn't* want to paper over. You can have a
long-form numeric that represents a value that could have been
represented as a short-form numeric, which is similar to the array
case. There are probably other examples as well. But in each of
those cases, the point is that there *is* some operation which will
distinguish between the two supposedly-identical values, and therefore
they are not identical for all purposes. Therefore, I see no harm in
having an operator that tests for
are-these-values-identical-for-all-purposes. If that's useful for
REFRESH MATERIALIZED VIEW CONCURRENTLY (RMVC), then there may be other
legitimate uses for it as well.
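
As a rough illustration of the float point above, here is a small sketch in Python rather than PostgreSQL. It shows two doubles that print the same at a limited precision yet differ at full precision, plus a pair (0.0 and -0.0) that the ordinary equality operator treats as equal even though some operation can tell them apart. The helper `same_image` is hypothetical, and the 15-digit format only stands in for a default display precision; neither is taken from the thread or from PostgreSQL's implementation.

```python
import math
import struct


def same_image(a: float, b: float) -> bool:
    """Hypothetical 'identical' test: compare the 8-byte IEEE 754 images."""
    return struct.pack("<d", a) == struct.pack("<d", b)


x = 0.1 + 0.2
y = 0.3

# At 15 significant digits both values print the same way.
print(f"{x:.15g}", f"{y:.15g}")

# At 17 significant digits the difference becomes visible.
print(f"{x:.17g}", f"{y:.17g}")

# They are neither equal nor byte-identical.
print(x == y, same_image(x, y))

# 0.0 and -0.0 are equal under ==, but their stored images differ,
# and copysign() is an operation that distinguishes them.
print(0.0 == -0.0, same_image(0.0, -0.0))
print(math.copysign(1.0, 0.0), math.copysign(1.0, -0.0))
```

The byte comparison is the stricter of the two tests: whenever it reports two values as the same, no function of the stored value can tell them apart.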
And once we decide that's OK, I think we ought to document it. Sure,
it's a little confusing, but we can explain it, I think. It's a good
opportunity to point out to people that, most of the time, they really
want something else, like the equality operator for the default btree
opclass.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company