From: | Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> |
---|---|
To: | Jonas Tehler <jonas(at)tehler(dot)se>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: Duplicate data despite unique constraint |
Date: | 2016-09-02 13:02:53 |
Message-ID: | 3481cafa-cf8c-4401-722f-38eed301f9bd@aklaver.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On 09/02/2016 04:32 AM, Jonas Tehler wrote:
>
> Hi,
>
> We’re running Postgresql 9.4.5 on Amazon RDS. One of our tables
> looks something like this:
>
> CREATE TABLE users
> (
> ...
> email character varying(128) NOT NULL,
> ...
> CONSTRAINT users_email_key UNIQUE (email)
> )
>
> Despite this we have rows with very similar email values. I discovered
> the problem when I tried to add a column and got the following error:
>
> ActiveRecord::RecordNotUnique: PG::UniqueViolation: ERROR: could not
> create unique index "users_email_key"
> DETAIL: Key (email)=(xxx(at)yyy(dot)com <mailto:xxx(at)yyy(dot)com>) is duplicated.
> : ALTER TABLE "users" ADD COLUMN "client_branch" character varying(255)
> DEFAULT ‘beta'
>
> Now look at this:
>
> => select email from users where email = 'xxx(at)yyy(dot)com <mailto:xxx(at)yyy(dot)com>';
> email
> ---------------------------
> xxx(at)yyy(dot)com <mailto:xxx(at)yyy(dot)com>
> (1 row)
>
> => select email from users where email LIKE 'xxx(at)yyy(dot)com
> <mailto:xxx(at)yyy(dot)com>';
> email
> ---------------------------
> xxx(at)yyy(dot)com <mailto:xxx(at)yyy(dot)com>
> xxx(at)yyy(dot)com <mailto:xxx(at)yyy(dot)com>
> (2 rows)
>
> I have tried to compare the binary data in various ways, email::bytes,
> md5(email), encode(email::bytea, 'hex’), char_length(email) and it all
> looks the same for both rows.
>
> Any suggestions how I can discover the difference between the values and
> how they could have been added without triggering the constraint? I know
> that the values were added after the constraint was added.
select ctid, email from users where email LIKE 'xxx(at)yyy(dot)com';
https://www.postgresql.org/docs/9.5/static/ddl-system-columns.html
"ctid
The physical location of the row version within its table. Note
that although the ctid can be used to locate the row version very
quickly, a row's ctid will change if it is updated or moved by VACUUM
FULL. Therefore ctid is useless as a long-term row identifier. The OID,
or even better a user-defined serial number, should be used to identify
logical rows.
"
Best guess is the INDEX on the column is corrupted and needs to be
reindexed:
https://www.postgresql.org/docs/9.5/static/sql-reindex.html
>
> The data was added from a Ruby on Rails app that also has unique
> constraints on the email field and validation on the email format.
>
> / Jonas
>
>
--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com
From | Date | Subject | |
---|---|---|---|
Next Message | Albe Laurenz | 2016-09-02 13:06:13 | Re: Duplicate data despite unique constraint |
Previous Message | Jonas Tehler | 2016-09-02 11:32:56 | Duplicate data despite unique constraint |