BUG #8656: Duplicate data violating unique constraints

From: maciek(at)heroku(dot)com
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #8656: Duplicate data violating unique constraints
Date: 2013-12-04 20:04:15
Message-ID: E1VoIgB-0005Ok-3X@wrigleys.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 8656
Logged by: Maciek Sakrejda
Email address: maciek(at)heroku(dot)com
PostgreSQL version: 9.3.1
Operating system: Ubuntu 12.04 LTS 64-bit
Description:

A customer has run into an issue where data in a single table was apparently
duplicated somehow, violating the unique constraint imposed by the primary
key:

=> select id, count(*) from post group by id having count(*) > 1;
id | count
---------+-------
1836573 | 2
1855409 | 2
1855421 | 2
1855634 | 2
1855665 | 2
1866749 | 2
1877913 | 2
1877914 | 2
1886591 | 2
1897572 | 2
(10 rows)

What's more, the table has 12 columns, and the data for each column for all
of these duplicate rows is identical between the two versions. This table
also has an updated_at column maintained by triggers, and the updated_at for
all of these duplicated rows seems to have happened in a brief window
between 2013-12-01 06:09:31.138317+00 and 2013-12-01 06:13:07.398258+00. As
far as we can tell, nothing unusual was happening in the application around
this time. In case the table schema itself is relevant, here it is (somewhat
anonymized at the customer's request):

Column | Type |
Modifiers
----------------------+--------------------------+------------------------------------------------------------
id | integer | not null default
nextval('post_id_seq'::regclass)
col2 | integer | not null
col3 | timestamp with time zone | not null
col4 | character varying(100) | not null
col5 | integer | not null
col6 | integer | not null
col7 | timestamp with time zone | not null
updated_at | timestamp with time zone | not null
col9 | timestamp with time zone | not null
col10 | integer | not null
col11 | integer | not null
col12 | character varying(100) | not null
Indexes:
"post_pkey" PRIMARY KEY, btree (id)
"post_col4_idx" UNIQUE CONSTRAINT, btree (col4)
"post_col2_col3_idx" btree (col2, col3) CLUSTER
Foreign-key constraints:
"post_col2_fkey" FOREIGN KEY (col2) REFERENCES other_table(id) ON DELETE
CASCADE
Triggers:
t1 BEFORE INSERT ON post FOR EACH ROW EXECUTE PROCEDURE on_insert()
t2 BEFORE UPDATE ON post FOR EACH ROW EXECUTE PROCEDURE on_update()

CREATE OR REPLACE FUNCTION on_insert()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
BEGIN
NEW.version = 1;
NEW.created_at = timezone('UTC', now());
NEW.updated_at = NEW.created_at;
RETURN NEW;
END;
$function$

CREATE OR REPLACE FUNCTION on_update()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
BEGIN
NEW.version = OLD.version + 1;
NEW.created_at = OLD.created_at;
NEW.updated_at = timezone('UTC', now());
RETURN NEW;
END;
$function$

The data has been deleted in the primary system, but it was captured in a
pg_dump backup, and I was able to restore that (except for the unique
constraints, obviously) in a separate database and can dig in further.

This is a fresh database, restored from a pg_dump backup this past Saturday,
*not* a promoted replica, so I don't think the recent replication issues
come into play here.

Any ideas?

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message plalg 2013-12-04 22:52:27 BUG #8657: Postgres 9.3 JDBC driver is unable to find the Foreign tables
Previous Message Tom Lane 2013-12-04 19:34:57 Re: BUG #8139: initdb: Misleading error message when current user not in /etc/passwd