BUG #3724: Duplicate values added to table despite unique index

From: "Mason Hale" <masonhale(at)gmail(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #3724: Duplicate values added to table despite unique index
Date: 2007-11-06 15:51:26
Message-ID: 200711061551.lA6FpQHS088313@wwwmaster.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs


The following bug has been logged online:

Bug reference: 3724
Logged by: Mason Hale
Email address: masonhale(at)gmail(dot)com
PostgreSQL version: 8.2.5
Operating system: Redhat Linux (kernel: Linux 2.6.18-8.1.15.el5PAE)
Description: Duplicate values added to table despite unique index
Details:

I discovered this issue when an update statement was resulting in a
duplicate key violation error, even though the update did not change any of
the columns in the unique index (and neither did the one trigger on this
table).

Here is the table description:

prod_2=> \d topic_version_page
Table "bdu.topic_version_page"
Column | Type | Modifiers
---------------------+-----------------------------+---------------
topic_version_id | integer | not null
page_id | integer | not null
link_score | double precision |
created_at | timestamp without time zone | default now()
updated_at | timestamp without time zone | default now()
is_entry_page | boolean | default false
linking_entry_count | integer | default 0
Indexes:
"index_topic_version_page_on_topic_version_id_and_page_id" UNIQUE, btree
(topic_version_id, page_id)
"index_topic_version_page_on_link_score" btree (link_score)
"index_topic_version_page_on_topic_version_id_and_created_at" btree
(topic_version_id, created_at)
Foreign-key constraints:
"topic_version_page_topic_version_id_fkey" FOREIGN KEY
(topic_version_id) REFERENCES topic_version(id) ON DELETE CASCADE
Triggers:
topic_version_page_updated_at_trigger BEFORE UPDATE ON
topic_version_page FOR EACH ROW EXECUTE PROCEDURE
update_updated_at_timestamp()

Note that there is a unique index on (topic_version_id, page_id).

Now look at the result of this query:

prod_2=> select page_id, count(*) from topic_version_page where
topic_version_id = 263 group by 1 having count(*) > 1;
page_id | count
-----------+-------
161335682 | 2
194359108 | 2
(2 rows)

Here we have two rows that violate the unique index constraint.

Looking at the rows in more detail:

prod_2=> select * from topic_version_page where topic_version_id = 263 and
page_id in (161335682, 194359108);
topic_version_id | page_id | link_score | created_at |
updated_at | is_entry_page | linking_entry_count
------------------+-----------+------------+----------------------------+---
-------------------------+---------------+---------------------
263 | 161335682 | 0 | 2007-10-13 02:40:49.864219 |
2007-11-01 15:58:57.268593 | f | 5
263 | 194359108 | 0 | 2007-10-25 13:34:20.654336 |
2007-10-25 13:34:20.654336 | f | 1
263 | 194359108 | 0 | 2007-10-25 13:34:20.654336 |
2007-11-04 13:08:03.011292 | f | 2
263 | 161335682 | 0 | 2007-10-13 02:40:49.864219 |
2007-11-04 13:08:03.011292 | f | 6
(4 rows)

We can see that each duplicate within a pair was created at the same time
(or at least within the same transaction), but that each pair was created at
a different time.

My expectation is that with the unique index in place this should not be
able to happen.

Other info that may be useful:

This database was created by taking a dump from a 8.2.4 database on 11/3 and
restoring into a new 8.2.5 database (on different hardware). The created_at
timestamps seem to indicate that these duplicates were created prior to the
dump being created.

However running the same query on the original 8.2.4 database returns zero
rows:

prod_1=> select page_id, count(*) from topic_version_page where
topic_version_id = 263 group by 1 having count(*) > 1;
page_id | count
---------+-------
(0 rows)

prod_1=>

Finally, I restored the *same* dump file in another 8.2.5 database instance
and that database does not have the duplicate rows either.

From this it seems to me that on the following occurred:

Some subsequent update operation resulted in two identical rows being
created in the same table. This likely happened at '2007-11-04
13:08:03.011292' which is the updated_at timestamp on two of the duplicate
rows (and is after the restore).

This looks like a bug related to update operations and unique indexes, but I
could of course be wrong.

Thanks for looking into it.

Mason

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2007-11-06 16:16:32 Re: BUG #3724: Duplicate values added to table despite unique index
Previous Message Sam Mason 2007-11-06 15:25:56 Re: BUG #3723: dropping an index that doesn't refer to table's columns