Duplicate key existant/index visibility bug in 9.3.3

From: Erik Jones <ejones(at)engineyard(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Duplicate key existant/index visibility bug in 9.3.3
Date: 2015-01-26 21:27:19
Message-ID: D43A52C1-79DD-4621-B7D2-B21BF66DB2C2@engineyard.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Greetings,

Today I had a client report an issue restoring a dump made from their production db on a testing server wherein there existed two rows for a table with identical primary key values. Upon investigation I found this to be true, but neither is visible via index usage:

(Server version: 9.3.3)

db=# explain select xmin, xmax, cmin, cmax, ctid, id, created_at from tags where id = 42982;
QUERY PLAN
-------------------------------------------------------------------------
Index Scan using groups_pkey on tags (cost=0.29..8.31 rows=1 width=34)
Index Cond: (id = 42982)
(2 rows)

db=# select xmin, xmax, cmin, cmax, ctid, id, created_at from tags where id = 42982;
xmin | xmax | cmin | cmax | ctid | id | created_at
------+------+------+------+------+----+------------
(0 rows)

db=# set enable_bitmapscan = off;
SET
db=# set enable_indexscan = off;
SET
db=# explain select xmin, xmax, cmin, cmax, ctid, id, created_at from tags where id = 42982;
QUERY PLAN
--------------------------------------------------------
Seq Scan on tags (cost=0.00..1541.94 rows=1 width=34)
Filter: (id = 42982)
(2 rows)

db=# select xmin, xmax, cmin, cmax, ctid, id, created_at from tags where id = 42982;
xmin | xmax | cmin | cmax | ctid | id | created_at
---------+---------+------+------+----------+-------+----------------------------
1195647 | 1195647 | 2 | 2 | (677,51) | 42982 | 2015-01-23 00:12:19.498942
1195648 | 1195647 | 1 | 1 | (677,52) | 42982 | 2015-01-23 00:12:19.498942
(2 rows)

There are other columns on the table and all values in the two entries are identical. Note that another table has a column that reference this table's primary key via a foreign key and there are multiple referencing rows there. The last even has the same xmin as the first above:

db=# select xmin, xmax, cmin, cmax, ctid, id, created_at, tag_id from tagged_items where tag_id = 42982 order by xmin::text::bigint desc;
xmin | xmax | cmin | cmax | ctid | id | created_at | tag_id
---------+------+------+------+-----------+--------+----------------------------+--------
1195647 | 0 | 0 | 0 | (361,52) | 130562 | 2015-01-23 01:14:55.809218 | 42982
1195628 | 0 | 0 | 0 | (360,89) | 130557 | 2015-01-23 01:14:55.391223 | 42982
1195619 | 0 | 0 | 0 | (361,44) | 130548 | 2015-01-23 01:14:55.252926 | 42982
1195575 | 0 | 0 | 0 | (360,72) | 130529 | 2015-01-23 01:14:54.66755 | 42982
1195536 | 0 | 0 | 0 | (360,60) | 130505 | 2015-01-23 01:14:53.841223 | 42982
1195528 | 0 | 0 | 0 | (361,22) | 130497 | 2015-01-23 01:14:53.725746 | 42982
1195490 | 0 | 0 | 0 | (361,16) | 130487 | 2015-01-23 01:14:53.21153 | 42982
1195489 | 0 | 0 | 0 | (360,50) | 130486 | 2015-01-23 01:14:53.197715 | 42982
1195470 | 0 | 0 | 0 | (361,10) | 130477 | 2015-01-23 01:14:52.896855 | 42982
1195402 | 0 | 0 | 0 | (360,28) | 130444 | 2015-01-23 01:14:52.020715 | 42982
<snip>

I'm assuming that the fact that the xmin values of the rest of the rows are from before those in the original table is indicative of update traffic since the original tags row was added and that the source of the problem is perhaps an update gone wrong, but let me know if I'm way off there.

So, a few questions:

1. Is there an existing bug that's been fixed by 9.3.5 that would account for this. I scanned the 9.3.4 and 9.3.5 release notes and while the first fix listed in the release notes for 9.3.4 (http://www.postgresql.org/docs/9.3/static/release-9-3-4.html) was what gave me the idea to disable index usage to see the rows this server has been up and running as a master since September so neither the standby nor the crash recovery setup from that bug's description applies.

2. Is there any other info that I can dig up that could be helpful here?

3. Would manually deleting one of those rows (likely using the ctid to specify which) be safe and make the remaining one visible? If so, I'm assuming I should delete the second given the xmin of the first referencing row in the other table?

Note that I've already confirmed that there are no other duplicate rows in this or other tables, at least with no clausing unique ids, via restore tests on a test server. So, I know I can fix this case with a dump/restore but if I can get away with simply deleting one of the offending tuples that would be ideal (i.e. no downtime for the client).

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2015-01-26 22:19:23 Re: Duplicate key existant/index visibility bug in 9.3.3
Previous Message derek.anderson 2015-01-26 21:04:37 BUG #12675: BIGINT Datatype performance