Re: BUG #4913: Row missing from primary key index

From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: pgsql-bugs(at)postgresql(dot)org, mathieu(at)dezutter(dot)org ("Mathieu De Zutter")
Subject: Re: BUG #4913: Row missing from primary key index
Date: 2009-07-09 22:29:56
Message-ID: 877hyhebh7.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

>>>>> "Mathieu" == "Mathieu De Zutter" <mathieu(at)dezutter(dot)org> writes:

Mathieu> I have a table log_event with a primary key on an integer
Mathieu> 'id', called log_event_pkey.

Mathieu> The tables contains a duplicate for id = 15723018. The
Mathieu> duplicate (note that besides the id, all data differs)
Mathieu> doesn't seem to be known by the index at all.
[snip]

This was first reported on IRC and I spent a little time working with
the OP trying to dig up info to suggest a cause; here is the relevent
data (all provided by the OP at my request). I have not been able to
suggest a possible cause based on this.

shs=# select ctid,xmin,xmax,cmin,cmax,* from log_event where id = 15723018;
ctid | xmin | xmax | cmin | cmax | id | user_id | ip | action_id | object1_id | object2_id | event_timestamp
------------+-------+------+------+------+----------+---------+---------------+-----------+------------+------------+----------------------------
(79844,13) | 16410 | 0 | 0 | 0 | 15723018 | 0 | 79.65.40.251 | 4 | 10145 | | 2009-05-09 13:47:48.68558
(79870,70) | 16479 | 0 | 0 | 0 | 15723018 | 1 | 80.200.87.228 | 1 | | | 2009-05-09 13:47:33.441668
(2 rows)

Notice that the two rows seem entirely independent (different xmin).
The OP stated that his app generally does single-row inserts (with
some exceptions not relevent here); however, we found a nearby row
which shares the xmin:

(79870,16) | 16410 | 0 | 0 | 0 | 15722964 | 1 | 80.200.87.228 | 3 | 720 | | 2009-05-09 13:41:53.099589

(that row shows up correctly)

Using pg_filedump on the primary key index did not show any index
entries at all pointing to the (79844,13) row.

Hexdump of the 79844 page did not reveal anything out of the ordinary;
this is tuple (79844,13) starting at offset 1cc8:

00001cc0 00 00 00 00 00 00 00 00 1a 40 00 00 00 00 00 00 |(dot)(dot)(dot)(dot)(dot)(dot)(dot)(dot)(dot)(at)(dot)(dot)(dot)(dot)(dot)(dot)|
00001cd0 00 00 00 00 01 00 e4 37 0d 00 07 00 03 09 18 5f |.......7......._|
00001ce0 0a ea ef 00 00 00 00 00 00 00 00 00 0f 02 20 4f |.............. O|
00001cf0 41 28 fb 00 04 00 00 00 a1 27 00 00 00 00 00 00 |A(.......'......|

(full page at http://pastebin.com/m28036405)

There are additional rows in the table which also show similar issues
(rows found by seqscan but missing in indexscan). These also have xids
which are "suspicious" (apparently not in the expected sequence or dups
of other xids). e.g.:

these rows don't show up in the index:

(79854,50) | 16401 | 0 | 0 | 0 | 15722931 | 0 | 90.39.81.150 | 6 | | | 2009-05-09 13:42:06.278105
(79849,6) | 16406 | 0 | 0 | 0 | 15722932 | 0 | 78.150.166.244 | 5 | 940 | | 2009-05-09 13:45:49.012712
(79857,39) | 16408 | 0 | 0 | 0 | 15722933 | 0 | 64.17.2.164 | 2 | 22435 | | 2009-05-09 13:46:03.1664

but these do:

(79870,13) | 16406 | 0 | 0 | 0 | 15722961 | 1 | 80.200.87.228 | 71 | 29005 | | 2009-05-09 13:41:50.190479
(79870,14) | 16408 | 0 | 0 | 0 | 15722962 | 0 | 80.136.54.145 | 2 | 92765 | | 2009-05-09 13:41:50.439442

This seems to be associated with some odd jumps in xid, time and ID;
this set of rows is ordered by id:

(79835,21) | 16394 | 0 | 0 | 0 | 15722927 | 0 | 85.225.253.43 | 1 | | | 2009-05-09 13:39:43.943032
(79869,12) | 16396 | 0 | 0 | 0 | 15722928 | 0 | 66.235.124.132 | 4 | 1663 | | 2009-05-09 13:40:37.048477
(79844,12) | 16398 | 0 | 0 | 0 | 15722929 | 0 | 64.17.2.164 | 2 | 22324 | | 2009-05-09 13:40:58.781981
(79841,9) | 16399 | 0 | 0 | 0 | 15722930 | 0 | 78.137.163.133 | 2 | 4950 | | 2009-05-09 13:41:32.126199
(79854,50) | 16401 | 0 | 0 | 0 | 15722931 | 0 | 90.39.81.150 | 6 | | | 2009-05-09 13:42:06.278105
(79849,6) | 16406 | 0 | 0 | 0 | 15722932 | 0 | 78.150.166.244 | 5 | 940 | | 2009-05-09 13:45:49.012712
(79857,39) | 16408 | 0 | 0 | 0 | 15722933 | 0 | 64.17.2.164 | 2 | 22435 | | 2009-05-09 13:46:03.1664

(above 7 rows are not in index)

(79870,3) | 16393 | 0 | 0 | 0 | 15722951 | 0 | 77.202.22.228 | 56 | 0 | | 2009-05-09 13:39:48.460145
(79870,4) | 16394 | 0 | 0 | 0 | 15722952 | 0 | 77.202.22.228 | 3 | 37349 | | 2009-05-09 13:40:09.031336
(79870,5) | 16395 | 0 | 0 | 0 | 15722953 | 0 | 77.202.22.228 | 56 | 0 | | 2009-05-09 13:40:23.072695

(above 3 rows are)

(full list at http://pastebin.com/m16600dc8 - that list is from a seqscan,
so includes rows missing from the index)

--
Andrew (irc:RhodiumToad)

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2009-07-09 23:08:21 Re: BUG #4912: server core dumps on xslt_process with empty param string
Previous Message Mathieu De Zutter 2009-07-09 21:53:06 BUG #4913: Row missing from primary key index