From: | Toby Murray <toby(dot)murray(at)gmail(dot)com> |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: Violation of primary key constraint |
Date: | 2013-02-01 04:57:02 |
Message-ID: | CAJeqKgvgFRawT8tL21XtVdw7jjh7BOJXceBLVr1FA+Tf6Pc_-A@mail.gmail.com |
Lists: | pgsql-bugs |
On Thu, Jan 31, 2013 at 5:43 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Toby Murray <toby(dot)murray(at)gmail(dot)com> writes:
>> I just had some interaction with RhodiumToad on IRC about a duplicated
>> primary key problem I ran into today. After some poking around he
>> suggested that I send this to -bugs since it seems like an interesting
>> error.
>
> I poked around in the PK index file (thanks for sending that) and could
> not find anything that looks wrong. There are a lot of duplicate keys
> (a few of the keys appear more than a thousand times) but I think this
> is just the result of update activity that hasn't been vacuumed away
> yet. I count 181340233 leaf index tuples bearing 168352931 distinct
> key values --- that makes for a dead-tuple fraction of 7.7% which is
> not quite enough to trigger an autovacuum, so it's not terribly
> surprising that the dups are still present.
>
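The 7.7% figure quoted above can be reproduced from the two counts, as a rough sketch. The ratio that matches it is surplus index tuples over distinct keys. The autovacuum check below is an assumption rather than something stated in the thread: it uses PostgreSQL's stock defaults (`autovacuum_vacuum_threshold = 50`, `autovacuum_vacuum_scale_factor = 0.2`) and treats the index counts as a stand-in for the heap's live and dead tuple counts, which the server actually takes from its own statistics. The function names are illustrative only.

```python
# Counts reported from the PK index file in the quoted message.
LEAF_INDEX_TUPLES = 181_340_233
DISTINCT_KEYS = 168_352_931


def dead_fraction(leaf_tuples: int, distinct_keys: int) -> float:
    """Surplus (presumably dead) index tuples relative to distinct keys."""
    return (leaf_tuples - distinct_keys) / distinct_keys


def autovacuum_would_trigger(dead: int, live: int,
                             base_threshold: int = 50,
                             scale_factor: float = 0.2) -> bool:
    """Assumed default rule: vacuum once dead > threshold + scale_factor * live."""
    return dead > base_threshold + scale_factor * live


surplus = LEAF_INDEX_TUPLES - DISTINCT_KEYS
fraction = dead_fraction(LEAF_INDEX_TUPLES, DISTINCT_KEYS)

print(f"surplus index tuples: {surplus}")
print(f"dead-tuple fraction:  {fraction:.1%}")
print("autovacuum triggered:", autovacuum_would_trigger(surplus, DISTINCT_KEYS))
```

Under those assumed defaults the surplus would have to exceed roughly 20% of the live tuples before autovacuum starts, which is consistent with the duplicates still being present.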
> At this point it seems that it's not the index's fault. What seems more
> likely is that somehow the older heap entry failed to get marked "dead"
> after an UPDATE.
>
>> ... Especially the one with the ID
>> 26709186 since it hasn't been changed in OpenStreetMap in years so
>> there is no reason for it to have been touched in any way since the
>> import.
>
> Yeah, it's a bit hard to explain that this way unless there was an
> UPDATE that didn't change the timestamp or version. How sure are you
> that the updating process always changes those?

Pretty sure. The minutely change stream coming from OSM is generated
from all objects that were modified since the last diff was generated,
based on transaction numbers in their PostgreSQL database. Changes get
into that database through the Rails API, which enforces version number
bumps on upload and sets its own timestamp based on when the upload
came in. The only way to apply an update that doesn't change anything
would be to apply a minutely diff from the past to an up-to-date
database. This does happen right when you start updating from minutely
diffs after a clean import, but the overlap is a matter of hours, just
enough to ensure there is no gap, not something from 2008.

>> Here are some queries and their results that RhodiumToad had me run to
>> try and track things down:
>
> Could we see the full results of heap_page_items(get_raw_page()) for
> each of the pages where any of these tuples live, viz
>
> 11249625
> 1501614
> 11247884
> 1520052
> 1520056
> 11249780
> 1528888
> 11249622
>
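The request above amounts to one `heap_page_items(get_raw_page(...))` call per listed page. A minimal sketch that generates those statements follows. It assumes the `pageinspect` extension is installed, and the table name `my_table` is a hypothetical placeholder, since the thread does not name the relation here.

```python
# Block numbers listed in the quoted request.
PAGES = [11249625, 1501614, 11247884, 1520052,
         1520056, 11249780, 1528888, 11249622]

# Hypothetical placeholder: substitute the table that owns the primary key.
TABLE = "my_table"


def page_items_query(table: str, block: int) -> str:
    """Build one pageinspect query dumping every line pointer on a heap page."""
    if not isinstance(block, int) or block < 0:
        raise ValueError(f"invalid block number: {block!r}")
    quoted = table.replace("'", "''")  # escape single quotes in the SQL literal
    return f"SELECT * FROM heap_page_items(get_raw_page('{quoted}', {block}));"


# De-duplicate and sort so the output follows physical page order.
queries = [page_items_query(TABLE, block) for block in sorted(set(PAGES))]

for query in queries:
    print(query)
```

The printed statements can be saved to a file and run with `psql -f`, with the output redirected to a text file for attaching.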

The results seemed awkward to put inline, so I put them in a text file
and am attaching it. I hope the mailing list allows attachments.

Toby

Attachment | Content-Type | Size |
---|---|---|
heap_page_items.txt | text/plain | 14.6 KB |