Quick Links

Re: cannot delete corrupted rows after DB corruption: tuple concurrently updated

From:	john gale <john(at)smadness(dot)com>
To:	Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: cannot delete corrupted rows after DB corruption: tuple concurrently updated
Date:	2014-02-26 21:57:24
Message-ID:	1E61A5BF-2388-4B35-861D-BC7A7323161B@smadness.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

On Feb 26, 2014, at 2:59 AM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:

> On 26 Únor 2014, 8:45, john gale wrote:
>
>> munin2=# delete from testruns where ctid = '(37069305,4)';
>> ERROR: tuple concurrently updated
>
> AFAIK this error is raised when a before trigger modifies the row that is
> being deleted. Imagine a trigger that does this
>
> UPDATE testruns SET mycolumn = 1 WHERE id = OLD.id;
> RETURN OLD;
>
> Given the way MVCC in postgres works (copying row when updating), the
> error makes sense. In 9.0 this worked by silently skipping the DELETE
> (incidentally, I had a few reports about tables that can't be deleted
> because of this in the past month).
>
> Anyway, do you have any triggers on the table? If yes, try to disable
> them. I suspect the data are corrupted in a way that causes update on the
> deleted row - either directly, or maybe because of a cascading effect.

There were a few triggers auto-created by a foreign key constraint but we removed the constraint, which removed the triggers:

The tuple error still exists, however:

munin2=# delete from testruns where ctid = '(37069305,4)';
ERROR: tuple concurrently updated
munin2=# select id from testruns where ctid = '(37069305,4)';
id
-----------
141908486
(1 row)

munin2=# delete from testruns where id = 141908486;
ERROR: tuple concurrently updated
munin2=# select * from testruns where id = 141908486;
ERROR: unexpected chunk number 0 (expected 1) for toast value 155900302 in pg_toast_16822

> I'm wondering if it might be caused by RI triggers - maybe yes, but I'm
> not aware of any RI trigger doing updates.
>
> That being said, I think that what you're doing is wrong. If you think you
> have a corrupted database, I'd strongly suggest doing dump/restore. Or how
> do you know there's no other corruption lurking in the files, slowly
> spreading to other parts of the database?

We're aware that we're shoveling dirt to patch a crack in a large highway. However at the moment we value uptime rather than strict integrity of the data (141mil rows allows some wiggle room), and since we don't modify these rows after they're inserted, I can't imagine how this kind of row corruption can "slowly spread to other parts of the database".

~ john

In response to

Re: cannot delete corrupted rows after DB corruption: tuple concurrently updated at 2014-02-26 10:59:53 from Tomas Vondra

Browse pgsql-general by date

	From	Date	Subject
Next Message	Jeff Janes	2014-02-26 22:08:57	Re: Cancelling of autovacuums considered harmful
Previous Message	Scott Marlowe	2014-02-26 21:34:33	Re: Cancelling of autovacuums considered harmful