Re: BUG #7819: missing chunk number 0 for toast value 1235919 in pg_toast_35328

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pius Chan <pchan(at)contigo(dot)com>
Cc: "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>, Frank Moi <fmoi(at)contigo(dot)com>, Ken Yu <kyu(at)contigo(dot)com>, Vincent Lasmarias <vlasmarias(at)contigo(dot)com>, Vladimir Kosilov <vkosilov(at)contigo(dot)com>
Subject: Re: BUG #7819: missing chunk number 0 for toast value 1235919 in pg_toast_35328
Date: 2013-02-01 20:08:54
Message-ID: 21192.1359749334@sss.pgh.pa.us
Lists: pgsql-bugs

Pius Chan <pchan(at)contigo(dot)com> writes:
> Thanks for your prompt response. Yes, I should have provided my test scripts. BTW, over numerous tests I found that when there is no long-held open transaction (the one used for middle-tier service master/slave failover), the database server reclaims the space left by dead rows much more quickly, and it is hard to make the TOAST area grow. As a result, it is hard for me to reproduce the ERROR without a long-held open transaction. Do you have any insight into this?

I think the proximate cause is probably this case mentioned in
GetOldestXmin's comments:

* if allDbs is FALSE and there are no transactions running in the current
* database, GetOldestXmin() returns latestCompletedXid. If a transaction
* begins after that, its xmin will include in-progress transactions in other
* databases that started earlier, so another call will return a lower value.

So the trouble case is where autovacuum on the toast table starts at an
instant where nothing's running in the "test" database, but there are
pre-existing transaction(s) in the other database. Then later CLUSTER
starts at an instant where transactions are running in "test" and their
xmins include the pre-existing transactions.
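
To make that sequence concrete, here is a toy Python model of it. This is
only a sketch of the rule described in the comment quoted above, not the
real procarray.c logic: the Backend and Cluster classes, the database name
"other", and the xid values are all made up for illustration, and the
model ignores locking, subtransactions, and xid wraparound.

```python
from dataclasses import dataclass, field


@dataclass
class Backend:
    xid: int        # this transaction's own xid
    database: str   # database the backend is connected to
    xmin: int       # oldest xid that was still running when it started


@dataclass
class Cluster:
    next_xid: int = 100
    latest_completed_xid: int = 99
    running: list = field(default_factory=list)

    def begin(self, database):
        xid = self.next_xid
        self.next_xid += 1
        # A new snapshot's xmin covers running transactions in ALL databases.
        xmin = min([b.xid for b in self.running] + [xid])
        backend = Backend(xid, database, xmin)
        self.running.append(backend)
        return backend

    def commit(self, backend):
        self.running.remove(backend)
        self.latest_completed_xid = max(self.latest_completed_xid, backend.xid)

    def get_oldest_xmin(self, database, all_dbs=False):
        # Simplified: start from latestCompletedXid, as the comment describes,
        # then consider only backends in this database unless all_dbs is set.
        result = self.latest_completed_xid
        for b in self.running:
            if all_dbs or b.database == database:
                result = min(result, b.xid, b.xmin)
        return result


def reproduce():
    c = Cluster()
    c.begin("other")                  # long-running xid 100 in another database
    for _ in range(5):                # xids 101..105 run and finish in "test"
        c.commit(c.begin("test"))
    # Autovacuum starts while nothing is running in "test".
    autovacuum_horizon = c.get_oldest_xmin("test")
    # A transaction then starts in "test"; its xmin reaches back to xid 100.
    c.begin("test")
    # CLUSTER starts now and computes its own horizon.
    cluster_horizon = c.get_oldest_xmin("test")
    return autovacuum_horizon, cluster_horizon


if __name__ == "__main__":
    print(reproduce())  # (105, 100)
```

In this model the second call returns a lower value than the first, which
is the horizon moving backwards between the autovacuum and the CLUSTER.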

So you need long-running transactions in a database other than the one
where the vacuuming/clustering action is happening, as well as some
unlucky timing. Assuming my theory is the correct one, of course.

regards, tom lane
