Re: ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

From: adsj(at)novozymes(dot)com (Adam =?utf-8?Q?Sj=C3=B8gren?=)
To: pgsql-general(at)postgresql(dot)org
Subject: Re: ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100
Date: 2018-04-10 11:52:20
Message-ID: 87efjnp9cb.fsf@tullinup.koldfront.dk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Pavan writes:

>> FWIW one of our support customers reported a very similar TOAST table
>> corruption issue last week which nearly caused an outage. After a lot of
>> analysis, I think I've now fully understood the reasons behind the
>> corruption, the underlying bug(s) and possible remedy. I am currently
>> working on writing a reproducible test case to demonstrate the problem and
>> writing the fix. More details on that soon.
>
> I've posted a reproducer and a proposed fix to -hackers [1]
>
> In the particular case that I investigated, a database crash/recovery was
> involved. But I think we should be able to create a scenario where OID
> wrap-around or a standby promotion triggers the problem. I don't know if
> any of that was involved in the cases reported on this thread, but I've a
> strong suspicion that the underlying bug is probably the same.

In our case there was no crash+recovery, but we do have a high write
load (and the problem occurring quite seldom), so it sounds like it
fits.

> [1] https://www.postgresql.org/message-id/CABOikdOgWT2hHkYG3Wwo2cyZJq2zfs1FH0FgX-%3Dh4OLosXHf9w%40mail.gmail.com

Impressive investigation!

Just a couple of data points: your script to reproduce the problem does
so for all three versions of PostgreSQL we have in use: 9.3.22, 9.6.8
and 10.3.

And I specifically tested on our production machine which exhibits the
problem, running 9.3.22, and your script reproduced the problem there as
well.

I.e. ending with:

ERROR: unexpected chunk number 0 (expected 1) for toast value 16396 in pg_toast_16384
REINDEX
ERROR: unexpected chunk number 0 (expected 1) for toast value 16396 in pg_toast_16384
VACUUM
ERROR: unexpected chunk number 0 (expected 1) for toast value 16396 in pg_toast_16384

Best regards,

Adam

--
"No more than that, but very powerful all the Adam Sjøgren
same; simple things are good." adsj(at)novozymes(dot)com

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Vikas Sharma 2018-04-10 13:50:06 Postgresql Split Brain: Which one is latest
Previous Message Alexandre Arruda 2018-04-10 11:31:20 Re: ERROR: found multixact from before relminmxid