FSM Corruption (was: Could not read block at end of the relation)

From: Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
To: pgsql-bugs <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: FSM Corruption (was: Could not read block at end of the relation)
Date: 2024-03-01 08:56:51
Message-ID: 1958255.PYKUYFuaPT@aivenlaptop
Lists: pgsql-bugs

On Tuesday, 27 February 2024, 11:34:14 CET, Ronan Dunklau wrote:
> I suspected the FSM could be corrupted in some way, but taking a look at it
> just after the errors have been triggered, the offending (nonexistent)
> blocks are just not present in the FSM either.

I think I may have missed something on my first look. On other affected
clusters, the FSM is definitely corrupted. So it looks like we have an FSM
corruption bug on our hands.
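
To make the symptom concrete: the corruption discussed here shows up as FSM
entries that advertise free space in blocks beyond the end of the relation.
The following is only a minimal sketch of that check, not the method used in
this investigation. The function name fsm_entries_past_eof and the sample
data are hypothetical, and the (blkno, avail) pairs are assumed to have been
collected beforehand.

```python
from typing import Iterable, List, Tuple


def fsm_entries_past_eof(
    entries: Iterable[Tuple[int, int]], nblocks: int
) -> List[Tuple[int, int]]:
    """Return the (blkno, avail) pairs that claim free space in a block
    at or beyond nblocks, i.e. in a block the relation does not have.

    Hypothetical helper: `entries` is assumed to be a list of FSM readings,
    and `nblocks` the current number of blocks in the relation.
    """
    return [
        (blkno, avail)
        for blkno, avail in entries
        if blkno >= nblocks and avail > 0
    ]


# Made-up sample data: a 3-block relation (blocks 0..2) whose FSM still
# advertises free space in block 7.
sample = [(0, 0), (1, 4096), (2, 128), (7, 8160)]
print(fsm_entries_past_eof(sample, nblocks=3))  # prints [(7, 8160)]
```

One way to gather such readings, as an assumption rather than a description
of what was done here, would be to call pg_freespace(rel, blkno) from the
pg_freespacemap extension for block numbers past the end of the relation,
with the block count derived from pg_relation_size(rel) and the block_size
setting.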

The bug occurs rarely enough that it is hard to reproduce, but frequently
enough that we have witnessed it on a dozen PostgreSQL clusters.

In our case, we need to repair the FSM. The instructions on the wiki do work,
but maybe we should add something like the attached patch (modeled after the
same feature in pg_visibility) to make it possible to repair the FSM
corruption online. What do you think?

The investigation of the corruption is still ongoing.

Best regards,

--
Ronan Dunklau

Attachment: 0001-Provide-a-pg_truncate_freespacemap-function.patch (text/x-patch, 2.5 KB)
