From: | "Anton A(dot) Melnikov" <a(dot)melnikov(at)postgrespro(dot)ru> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | FSM doesn't recover after zeroing damaged page. |
Date: | 2025-02-07 00:15:17 |
Message-ID: | a61efc0b-9cfc-4f24-ac5d-ea6600d9ccbf@postgrespro.ru |
Lists: | pgsql-hackers |
Hi!
On current master I found that if a page of the FSM bottom layer other than
the last one is corrupted, it is not restored after being zeroed.
Here is a reproduction:
1) Create a table with FSM of 4 pages:
create table t (int) as select * from generate_series(1, 1E6);
delete from t where ctid in (select ctid from t tablesample bernoulli (20));
SELECT pg_relation_filepath('t'); -- to know the filename with FSM
vacuum t;
2) Do checkpoint and stop the server.
3) Corrupt a byte in the third page. For instance, the low byte of the page checksum:
printf '\xAA' | dd of=/usr/local/pg12252-vanm/data/base/5/<filename_fsm> bs=1 seek=$((2*8192+8)) count=1 conv=notrunc
4) Start the server and execute vacuum t; twice, to ensure that the corrupted page
is fixed in memory: zeroed, with a new header written to it.
postgres=# vacuum t;
WARNING: page verification failed, calculated checksum 13869 but expected 13994
WARNING: invalid page in block 2 of relation base/5/16384_fsm; zeroing out page
VACUUM
postgres=# vacuum t; -- without warnings
VACUUM
5) Do checkpoint and restart the server. After vacuum t; the warnings appeared again:
postgres=# vacuum t;
WARNING: page verification failed, calculated checksum 13869 but expected 13994
WARNING: invalid page in block 2 of relation base/5/16384_fsm; zeroing out page
VACUUM
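For reference, the offset arithmetic from step 3 can be sketched in Python. This is only an illustration under stated assumptions: the default block size of 8192 bytes, a little-endian host, and pd_checksum sitting right after the 8-byte pd_lsn in the page header. The helper names (checksum_byte_offset, overwrite_byte) are made up for this example, and it works on a scratch file rather than on a real FSM fork.

```python
import os
import tempfile

BLCKSZ = 8192            # assumed: default PostgreSQL block size (a build-time setting)
PD_CHECKSUM_OFFSET = 8   # pd_checksum follows the 8-byte pd_lsn in the page header


def checksum_byte_offset(block_no: int) -> int:
    """File offset of the low byte of pd_checksum for a block (little-endian host)."""
    return block_no * BLCKSZ + PD_CHECKSUM_OFFSET


def overwrite_byte(path: str, offset: int, value: int) -> None:
    """Overwrite one byte in place without truncating the file (like dd conv=notrunc)."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(bytes([value]))


# Scratch file of four zero-filled pages standing in for a 4-page FSM fork.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(4 * BLCKSZ))
    scratch_path = tmp.name

offset = checksum_byte_offset(2)          # the third page is block 2
overwrite_byte(scratch_path, offset, 0xAA)

with open(scratch_path, "rb") as f:
    data = f.read()
os.remove(scratch_path)
```

The computed offset is the same value that the seek=$((2*8192+8)) expression in the dd command produces.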
I noticed that the updated page is not written to disk because the
buffer holding it is not marked dirty. Moreover, MarkBufferDirtyHint(),
which is called for modified FSM pages, does not seem suitable here,
since I suppose the corrupted page must be rewritten unconditionally, not merely as a hint.
Therefore, maybe mark it dirty immediately after writing the new header?
Here is a small patch that does this and eliminates the repeated warnings.
I would be glad if you could take a look at it.
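
To illustrate the reasoning, here is a toy model in Python. It is not PostgreSQL code and not the attached patch: ToyBufferPool, its methods, and the mark_dirty_after_zeroing flag are all hypothetical names. It only assumes that a checkpoint writes out dirty buffers and nothing else, and it replays the vacuum, vacuum, checkpoint, restart, vacuum sequence from the reproduction.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBufferPool:
    """Toy model: on-disk pages, in-memory copies, and a set of dirty blocks."""
    disk: dict = field(default_factory=dict)
    memory: dict = field(default_factory=dict)
    dirty: set = field(default_factory=set)

    def read(self, block: int, mark_dirty_after_zeroing: bool) -> bool:
        """Load a block; return True if a 'zeroing out page' warning would be raised."""
        if block in self.memory:
            return False                      # already in memory, no verification
        page = self.disk[block]
        warned = page == "corrupt"
        if warned:
            page = "zeroed"                   # fixed in memory only
            if mark_dirty_after_zeroing:
                self.dirty.add(block)         # the proposed behaviour
        self.memory[block] = page
        return warned

    def checkpoint(self) -> None:
        """Write out dirty buffers only."""
        for block in self.dirty:
            self.disk[block] = self.memory[block]
        self.dirty.clear()

    def restart(self) -> None:
        """Drop all in-memory state, as a server restart would."""
        self.memory.clear()
        self.dirty.clear()


def warnings_per_vacuum(mark_dirty_after_zeroing: bool) -> list:
    """Replay: vacuum, vacuum, checkpoint + restart, vacuum."""
    pool = ToyBufferPool(disk={2: "corrupt"})
    seen = [pool.read(2, mark_dirty_after_zeroing),
            pool.read(2, mark_dirty_after_zeroing)]
    pool.checkpoint()
    pool.restart()
    seen.append(pool.read(2, mark_dirty_after_zeroing))
    return seen
```

In this model, warnings_per_vacuum(False) warns on the first and third vacuum, which matches the behaviour reported above, while warnings_per_vacuum(True) warns only on the first.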
With best regards,
--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment | Content-Type | Size |
---|---|---|
0001-Fix-recovering-damaged-FSM-pages.patch | text/x-patch | 906 bytes |