pgsql: md: Add comment & assert to buffer-zeroing path in md[start]read

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: md: Add comment & assert to buffer-zeroing path in md[start]read
Date: 2025-04-01 18:54:58
Message-ID: E1tzglG-0029Xw-0q@gemulon.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers

md: Add comment & assert to buffer-zeroing path in md[start]readv()

mdreadv() has a codepath to zero out buffers when a read returns zero bytes,
guarded by a check for zero_damaged_pages || InRecovery.

The InRecovery codepath to zero out buffers in mdreadv() appears to be
unreachable. The only known paths to reach mdreadv()/mdstartreadv() in
recovery are XLogReadBufferExtended(), vm_readbuf(), and fsm_readbuf(), each
of which takes care to extend the relation if necessary. This looks to either
have been the case for a long time, or the code was never reachable.

The zero_damaged_pages path is incomplete, as missing segments are not
created.

Putting blocks into the buffer-pool that do not exist on disk is rather
problematic, as such blocks will, at least initially, not be found by scans
that rely on smgrnblocks(), as they are beyond EOF. It also can cause weird
problems with relation extension, as relation extension does not expect blocks
beyond EOF to exist.

Therefore we would like to remove that path.

mdstartreadv(), which I added in e5fe570b51c, does not implement this zeroing
logic. I had started a discussion about that a while ago (linked below), but
forgot to act on the conclusion of the discussion, namely to disable the
in-memory-zeroing behavior.

We could certainly implement equivalent zeroing logic in mdstartreadv(), but
it would have to be more complicated due to potential differences in the
zero_damaged_pages setting between the definer and completor of IO. Given that
we want to remove the logic, that does not seem worth implementing the
necessary logic.

For now, put an Assert(false) and comments documenting this choice into
mdreadv() and comments documenting the deprecation of the path in mdreadv()
and the non-implementation of it in mdstartreadv(). If we, during testing,
discover that we do need the path, we can implement it at that time.

Reviewed-by: Noah Misch <noah(at)leadboat(dot)com>
Discussion: https://postgr.es/m/postgr.es/m/20250330024513.ac.nmisch@google.com
Discussion: https://postgr.es/m/postgr.es/m/3qxxsnciyffyf3wyguiz4besdp5t5uxvv3utg75cbcszojlz7p@uibfzmnukkbd

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/00066aa1733d84109f7569a7202c3915d8289d3a

Modified Files
--------------
src/backend/storage/smgr/md.c | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)

Browse pgsql-committers by date

  From Date Subject
Next Message Nathan Bossart 2025-04-01 19:37:58 pgsql: doc: Adjust some notes about pg_upgrade's file transfer modes.
Previous Message Andres Freund 2025-04-01 17:34:27 pgsql: aio: Add pg_aios view