Quick Links

Re: md.c vs elog.c vs smgrreleaseall() in barrier

From:	Noah Misch <noah(at)leadboat(dot)com>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: md.c vs elog.c vs smgrreleaseall() in barrier
Date:	2025-03-19 19:55:53
Message-ID:	20250319195553.ef.nmisch@google.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Mon, Mar 17, 2025 at 07:52:02PM -0400, Andres Freund wrote:
> Here's a proposed patch for this. It turns out that the bug might already be
> reachable, even without defining FDDEBUG. There's a debug ereport() in
> register_dirty_segment() - but it's hard to reach in practice.
>
> I don't really know whether that means we ought to backpatch or not - which
> makes me want to be conservative and not backpatch.

Non-backpatch sounds fine.

> Subject: [PATCH v2 1/2] smgr: Hold interrupts in most smgr functions

> It seems better to put the HOLD_INTERRUPTS()/RESUME_INTERRUPTS() in smgr.c,
> instead of trying to push them down to md.c where possible: For one, every
> smgr implementation would be vulnerable, for another, a good bit of smgr.c
> code itself is affected too.

I'm fine with putting it in smgr.c. Academically, I don't see every smgr
implementation being vulnerable for most smgr entry points. For example, the
upthread backtrace happens because mdclose() undoes the fd-opening work of
mdnblocks(). Another smgr implementation could make its smgr_close callback a
no-op. smgrrelease() doesn't destroy anything important at the smgr level; it
is mdclose() that destroys state that md.c still needs.

> @@ -362,12 +397,16 @@ smgrreleaseall(void)
> if (SMgrRelationHash == NULL)
> return;
>
> + HOLD_INTERRUPTS();

Likely not important, but it's not clear to me why smgrdestroyall() and
smgrreleaseall() get HOLD_INTERRUPTS(), as opposed to relying on the holds in
smgrdestroy() and smgrrelease(). In contrast, smgrreleaserellocator() does
rely on smgrrelease() for the hold.

> +
> hash_seq_init(&status, SMgrRelationHash);
>
> while ((reln = (SMgrRelation) hash_seq_search(&status)) != NULL)
> {
> smgrrelease(reln);
> }
> +
> + RESUME_INTERRUPTS();
> }
>
> /*

> @@ -434,6 +481,8 @@ smgrdosyncall(SMgrRelation *rels, int nrels)
> if (nrels == 0)
> return;
>
> + HOLD_INTERRUPTS();
> +
> FlushRelationsAllBuffers(rels, nrels);

FlushRelationsAllBuffers() isn't part of smgr or md.c, so it's unlikely to
become sensitive to smgrrelease(). It may do a ton of I/O. Hence, I'd
HOLD_INTERRUPTS() after FlushRelationsAllBuffers(), not before.

>
> /*
> @@ -449,6 +498,8 @@ smgrdosyncall(SMgrRelation *rels, int nrels)
> smgrsw[which].smgr_immedsync(rels[i], forknum);

Long-term, someone might change this to hold interrupts once per immedsync
with a CFI between immedsync calls. That would be safe. It's not this
change's job, though. I'm mentioning it for the archives.

> }
> }
> +
> + RESUME_INTERRUPTS();
> }
>
> /*
> @@ -471,6 +522,8 @@ smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
> if (nrels == 0)
> return;
>
> + HOLD_INTERRUPTS();

I would move this below DropRelationsAllBuffers(), for reasons like
FlushRelationsAllBuffers() above.

> Subject: [PATCH v2 2/2] smgr: Make SMgrRelation initialization safer against
> errors
>
> In case the smgr_open callback failed, the ->pincount field would not be
> initialized and the relation would not be put onto the unpinned_relns list.
>
> This buglet was introduced in 21d9c3ee4ef7. As that commit is only in HEAD, no
> need to backpatch.
>
> Discussion: https://postgr.es/m/3vae7l5ozvqtxmd7rr7zaeq3qkuipz365u3rtim5t5wdkr6f4g@vkgf2fogjirl
> ---
> src/backend/storage/smgr/smgr.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
> index 53a09fe4aaa..24971304b85 100644
> --- a/src/backend/storage/smgr/smgr.c
> +++ b/src/backend/storage/smgr/smgr.c
> @@ -255,12 +255,12 @@ smgropen(RelFileLocator rlocator, ProcNumber backend)
> reln->smgr_cached_nblocks[i] = InvalidBlockNumber;
> reln->smgr_which = 0; /* we only have md.c at present */
>
> - /* implementation-specific initialization */
> - smgrsw[reln->smgr_which].smgr_open(reln);
> -
> /* it is not pinned yet */
> reln->pincount = 0;
> dlist_push_tail(&unpinned_relns, &reln->node);
> +
> + /* implementation-specific initialization */
> + smgrsw[reln->smgr_which].smgr_open(reln);
> }

I struggle to speculate about the merits of this, because mdopen() can't fail.
If mdopen() started to do things that could fail, mdnblocks() would be
reasonable in assuming those things are done. Hence, the long-term direction
should be more like destroying the new smgr entry in the event of error.

I would not make this change. I'd maybe add a comment that smgr_open
callbacks currently aren't allowed to fail, since smgropen() isn't ready to
clean up smgr-level state from a failed open. How do you see it?

In response to

Re: md.c vs elog.c vs smgrreleaseall() in barrier at 2025-03-17 23:52:02 from Andres Freund

Responses

Re: md.c vs elog.c vs smgrreleaseall() in barrier at 2025-03-19 22:45:20 from Andres Freund

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Thomas Munro	2025-03-19 20:11:43	Re: [PoC] Federated Authn/z with OAUTHBEARER
Previous Message	Tom Lane	2025-03-19 19:47:56	Re: Orphaned users in PG16 and above can only be managed by Superusers