Quick Links

Re: md.c vs elog.c vs smgrreleaseall() in barrier

From:	Noah Misch <noah(at)leadboat(dot)com>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: md.c vs elog.c vs smgrreleaseall() in barrier
Date:	2025-03-20 00:45:14
Message-ID:	20250320004514.95.nmisch@google.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Wed, Mar 19, 2025 at 06:45:20PM -0400, Andres Freund wrote:
> On 2025-03-19 12:55:53 -0700, Noah Misch wrote:
> > On Mon, Mar 17, 2025 at 07:52:02PM -0400, Andres Freund wrote:
> > > @@ -362,12 +397,16 @@ smgrreleaseall(void)
> > > if (SMgrRelationHash == NULL)
> > > return;
> > >
> > > + HOLD_INTERRUPTS();
> >
> > Likely not important, but it's not clear to me why smgrdestroyall() and
> > smgrreleaseall() get HOLD_INTERRUPTS(), as opposed to relying on the holds in
> > smgrdestroy() and smgrrelease(). In contrast, smgrreleaserellocator() does
> > rely on smgrrelease() for the hold.
>
> It didn't seem particularly safe to allow interrupts, which in turn could
> change the list of open relations, while iterating over a linked list / a
> hashtable.

Fair.

> > > @@ -434,6 +481,8 @@ smgrdosyncall(SMgrRelation *rels, int nrels)
> > > if (nrels == 0)
> > > return;
> > >
> > > + HOLD_INTERRUPTS();
> > > +
> > > FlushRelationsAllBuffers(rels, nrels);
> >
> > FlushRelationsAllBuffers() isn't part of smgr or md.c, so it's unlikely to
> > become sensitive to smgrrelease(). It may do a ton of I/O. Hence, I'd
> > HOLD_INTERRUPTS() after FlushRelationsAllBuffers(), not before.
>
> Hm - we never would want to process interrupts while in
> FlushRelationsAllBuffers() or such, would we? I'm ok with changing it, I
> guess I just didn't see a reason not to use a wider scope.

If we get a query cancel or fast shutdown, it's better for the user to abort
the transaction rather than keep flushing. smgrDoPendingSyncs() calls here
rather late in the pre-commit actions, so failing is still supposed to be
fine. I think the code succeeds at making it fine to fail here.

> I guess I am a bit paranoid because gave me flashbacks to issues around
> smgrtruncate() failing after doing DropRelationBuffers(), that Thomas recently
> fixed (and I had worked on a few years before). But of course
> DropRelationBuffers() is way more dangerous than FlushRelationsAllBuffers().

Fair.

> > > }
> > > }
> > > +
> > > + RESUME_INTERRUPTS();
> > > }
> > >
> > > /*
> > > @@ -471,6 +522,8 @@ smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
> > > if (nrels == 0)
> > > return;
> > >
> > > + HOLD_INTERRUPTS();
> >
> > I would move this below DropRelationsAllBuffers(), for reasons like
> > FlushRelationsAllBuffers() above.
>
> I think that'd be unsafe. Once we dropped buffers from the buffer pool we
> can't just continue without also unlinking the underlying relation, otherwise
> an older view of the data later can be "revived from the dead" after an error,
> causing all manner of corruption.
>
> I suspect it's always called with interrupts held already though.

Ah, confirmed. If I put this assert at the top of smgrdounlinkall(),
check-world passes:

Assert(IsBinaryUpgrade || InRecovery || !INTERRUPTS_CAN_BE_PROCESSED());

> > > Subject: [PATCH v2 2/2] smgr: Make SMgrRelation initialization safer against
> > > errors
> > >
> > > In case the smgr_open callback failed, the ->pincount field would not be
> > > initialized and the relation would not be put onto the unpinned_relns list.
> > >
> > > This buglet was introduced in 21d9c3ee4ef7. As that commit is only in HEAD, no
> > > need to backpatch.
> > >
> > > Discussion: https://postgr.es/m/3vae7l5ozvqtxmd7rr7zaeq3qkuipz365u3rtim5t5wdkr6f4g@vkgf2fogjirl
> > > ---
> > > src/backend/storage/smgr/smgr.c | 6 +++---
> > > 1 file changed, 3 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
> > > index 53a09fe4aaa..24971304b85 100644
> > > --- a/src/backend/storage/smgr/smgr.c
> > > +++ b/src/backend/storage/smgr/smgr.c
> > > @@ -255,12 +255,12 @@ smgropen(RelFileLocator rlocator, ProcNumber backend)
> > > reln->smgr_cached_nblocks[i] = InvalidBlockNumber;
> > > reln->smgr_which = 0; /* we only have md.c at present */
> > >
> > > - /* implementation-specific initialization */
> > > - smgrsw[reln->smgr_which].smgr_open(reln);
> > > -
> > > /* it is not pinned yet */
> > > reln->pincount = 0;
> > > dlist_push_tail(&unpinned_relns, &reln->node);
> > > +
> > > + /* implementation-specific initialization */
> > > + smgrsw[reln->smgr_which].smgr_open(reln);
> > > }
> >
> > I struggle to speculate about the merits of this, because mdopen() can't fail.
> > If mdopen() started to do things that could fail, mdnblocks() would be
> > reasonable in assuming those things are done. Hence, the long-term direction
> > should be more like destroying the new smgr entry in the event of error.
> >
> > I would not make this change. I'd maybe add a comment that smgr_open
> > callbacks currently aren't allowed to fail, since smgropen() isn't ready to
> > clean up smgr-level state from a failed open. How do you see it?
>
> I see no disadvantage in the change - it seems strictly better to initialize
> pincount earlier. I agree that it might be a good idea to explicitly handle
> errors eventually, but that'd not be made harder by this change...

Okay. I suppose if mdopen() gained the ability to fail and mdnblocks() also
gained the ability to cure said failure, this change would make that okay.

In response to

Re: md.c vs elog.c vs smgrreleaseall() in barrier at 2025-03-19 22:45:20 from Andres Freund

Responses

Re: md.c vs elog.c vs smgrreleaseall() in barrier at 2025-03-20 19:53:11 from Andres Freund

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	David G. Johnston	2025-03-20 00:49:49	Re: Make COPY format extendable: Extract COPY TO format implementations
Previous Message	Bruce Momjian	2025-03-19 23:59:28	Re: [PoC] Federated Authn/z with OAUTHBEARER