Re: md.c vs elog.c vs smgrreleaseall() in barrier

From: Noah Misch <noah(at)leadboat(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: md.c vs elog.c vs smgrreleaseall() in barrier
Date: 2025-03-20 20:16:44
Message-ID: 20250320201644.3d.nmisch@google.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 20, 2025 at 03:53:11PM -0400, Andres Freund wrote:
> I updated the patch with the following changes:
>
> - Remove the assertion from smgrtruncate() - it would need to assert that it's
> called in a critical section.
>
> Not sure why it's not already asserting that?
>
> The function header says:
> * ... This function should normally
> * be called in a critical section, but the current size must be checked
> * outside the critical section, and no interrupts or smgr functions relating
> * to this relation should be called in between.
>
> The "should normally" is bit weird imo, when would it be safe to *not* use
> it in a critical section?

I expect it would be okay in recovery, which is a crypto-critical-section
IIRC. All callers, including smgr_redo(), do have an explicit critical
section around the call. Hence, I gather we're no longer relying on any
exceptions to this one's need for a critical section.

> - added comments about the reason for HOLD_INTERRUPTS to smgrdounlinkall(),
> smgrdestroyall() and smgrreleaseall()

Perfect.

> I still am leaning against backpatching, but I'm not sure that's not just
> laziness.

It's also some risk reduction. One of these smgr APIs might have a useful
interruptibility that we're now blocking. (I'm not aware of one.)

> On 2025-03-19 17:45:14 -0700, Noah Misch wrote:
> > On Wed, Mar 19, 2025 at 06:45:20PM -0400, Andres Freund wrote:
> > > On 2025-03-19 12:55:53 -0700, Noah Misch wrote:
> > > > On Mon, Mar 17, 2025 at 07:52:02PM -0400, Andres Freund wrote:
> > > > > @@ -434,6 +481,8 @@ smgrdosyncall(SMgrRelation *rels, int nrels)
> > > > > if (nrels == 0)
> > > > > return;
> > > > >
> > > > > + HOLD_INTERRUPTS();
> > > > > +
> > > > > FlushRelationsAllBuffers(rels, nrels);
> > > >
> > > > FlushRelationsAllBuffers() isn't part of smgr or md.c, so it's unlikely to
> > > > become sensitive to smgrrelease(). It may do a ton of I/O. Hence, I'd
> > > > HOLD_INTERRUPTS() after FlushRelationsAllBuffers(), not before.
> > >
> > > Hm - we never would want to process interrupts while in
> > > FlushRelationsAllBuffers() or such, would we? I'm ok with changing it, I
> > > guess I just didn't see a reason not to use a wider scope.
> >
> > If we get a query cancel or fast shutdown, it's better for the user to abort
> > the transaction rather than keep flushing. smgrDoPendingSyncs() calls here
> > rather late in the pre-commit actions, so failing is still supposed to be
> > fine. I think the code succeeds at making it fine to fail here.
>
> But we don't actually intentionally accept interrupts in
> FlushRelationsAllBuffers()?

Yes. It would be reasonable for future work to add that.

> It would only happen as a side-effect of a
> non-error elog/ereport() processing interrupts, right?

Likely yes.

> It also looks like we couldn't accept interrupts when called by
> AbortTransaction(), because there we already are in a HOLD_INTERRUPTS()
> region. I'm pretty sure an error would trigger at least an assertion. But
> that's really an independent issue.

The only smgrdosyncall() caller is smgrDoPendingSyncs(), which doesn't call it
in the abort case. So I think we're good.

> Moved.

Thanks.

> > > I suspect it's always called with interrupts held already though.
> >
> > Ah, confirmed. If I put this assert at the top of smgrdounlinkall(),
> > check-world passes:
> >
> > Assert(IsBinaryUpgrade || InRecovery || !INTERRUPTS_CAN_BE_PROCESSED());
>
> I just made it hold interrupts for now, hope that makes sense?

Yep.

Patch looks perfect.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2025-03-20 20:23:13 Re: optimize file transfer in pg_upgrade
Previous Message Robert Haas 2025-03-20 20:10:32 Re: Have postgres.bki self-identify