From: | Noah Misch <noah(at)leadboat(dot)com> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler? |
Date: | 2022-12-31 05:36:02 |
Message-ID: | 20221231053602.GB1565918@rfd.leadboat.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, Dec 31, 2022 at 10:06:53AM +1300, Thomas Munro wrote:
> On Thu, Dec 29, 2022 at 9:40 PM Noah Misch <noah(at)leadboat(dot)com> wrote:
> > On Tue, Jul 26, 2022 at 07:22:52PM -0400, Tom Lane wrote:
> > > Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> > > > ... The regular expression machinery is capable of
> > > > consuming a lot of CPU, and does CANCEL_REQUESTED(nfa->v->re)
> > > > frequently to avoid getting stuck. With the patch as it stands, that
> > > > would never be true.
> > >
> > > Surely that can't be too hard to fix. We might have to refactor
> > > the code around QueryCancelPending a little bit so that callers
> > > can ask "do we need a query cancel now?" without actually triggering
> > > a longjmp ... but why would that be problematic?
> >
> > It could work. The problems are like those of making code safe to run in a
> > signal handler. You can use e.g. snprintf in rcancelrequested(), but you
> > still can't use palloc() or ereport(). I see at least these strategies:
> >
> > 1. Accept that recovery conflict checks run after a regex call completes.
> > 2. Have rcancelrequested() return true unconditionally if we need a conflict
> > check. If there's no actual conflict, restart the regex.
> > 3. Have rcancelrequested() run the conflict check, including elog-using
> > PostgreSQL code. On longjmp(), accept the leak of regex mallocs.
> > 4. Have rcancelrequested() run the conflict check, including elog-using
> > PostgreSQL code. On longjmp(), escalate to FATAL.
> > 5. Write the conflict check code to dutifully avoid longjmp().
> > 6. Convert src/backend/regex to use palloc, so longjmp() is fine.
>
> Thanks! I appreciate the help getting unstuck here. I'd thought
> about some of these but not all. I also considered a couple more:
>
> 7. Do a CFI() in a try/catch if INTERRUPTS_PENDING_CONDITION() is
> true, and copy the error somewhere to be re-thrown later after the
> regexp code exits with REG_CANCEL.
> 8. Do a CFI() in a try/catch if INTERRUPTS_PENDING_CONDITION() is
> true, and call a new regexp function that will free everything before
> re-throwing.
>
> After Tom's response I spent some time trying to figure out how to
> make a SOFT_CHECK_FOR_INTERRUPTS(), which would return a value to
> indicate that it would like to throw. I think it would need to re-arm
> various flags and introduce a programming rule for all interrupt
> processing routines that if they fired once under a soft check they
> must fire again later under a non-soft check. That all seems a bit
> complicated, and a general mechanism like that seemed like overkill
> for a single user, which led me to idea #7.
>
> Idea #8 is a realisation that twisting oneself into a pretzel to avoid
> having to change the regexp code or its REG_CANCEL control flow may be
> a bit silly. If the only thing it really needs to do is free some
> memory, maybe the regexp module should provide a function that frees
> everything that is safe to call from our rcancelrequested callback, so
> we can do so before we longjmp back to Kansas. Then the REG_CANCEL
> code paths would be effectively unreachable in PostgreSQL. I don't
> know if it's better or worse than your idea #6, "use palloc instead,
> it already has garbage collection, duh", but it's a different take on
> the same realisation that this is just about free().
PG_TRY() isn't free, so it's nice that (6) doesn't add one. If (6) fails in
some not-yet-revealed way, (8) could get more relevant.
> I guess idea #6 must be pretty easy to try: just point that MALLOC()
> macro to palloc(), and do a plain old CFI() in rcancelrequested().
> Why do you suggest #3 as an interim measure?
No strong reason. I think I suggested it because it's a strict subset of (6),
but I didn't think through in detail. (I've never modified src/backend/regex
and have barely read its code, for whatever that's worth.)
> Do we fear that palloc() might hurt regexp performance?
Nah. I don't recall any place in PostgreSQL where performance is an argument
for raw malloc() calls.
> > Incidentally, the affected test
> > contains comment "# DROP TABLE containing block which standby has in a pinned
> > buffer". The standby holds no pin at that moment; the LOCK TABLE pins system
> > catalog pages, but it drops every pin it acquires.
>
> Oh, I guess the comment is just wrong? There are earlier sections
> concerned with buffer pins, but the section "RECOVERY CONFLICT 3" is
> about locks.
Yes.
From | Date | Subject | |
---|---|---|---|
Next Message | Noah Misch | 2022-12-31 06:16:40 | Re: allowing for control over SET ROLE |
Previous Message | jian he | 2022-12-31 05:09:10 | Re: Infinite Interval |