Re: Improve pg_re_throw: check if sigjmp_buf is valid and report error

From: Xiaoran Wang <fanfuxiaoran(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improve pg_re_throw: check if sigjmp_buf is valid and report error
Date: 2024-08-20 03:39:42
Message-ID: CAGjhLkNZjeR99DZ5oaHk0p2BzH=okzOLsBb-+vwiGakRtffqFA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Xiaoran Wang <fanfuxiaoran(at)gmail(dot)com> 于2024年8月20日周二 11:32写道:

>
>
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> 于2024年8月19日周一 22:12写道:
>
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> > On Mon, Aug 19, 2024 at 2:17 AM Xiaoran Wang <fanfuxiaoran(at)gmail(dot)com>
>> wrote:
>> >> If the code in PG_TRY contains any non local control flow other than
>> >> ereport(ERROR) like goto, break etc., the PG_CATCH or PG_END_TRY cannot
>> >> be called, then the PG_exception_stack will point to the memory whose
>> >> stack frame has been released. So after that, when the pg_re_throw
>> >> called, __longjmp() will crash and report Segmentation fault error.
>> >>
>> >> Addition to sigjmp_buf, add another field 'int magic' which is next to
>> >> the sigjum_buf in the local stack frame memory. The magic's value is
>> always
>> >> 'PG_exception_magic 0x12345678'. And in 'pg_re_throw' routine, check if
>> >> the magic's value is still '0x12345678', if not, that means the memory
>> >> where the 'PG_exception_stack' points to has been released, and the
>> 'sigbuf'
>> >> must be invalid.
>>
>> > This is an interesting idea. I suspect if we do this we want a
>> > different magic number and a different error message than what you
>> > chose here, but those are minor details.
>>
>> I would suggest just adding an Assert; I doubt this check would be
>> useful in production.
>>
>
> Agree, an Assert is enough.
>
>>
>> > I'm not sure how reliable this would be at actually finding problems,
>> > though.
>>
>> Yeah, that's the big problem. I don't have any confidence at all
>> that this would detect misuse. It'd require that the old stack
>> frame gets overwritten, which might not happen for a long time,
>> and it'd require that somebody eventually do a longjmp, which again
>> might not happen for a long time --- and when those did happen, the
>> error would be detected in someplace far away from the actual bug,
>> with little evidence remaining to help you localize it.
>>
>
> Exactly, it cannot tell you which PG_TRY left the invalid sigjmp_buf,
> but to implement that is easy I think, recording the line num maybe.
>
> I think this is still useful, at least it tell you that the error is due
> to the
> PG_TRY.
>
>>
>> Also, as soon as some outer level of PG_TRY is exited in the proper
>> way, the dangling stack pointer will be fixed up. That means there's
>> a fairly narrow time frame in which the overwrite and longjmp must
>> happen for this to catch a bug.
>>
>>
> Yes, if the outer level PG_TRY call pg_re_throw instead of ereport,
> the dangling stack pointer will be fixed up. It's would be great to fix
> that
> up in any case. But I have no idea how to implement that now.
>
>
Sorry, "Yes, if the outer level PG_TRY call pg_re_throw instead of
ereport, " should
be "Yes, if the outer level PG_TRY reset the PG_exception_stack."

In pg_re_throw, if we could use '_local_sigjmp_buf' instead of the
> global var PG_exception_stack, that would be great. We don't
> need to worry about the invalid sigjum_buf.
>
> So on the whole I doubt this'd be terribly useful in this form;
>> and I don't like the amount of code churn required.
>>
>> > Personally, I don't think I've ever made this particular mistake. I
>> > think a much more common usage error is exiting the catch-block
>> > without either throwing an error or rolling back a subtransaction. But
>> > YMMV, of course.
>>
>> We have had multiple instances of code "return"ing out of a PG_TRY,
>> so I fully agree that some better way to detect that would be good.
>> But maybe we ought to think about static analysis for that.
>>
>> regards, tom lane
>>
>
>
> --
> Best regards !
> Xiaoran Wang
>

--
Best regards !
Xiaoran Wang

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-08-20 03:44:03 Re: Improve pg_re_throw: check if sigjmp_buf is valid and report error
Previous Message jian he 2024-08-20 03:37:53 Re: documentation structure