From: | Xiaoran Wang <fanfuxiaoran(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Improve pg_re_throw: check if sigjmp_buf is valid and report error |
Date: | 2024-08-20 03:32:40 |
Message-ID: | CAGjhLkNa_bru6HGjQ3o33eXZLR4nkQJLrHfqtwjMuY2NjZKfoA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> 于2024年8月19日周一 22:12写道:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > On Mon, Aug 19, 2024 at 2:17 AM Xiaoran Wang <fanfuxiaoran(at)gmail(dot)com>
> wrote:
> >> If the code in PG_TRY contains any non local control flow other than
> >> ereport(ERROR) like goto, break etc., the PG_CATCH or PG_END_TRY cannot
> >> be called, then the PG_exception_stack will point to the memory whose
> >> stack frame has been released. So after that, when the pg_re_throw
> >> called, __longjmp() will crash and report Segmentation fault error.
> >>
> >> Addition to sigjmp_buf, add another field 'int magic' which is next to
> >> the sigjum_buf in the local stack frame memory. The magic's value is
> always
> >> 'PG_exception_magic 0x12345678'. And in 'pg_re_throw' routine, check if
> >> the magic's value is still '0x12345678', if not, that means the memory
> >> where the 'PG_exception_stack' points to has been released, and the
> 'sigbuf'
> >> must be invalid.
>
> > This is an interesting idea. I suspect if we do this we want a
> > different magic number and a different error message than what you
> > chose here, but those are minor details.
>
> I would suggest just adding an Assert; I doubt this check would be
> useful in production.
>
Agree, an Assert is enough.
>
> > I'm not sure how reliable this would be at actually finding problems,
> > though.
>
> Yeah, that's the big problem. I don't have any confidence at all
> that this would detect misuse. It'd require that the old stack
> frame gets overwritten, which might not happen for a long time,
> and it'd require that somebody eventually do a longjmp, which again
> might not happen for a long time --- and when those did happen, the
> error would be detected in someplace far away from the actual bug,
> with little evidence remaining to help you localize it.
>
Exactly, it cannot tell you which PG_TRY left the invalid sigjmp_buf,
but to implement that is easy I think, recording the line num maybe.
I think this is still useful, at least it tell you that the error is due
to the
PG_TRY.
>
> Also, as soon as some outer level of PG_TRY is exited in the proper
> way, the dangling stack pointer will be fixed up. That means there's
> a fairly narrow time frame in which the overwrite and longjmp must
> happen for this to catch a bug.
>
>
Yes, if the outer level PG_TRY call pg_re_throw instead of ereport,
the dangling stack pointer will be fixed up. It's would be great to fix
that
up in any case. But I have no idea how to implement that now.
In pg_re_throw, if we could use '_local_sigjmp_buf' instead of the
global var PG_exception_stack, that would be great. We don't
need to worry about the invalid sigjum_buf.
So on the whole I doubt this'd be terribly useful in this form;
> and I don't like the amount of code churn required.
>
> > Personally, I don't think I've ever made this particular mistake. I
> > think a much more common usage error is exiting the catch-block
> > without either throwing an error or rolling back a subtransaction. But
> > YMMV, of course.
>
> We have had multiple instances of code "return"ing out of a PG_TRY,
> so I fully agree that some better way to detect that would be good.
> But maybe we ought to think about static analysis for that.
>
> regards, tom lane
>
--
Best regards !
Xiaoran Wang
From | Date | Subject | |
---|---|---|---|
Next Message | jian he | 2024-08-20 03:37:53 | Re: documentation structure |
Previous Message | jian he | 2024-08-20 03:18:00 | Re: documentation structure |