Re: PATCH: backtraces for error messages

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Álvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: backtraces for error messages
Date: 2018-06-25 09:08:41
Message-ID: CAMsr+YGdJTGygmGhRhfn+8DLi6o+Teq+tcA-Dr3kK+8vYqwzCA@mail.gmail.com
Lists: pgsql-hackers

On 25 June 2018 at 14:21, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:

> Hi.
>
> At Mon, 25 Jun 2018 09:32:36 +0800, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote in <CAMsr+YGBw9tgKRGxyihVeMzmjQx_2t8D17tE7t5-0gMdW7S6UA(at)mail(dot)gmail.com>
> > On 21 June 2018 at 19:09, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> >
> > > I think this is no problem for assertion failures, but I'm not sure
> > > about other cases.
> >
> >
> > I think it's pretty strongly desirable for PANIC.
>
> Ah, I forgot about that. I agree with that. The cost of collecting the
> information is not a problem on PANIC. Still, I don't think a stack
> trace is valuable for every PANIC message. I can accept a GUC to
> control it, but it would be preferable if this worked well without
> such extra setup.
>

Places such as?

>
> > > We could set proper context description or other
> > > additional information in error messages before just dumping a
> > > trace for known cases.
> > >
> >
> > Yeah. The trouble there is that there are a _lot_ of places to touch for
> > such things, and inevitably the one you really want to see will be
> > something that didn't get suitably annotated.
>
> Agreed, that is the reality. Instead, can't we add new error
> classes PANIC_STACKDUMP and ERROR_STACKDUMP to explicitly
> restrict stack dumps in elog()? Or elog_stackdump() alongside elog()
> would also be fine with me. Using those is easier than proper
> annotation. It would be perfect if we could invent an automated
> way, but I don't think that is realistic.
>

That needlessly complicates error severity levels with information not
really related to the severity. -1 from me.
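For what it's worth, as I read the suggestion it amounts to roughly the sketch below. The names here (elog_stackdump, errbacktrace_on_next) are purely hypothetical, they exist neither in PostgreSQL nor in the patch, and every call site would still need manual conversion:

    /* Hypothetical illustration only; these names do not exist anywhere. */
    extern bool errbacktrace_on_next;   /* consulted and reset by the error machinery */

    #define elog_stackdump(elevel, ...) \
        do { \
            errbacktrace_on_next = true; \
            elog(elevel, __VA_ARGS__); \
        } while (0)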

> Mmm. If I understand you correctly, my point is that perf doesn't dump
> a backtrace at a probe point by itself, but tracepoints can be used to
> take a symbolic backtrace. (I'm sorry that I cannot provide an example,
> since stap doesn't work on my box.)
>

perf record --call-graph dwarf -e sdt_postgresql:checkpoint__start -u postgres
perf report -g
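(Assuming PostgreSQL was built with --enable-dtrace so the SDT markers are present: the first command records a DWARF-unwound call stack each time the checkpoint__start probe fires in the given user's backends, and the second presents those stacks as a symbolic call graph.)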

> If your intention is to take backtraces without any setup (which I
> think is preferable), it should be restricted to the points where they
> are required. That can be achieved with the additional error classes or
> substitute error output functions.
>

Who's classifying all the possible points?

Which PANICs or assertion failures do you want to exempt?

I definitely do not want to emit stacks for everything, like my patch
currently does. It's just a proof of concept. Later on I'll want fine-grained
control at runtime over when that happens, but that's out of scope for
this. For now the goal is to emit stacks at times when it's obviously
sensible to have a stack, and to do it in a way that doesn't require
per-error-site maintenance/changes or create backport hassle.
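To make the mechanism concrete: this is only a standalone sketch, not code from the patch (which may well use a different unwinder such as libunwind), but capturing and printing a symbolic stack with glibc's backtrace()/backtrace_symbols() looks roughly like this; link with -rdynamic to get useful symbol names:

    /* Standalone illustration of backtrace capture; not code from the patch. */
    #include <execinfo.h>
    #include <stdio.h>
    #include <stdlib.h>

    static void
    print_stack(void)
    {
        void   *frames[32];
        int     nframes = backtrace(frames, 32);
        char  **symbols = backtrace_symbols(frames, nframes);
        int     i;

        if (symbols == NULL)
            return;
        for (i = 0; i < nframes; i++)
            fprintf(stderr, "\t%s\n", symbols[i]);
        free(symbols);
    }

    int
    main(void)
    {
        fprintf(stderr, "simulated PANIC, stack follows:\n");
        print_stack();
        return 0;
    }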

> Just an idea, but can't we use a definition file in which the
> LOCATIONs of error messages that need to dump a backtrace are
> listed? That list would usually be empty and should be very short if
> not. The LOCATION information is easily obtained from a verbose
> error message once it has been shown, but is a bit hard to find
> otherwise.
>

That's again veering into selective logging control territory. Rather than
doing it for stack dump control only, it should be part of a broader
control over dynamic and scoped verbosity, selective logging, and log
options, like Pavan raised. I see stacks as just one knob that can be
turned on/off here.
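To illustrate the sort of knob I have in mind, a filter keyed on function names could look roughly like the hypothetical sketch below; the names and the list format are invented for illustration and nothing like this is in the patch:

    /* Hypothetical sketch; these names are invented and not in PostgreSQL. */
    #include <stdbool.h>
    #include <string.h>

    /* imagine this being set from a configuration variable */
    static const char *stackdump_function_list = "XLogFlush,heap_update";

    static bool
    should_dump_stack(const char *funcname)
    {
        const char *p = stackdump_function_list;
        size_t      len = strlen(funcname);

        if (len == 0)
            return false;
        while ((p = strstr(p, funcname)) != NULL)
        {
            bool    starts_ok = (p == stackdump_function_list || p[-1] == ',');
            bool    ends_ok = (p[len] == '\0' || p[len] == ',');

            if (starts_ok && ends_ok)
                return true;
            p += len;
        }
        return false;
    }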

> > (That reminds me, I need to chat with Devrim about creating a longer-lived
> > debuginfo + old-versions rpm repo for Pg itself, if not for the accessory
> > bits and pieces. I'm constantly frustrated by not being able to get the
> > debuginfo packages needed to investigate some core or running-system
> > problem because they've been purged from the PGDG yum repo as soon as a
> > new version comes out.)
>
> We in our department take care to preserve them ourselves, out of
> the necessity of supporting older systems. I sometimes feel it would
> be very helpful if they were available in the official repositories.
>

Maybe if I can get some interest in that, you might be willing to
contribute your archives as a starter so we have them for back-versions?

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
