From: | Mark Wong <mark(at)2ndQuadrant(dot)com> |
---|---|
To: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Why is infinite_recurse test suddenly failing? |
Date: | 2019-05-14 15:31:37 |
Message-ID: | 20190514153137.GC10216@2ndQuadrant.com |
Lists: | pgsql-hackers |
On Fri, May 10, 2019 at 05:26:43PM -0400, Andrew Dunstan wrote:
>
> On 5/10/19 3:35 PM, Tom Lane wrote:
> > Andres Freund <andres(at)anarazel(dot)de> writes:
> >> On 2019-05-10 11:38:57 -0400, Tom Lane wrote:
> >>> I am wondering if, somehow, the stack depth limit seen by the postmaster
> >>> sometimes doesn't apply to its children. That would be pretty wacko
> >>> kernel behavior, especially if it's only intermittently true.
> >>> But we're running out of other explanations.
> >> I wonder if this is a SIGSEGV that actually signals an OOM
> >> situation. Linux, if it can't actually extend the stack on-demand due to
> >> OOM, sends a SIGSEGV. The signal has that information, but
> >> unfortunately the buildfarm code doesn't print it. p $_siginfo would
> >> show us some of that...
> >> Mark, how tight is the memory on that machine? Does dmesg have any other
> >> information (often segfaults are logged by the kernel with the code
> >> IIRC).
> > It does sort of smell like a resource exhaustion problem, especially
> > if all these buildfarm animals are VMs running on the same underlying
> > platform. But why would that manifest as "you can't have a measly two
> > megabytes of stack" and not as any other sort of OOM symptom?
> >
> > Mark, if you don't mind modding your local copies of the buildfarm
> > script, I think what Andres is asking for is a pretty trivial addition
> > in PGBuild/Utils.pm's sub get_stack_trace:
> >
> > my $cmdfile = "./gdbcmd";
> > my $handle;
> > open($handle, '>', $cmdfile) || die "opening $cmdfile: $!";
> > print $handle "bt\n";
> > + print $handle "p $_siginfo\n";
> > close($handle);
> >
> >
>
>
> I think we'll need to write that as:
>
>
> print $handle 'p $_siginfo',"\n";
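
The reason for the quoting change above: inside a double-quoted Perl string, `$_siginfo` is treated as a Perl variable to interpolate, so gdb would never see the literal text. Here is a minimal standalone sketch of the difference. The helper name `gdb_commands` is hypothetical and is not part of the buildfarm code; only the two gdb commands come from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not from PGBuild/Utils.pm): builds the text that
# would be written to the gdb command file.
sub gdb_commands
{
    # "bt" has no sigil in it, so double quotes are harmless here.
    my @lines = ("bt\n");

    # Single quotes keep $_siginfo literal, so gdb receives the name of
    # its own convenience variable. The newline is appended separately
    # because "\n" is only an escape inside double quotes.
    push @lines, 'p $_siginfo' . "\n";

    return join('', @lines);
}

print gdb_commands();
```

Escaping the sigil inside double quotes, as in `"p \$_siginfo\n"`, should give the same literal text; the single-quoted form is the one suggested in the thread.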
Ok, I have added this to all of my buildfarm animals now.
I think I also have caught up on this thread, but let me know if I
missed anything.
Regards,
Mark
--
Mark Wong
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/