Re: Should we add a compiler warning for large stack frames?

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Should we add a compiler warning for large stack frames?
Date: 2024-04-12 02:15:22
Message-ID: 20240412021522.auey5gkugc4ls5do@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2024-04-11 15:07:11 -0700, Andres Freund wrote:
> On 2024-04-11 16:35:58 -0400, Tom Lane wrote:
> > Indeed. I recall reading, not long ago, some Linux kernel docs to the
> > effect that automatic stack growth is triggered by a reference into
> > the page just below what is currently mapped as your stack, and
> > therefore allocating a stack frame greater than one page has the
> > potential to cause SIGSEGV rather than the desired stack extension.
> > (If you feel like digging in the archives, I think this was brought
> > up in the last round of lets-add-some-more-check_stack_depth-calls.)
>
> I think it's more than a single page, but I'm not entirely sure either. I
> think some compilers inject artificial stack accesses when extending the stack
> by a lot, but I don't remember the details.
>
> There certainly was the issue that strict memory overcommit does not reliably
> work with larger stack extensions.
>
> Could be worth writing a test program for...

It looks like it's a mess.

In the good cases the kernel doesn't map anything within ulimit -s of the
stack, and the stack is extended whenever memory in that range is accessed.
Nothing is mapped into that region unless MAP_FIXED is used.
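
For anyone who wants to check the size of that reserved range from a script:
a minimal sketch, assuming a Unix system where Python's resource module is
available. The helper name stack_limit_bytes is made up for illustration. It
reads the soft RLIMIT_STACK, which is the limit the shell reports via
ulimit -s.

```python
import resource


def stack_limit_bytes():
    """Return the soft stack limit in bytes, or None if it is unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)
    if soft == resource.RLIM_INFINITY:
        return None
    return soft


limit = stack_limit_bytes()
# bash's "ulimit -s" prints this value in units of 1024 bytes.
print("unlimited" if limit is None else f"{limit // 1024} KiB")
```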

However, in some cases Linux maps the heap and the stack fairly close to each
other at program startup. I've observed this with an executable compiled with
-static-pie and executed with randomization disabled (via setarch -R). In
that case the layout at program start is

...
7ffff7fff000-7ffff8021000 rw-p 00000000 00:00 0 [heap]
7ffffffdd000-7ffffffff000 rw-p 00000000 00:00 0 [stack]

Here the start of the heap and the end of the stack are only 128MB apart. The
heap grows upwards, the stack downwards.
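
The arithmetic can be checked mechanically. A rough sketch, with made-up
helper names (parse_range, find_mapping), that parses the two maps lines
quoted above and computes both the heap-start-to-stack-end distance and the
unmapped room left between the two mappings:

```python
# The two lines quoted from the maps output above.
MAPS = """\
7ffff7fff000-7ffff8021000 rw-p 00000000 00:00 0 [heap]
7ffffffdd000-7ffffffff000 rw-p 00000000 00:00 0 [stack]
"""


def parse_range(line):
    """Return (start, end) of the address range that begins a maps line."""
    start, end = line.split()[0].split("-")
    return int(start, 16), int(end, 16)


def find_mapping(maps_text, name):
    """Return the address range of the first mapping whose last field is name."""
    for line in maps_text.splitlines():
        fields = line.split()
        if fields and fields[-1] == name:
            return parse_range(line)
    raise LookupError(name)


heap_start, heap_end = find_mapping(MAPS, "[heap]")
stack_start, stack_end = find_mapping(MAPS, "[stack]")

span = stack_end - heap_start      # start of heap to end of stack
free_gap = stack_start - heap_end  # unmapped room left for both to grow into

print(f"span: {span // 2**20} MiB")        # prints "span: 128 MiB"
print(f"free gap: {free_gap // 1024} KiB")
```

The free gap is slightly smaller than the span because both mappings already
occupy some of that range at startup.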

Which means that if glibc allocates a bunch of memory via sbrk() and the stack
grows, they clash into each other.

I think this may be a glibc bug. If I compile with musl instead, this doesn't
happen, because musl stops using sbrk() style allocations before stack and
program break get too close to each other.

Greetings,

Andres Freund
