Re: stress test for parallel workers

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Mark Wong <mark(at)2ndquadrant(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: stress test for parallel workers
Date: 2019-10-13 00:06:32
Message-ID: 25033.1570925192@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> In short, my current belief is that Linux PPC64 fails when trying
> to deliver a signal if there's right around 2KB of stack remaining,
> even though it should be able to expand the stack and press on.

I figured I should try to remove some variables from the equation
by demonstrating this claim without involving Postgres. The attached
test program eats some stack space and then waits to be sent SIGUSR1.
For some values of "some stack space", it dumps core:

[tgl(at)postgresql-fedora ~]$ gcc -g -Wall -O1 stacktest.c
[tgl(at)postgresql-fedora ~]$ ./a.out 1240000 &
[1] 11796
[tgl(at)postgresql-fedora ~]$ kill -USR1 11796
[tgl(at)postgresql-fedora ~]$ signal delivered, stack base 0x7fffdc160000 top 0x7fffdc031420 (1240032 used)

[1]+ Done ./a.out 1240000
[tgl(at)postgresql-fedora ~]$ ./a.out 1242000 &
[1] 11797
[tgl(at)postgresql-fedora ~]$ kill -USR1 11797
[tgl(at)postgresql-fedora ~]$
[1]+ Segmentation fault (core dumped) ./a.out 1242000
[tgl(at)postgresql-fedora ~]$ uname -a
Linux postgresql-fedora.novalocal 4.18.19-100.fc27.ppc64le #1 SMP Wed Nov 14 21:53:32 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

I don't think any further proof is required that this is
a kernel bug. Where would be a good place to file it?

regards, tom lane

Attachment Content-Type Size
stacktest.c text/x-c 1.0 KB
