pgsql: Paper over regression failures in infinite_recurse() on PPC64 Li

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Paper over regression failures in infinite_recurse() on PPC64 Li
Date: 2020-10-13 21:45:22
Message-ID: E1kSS6k-0003zX-0F@gemulon.postgresql.org
Lists: pgsql-committers

Paper over regression failures in infinite_recurse() on PPC64 Linux.

Our infinite_recurse() test to verify sane stack-overrun behavior
is affected by a bug in the Linux kernel on PPC64: the process will get
SIGSEGV if it receives a signal when the stack depth is (a) over 1MB and
(b) within a few kB of filling the current physical stack allocation.
See https://bugzilla.kernel.org/show_bug.cgi?id=205183.

Since this test is a bit time-consuming and we run it in parallel with
test scripts that do a lot of DDL, it can be expected to get an sinval
catchup interrupt at some point, leading to failure if the timing is
wrong. This has caused more than 100 buildfarm failures over the
past year or so.

While a fix exists for the kernel bug, it might be years before that
propagates into all production kernels, particularly in some of the
older distros we have in the buildfarm. For now, let's just back off
and not run this test on Linux PPC64; that loses nothing in test
coverage so far as our own code is concerned.

To do that, split this test into a new script infinite_recurse.sql
and skip the test when the platform name is powerpc64...-linux-gnu.
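
As a rough illustration of the platform check described above, here is a
minimal Python sketch. It is not the code from the commit: the regular
expression is only one assumed reading of "powerpc64...-linux-gnu", the
helper name should_skip is hypothetical, and the sample strings are made
up in the general shape of what version() reports.

```python
import re

# Assumed pattern: "powerpc64", then any run of non-comma characters,
# then "-linux-gnu". This is a guess at what "powerpc64...-linux-gnu"
# means, not a copy of the test script.
PPC64_LINUX = re.compile(r"powerpc64[^,]*-linux-gnu")


def should_skip(version_string: str) -> bool:
    """Return True if the platform name looks like Linux on PPC64."""
    return PPC64_LINUX.search(version_string) is not None


# Made-up sample strings, for illustration only.
samples = [
    "PostgreSQL 13.0 on powerpc64le-unknown-linux-gnu, compiled by gcc, 64-bit",
    "PostgreSQL 13.0 on x86_64-pc-linux-gnu, compiled by gcc, 64-bit",
    "PostgreSQL 13.0 on powerpc64-unknown-freebsd12.1, compiled by clang, 64-bit",
]

for s in samples:
    print(should_skip(s), s)
```

Excluding commas in the middle of the pattern keeps the match inside the
platform-name field, so text later in the string (such as the compiler
description) cannot accidentally complete the match.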

Back-patch to v12. Branches before that have not been seen to get
this failure. No doubt that's because the "errors" test was not
run in parallel with other tests before commit 798070ec0, greatly
reducing the odds of an sinval catchup being necessary.

I also back-patched 3c8553547 into v12, just so the new regression
script would look the same in all branches having it.

Discussion: https://postgr.es/m/3479046.1602607848@sss.pgh.pa.us
Discussion: https://postgr.es/m/20190723162703.GM22387%40telsasoft.com

Branch
------
REL_13_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/855b6f287100f3eab24df0a83998db251ac4fd09

Modified Files
--------------
src/test/regress/expected/errors.out             | 10 --------
src/test/regress/expected/infinite_recurse.out   | 24 ++++++++++++++++++++
src/test/regress/expected/infinite_recurse_1.out | 16 +++++++++++++
src/test/regress/parallel_schedule               |  2 +-
src/test/regress/serial_schedule                 |  1 +
src/test/regress/sql/errors.sql                  |  9 --------
src/test/regress/sql/infinite_recurse.sql        | 29 ++++++++++++++++++++++++
7 files changed, 71 insertions(+), 20 deletions(-)
