Re: Several buildfarm animals fail tests because of shared memory error

From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: Robins Tharakan <tharakan(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Several buildfarm animals fail tests because of shared memory error
Date: 2025-01-09 05:00:01
Message-ID: 35d87371-f3ab-42c8-9aac-bb39ab5bd987@gmail.com
Lists: pgsql-hackers

Hello Robins,

22.12.2024 09:27, Robins Tharakan wrote:
> - The only info about leafhopper may be relevant is that it's one of the newest machines (Graviton4) so it comes with
> a recent hardware / kernel / stock gcc 11.4.1.
>

Could you please take a look at leafhopper, which is producing weird test
failures rather often? For example,
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=leafhopper&dt=2024-12-16%2023%3A43%3A03 - REL_15_STABLE
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=leafhopper&dt=2024-12-21%2022%3A18%3A04 - REL_16_STABLE
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=leafhopper&dt=2025-01-02%2009%3A21%3A04 - REL_17_STABLE

--- /home/bf/proj/bf/build-farm-17/REL_16_STABLE/pgsql.build/src/test/regress/expected/select_parallel.out 2024-12-21 22:18:03.844773742 +0000
+++ /home/bf/proj/bf/build-farm-17/REL_16_STABLE/pgsql.build/src/test/recovery/tmp_check/results/select_parallel.out 2024-12-21 22:23:28.264849796 +0000
@@ -551,7 +551,7 @@
    ->  Nested Loop (actual rows=98000 loops=1)
          ->  Seq Scan on tenk2 (actual rows=10 loops=1)
                Filter: (thousand = 0)
-               Rows Removed by Filter: 9990
+               Rows Removed by Filter: 9395

Or:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=leafhopper&dt=2024-12-18%2023%3A35%3A04 - master
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=leafhopper&dt=2025-01-02%2009%3A22%3A04 - master
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=leafhopper&dt=2025-01-08%2007%3A38%3A03 - master
#   Failed test 'regression tests pass'
#   at t/027_stream_regress.pl line 95.
#          got: '256'
#     expected: '0'
# Looks like you failed 1 test of 9.
[23:42:59] t/027_stream_regress.pl ...............
...
diff -U3 /home/bf/proj/bf/build-farm-17/HEAD/pgsql.build/src/test/regress/expected/memoize.out /home/bf/proj/bf/build-farm-17/HEAD/pgsql.build/src/test/recovery/tmp_check/results/memoize.out
--- /home/bf/proj/bf/build-farm-17/HEAD/pgsql.build/src/test/regress/expected/memoize.out 2024-12-18 23:35:04.318987642 +0000
+++ /home/bf/proj/bf/build-farm-17/HEAD/pgsql.build/src/test/recovery/tmp_check/results/memoize.out 2024-12-18 23:42:24.806028990 +0000
@@ -179,7 +179,7 @@
                Hits: 980  Misses: 20  Evictions: Zero  Overflows: 0  Memory Usage: NkB
                ->  Seq Scan on tenk1 t2 (actual rows=1 loops=N)
                      Filter: ((t1.twenty = unique1) AND (t1.two = two))
-                     Rows Removed by Filter: 9999
+                     Rows Removed by Filter: 9775
 (12 rows)

Maybe you could try to reproduce such failures without the buildfarm client,
just by running select_parallel repeatedly, for example, with the attached
patch applied. I mean running `make check` with a parallel_schedule like:
...
# ----------
# Run these alone so they don't run out of parallel workers
# select_parallel depends on create_misc
# ----------
test: select_parallel
test: select_parallel
test: select_parallel
....
(e.g. with 100 repetitions)
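
The repeated schedule lines do not have to be typed by hand. A minimal sketch
of generating them follows; the function name repeat_test_lines is made up for
illustration, and where the output gets pasted into parallel_schedule is left
to whoever runs it:

```shell
#!/bin/sh
# Hypothetical helper (not part of PostgreSQL): print "test: <name>"
# <count> times, one schedule line per repetition.
repeat_test_lines() {
    name=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        printf 'test: %s\n' "$name"
        i=$((i + 1))
    done
}

# Example: three repetitions, printed to stdout.
repeat_test_lines select_parallel 3
```

Calling it with 100 instead of 3 gives the block of lines for the
100-repetition case mentioned above.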

Or
TESTS="test_setup copy create_misc create_index $(printf "select_parallel %.0s" {1..100})" make check-tests
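
For anyone puzzled by the printf part of that command, here is a small sketch
of the mechanism. printf reuses its format string once per argument, and
"%.0s" consumes an argument while printing zero characters of it, so only the
literal text is emitted each time. The arguments are spelled out here so the
snippet also runs in a plain POSIX sh; in bash, {1..3} expands to the same
three arguments. The variable name repeated is just for illustration:

```shell
#!/bin/sh
# Three arguments -> the format is applied three times.
# "%.0s" prints nothing from each argument, leaving only "select_parallel ".
repeated=$(printf 'select_parallel %.0s' 1 2 3)

# Brackets make the trailing space visible.
printf '[%s]\n' "$repeated"
```

The trailing space is harmless in the TESTS list, since the test names are
separated by whitespace.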

Best regards,
Alexander Lakhin
Neon (https://neon.tech)

Attachment: select_parallel-repeatable.patch (text/x-patch, 798 bytes)
