From: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | The 026_overwrite_contrecord test might fail on extremely slow animals |
Date: | 2025-03-30 09:00:00 |
Message-ID: | 9ffdb19a-7a89-424e-925a-dd981c37f0ba@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello hackers,
A couple of buildfarm failures produced by skink, which runs
tests under Valgrind: [1], [2], with the following diagnostics:
31/324 postgresql:recovery / recovery/026_overwrite_contrecord ERROR 325.72s (exit status 255 or signal
127 SIGinvalid)
# Initializing node "standby" from backup "backup" of node "primary"
### Enabling streaming replication for node "standby"
### Starting node "standby"
# Running: pg_ctl -w -D
/home/bf/bf-build/skink-master/HEAD/pgsql.build/testrun/recovery/026_overwrite_contrecord/data/t_026_overwrite_contrecord_standby_data/pgdata
-l
/home/bf/bf-build/skink-master/HEAD/pgsql.build/testrun/recovery/026_overwrite_contrecord/log/026_overwrite_contrecord_standby.log
-o --cluster-name=standby start
waiting for server to start........ stopped waiting
pg_ctl: could not start server
Examine the log output.
# pg_ctl start failed; see logfile for details:
/home/bf/bf-build/skink-master/HEAD/pgsql.build/testrun/recovery/026_overwrite_contrecord/log/026_overwrite_contrecord_standby.log
# No postmaster PID for node "standby"
[14:44:39.051](7.936s) Bail out! pg_ctl start failed
026_overwrite_contrecord_standby.log:
2025-03-14 14:44:38.533 UTC [1558222][startup][:0] LOG: database system was interrupted; last known up at 2025-03-14
14:44:30 UTC
2025-03-14 14:44:38.806 UTC [1558222][startup][:0] LOG: invalid checkpoint record
2025-03-14 14:44:38.808 UTC [1558222][startup][:0] PANIC: could not locate a valid checkpoint record at 0/2094248
2025-03-14 14:44:38.937 UTC [1555866][postmaster][:0] LOG: startup process (PID 1558222) was terminated by signal 6:
Aborted
2025-03-14 14:44:38.937 UTC [1555866][postmaster][:0] LOG: aborting startup due to startup process failure
026_overwrite_contrecord_primary.log:
2025-03-14 14:44:30.536 UTC [1553014][client backend][4/2:0] LOG: statement: SELECT
pg_walfile_name(pg_current_wal_insert_lsn())
2025-03-14 14:44:30.621 UTC [1519069][checkpointer][:0] LOG: checkpoint complete: wrote 109 buffers (85.2%), wrote 3
SLRU buffers; 0 WAL file(s) added, 0 removed, 0 recycled; write=12.404 s, sync=0.013 s, total=12.574 s; sync files=28,
longest=0.004 s, average=0.001 s; distance=8299 kB, estimate=8299 kB; lsn=0/2094248, redo lsn=0/1FA5A48
2025-03-14 14:44:31.119 UTC [1519026][postmaster][:0] LOG: received immediate shutdown request
indicates that this test might fail when a checkpoint performed during the
test execution (that is, when the duration of the test up to
$node->stop('immediate');
exceeds 300 secs).
This failure can be easily reproduced with:
--- a/src/test/recovery/t/026_overwrite_contrecord.pl
+++ b/src/test/recovery/t/026_overwrite_contrecord.pl
@@ -24,2 +24,3 @@ autovacuum = off
wal_keep_size = 1GB
+checkpoint_timeout=30s
));
@@ -61,2 +62,3 @@ $node->safe_psql('postgres',
#$node->safe_psql('postgres', qq{create table foo ()});
+sleep(40);
my $endfile = $node->safe_psql('postgres',
The last 5 runs of the test on skink on master show the following duration of the test:
29/331 postgresql:recovery / recovery/026_overwrite_contrecord OK 317.37s 3 subtests passed
29/331 postgresql:recovery / recovery/026_overwrite_contrecord OK 412.78s 3 subtests passed
29/331 postgresql:recovery / recovery/026_overwrite_contrecord OK 448.46s 3 subtests passed
29/331 postgresql:recovery / recovery/026_overwrite_contrecord OK 445.73s 3 subtests passed
29/331 postgresql:recovery / recovery/026_overwrite_contrecord OK 383.50s 3 subtests passed
[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2025-03-09%2020%3A23%3A09 - REL_17_STABLE
[2] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2025-03-14%2013%3A52%3A09 - master
Best regards,
Alexander Lakhin
Neon (https://neon.tech)
From | Date | Subject | |
---|---|---|---|
Next Message | Etsuro Fujita | 2025-03-30 10:14:01 | Re: Options to control remote transactions’ access/deferrable modes in postgres_fdw |
Previous Message | Gurjeet Singh | 2025-03-30 06:46:49 | Minor edits to README.tuplock, and a question |