Re: pgsql: Add TAP test for archive_cleanup_command and recovery_end_comman

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgsql: Add TAP test for archive_cleanup_command and recovery_end_comman
Date: 2022-04-16 20:56:33
Message-ID: CA+hUKGL=BN-Fk8JbbWGtL2HK1tJxQRg+4Czs76v+bSN7Si0A5w@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On Tue, Apr 12, 2022 at 3:49 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> All that stuff leads me to the attached. Thoughts?

Under valgrind I got "Undefined subroutine &main::usleep called at
t/002_archiving.pl line 103", so I added "use Time::HiRes qw(usleep);".
With that, I get past the first 4 tests with your patch, but then
promotion times out, and I'm not sure why:

+++ tap check in src/test/recovery +++
t/002_archiving.pl ..
ok 1 - check content from archives
ok 2 - archive_cleanup_command executed on checkpoint
ok 3 - recovery_end_command not executed yet
# found 00000002.history after 14 attempts
ok 4 - recovery_end_command executed after promotion
Bailout called. Further testing stopped: command "pg_ctl -D
/home/tmunro/projects/postgresql/src/test/recovery/tmp_check/t_002_archiving_standby2_data/pgdata
-l /home/tmunro/projects/postgresql/src/test/recovery/tmp_check/log/002_archiving_standby2.log
promote" exited with value 1

Since it's quite painful to run TAP tests under valgrind, I found a
place to stick a plain old sleep to repro these problems:

--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -1035,7 +1035,7 @@ sub enable_restoring
     my $copy_command =
       $PostgreSQL::Test::Utils::windows_os
       ? qq{copy "$path\\\\%f" "%p"}
-      : qq{cp "$path/%f" "%p"};
+      : qq{sleep 1 && cp "$path/%f" "%p"};
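For anyone who wants to see what the patched command does outside the test harness, here is a minimal shell sketch. It assumes %f and %p have already been substituted by the caller; the function name slow_restore, the temporary directories, and the file names are made up for illustration and are not part of Cluster.pm.

```shell
#!/bin/sh
# Hypothetical stand-in for the patched restore_command, with %f and %p
# already substituted. All names here are invented for illustration.
archive_dir=$(mktemp -d)
target_dir=$(mktemp -d)
printf 'dummy WAL contents\n' > "$archive_dir/000000010000000000000001"

slow_restore() {
    # $1 = archive directory, $2 = file name (%f), $3 = destination (%p)
    sleep 1 && cp "$1/$2" "$3"
}

# A file that exists in the archive: one second of delay, then success.
slow_restore "$archive_dir" 000000010000000000000001 "$target_dir/RECOVERYXLOG"
ok_status=$?
echo "existing segment: exit status $ok_status"

# A file that is missing: the same delay, then cp fails with a nonzero status.
slow_restore "$archive_dir" 00000002.history "$target_dir/RECOVERYHISTORY" 2>/dev/null
missing_status=$?
echo "missing file: nonzero exit status? $([ "$missing_status" -ne 0 ] && echo yes || echo no)"

rm -rf "$archive_dir" "$target_dir"
```

The point is that every call pays the one-second delay, including the ones that fail, which is what makes the timing problems easy to reproduce without valgrind.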

Soon I'll push the fix for the slowness that xlogprefetcher.c
accidentally introduced to continuous archive recovery, i.e. the problem
of calling a failing restore_command repeatedly as we approach the end
of a WAL segment, instead of just once every 5 seconds after we run out
of data. After that, you'll probably need to revert that fix
locally to repro this.
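
To illustrate the intended pacing (one restore attempt per retry interval once the data runs out), here is a rough shell sketch. It is not how PostgreSQL implements this, since the real logic is in the server's C recovery code. The variable names, the attempt cap, and the one-second interval (shortened from the 5 seconds mentioned above so it runs quickly) are assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: simulates "retry a failing restore once per interval".
archive_dir=$(mktemp -d)      # empty archive, so every restore attempt fails
retry_interval=1              # hypothetical; the email mentions 5 seconds
max_attempts=3                # hypothetical cap so the sketch terminates
attempts=0
restored=no

while [ "$attempts" -lt "$max_attempts" ]; do
    attempts=$((attempts + 1))
    if cp "$archive_dir/000000010000000000000002" \
          "$archive_dir/RECOVERYXLOG" 2>/dev/null; then
        restored=yes
        break
    fi
    # Out of data: wait a full interval before asking the archive again,
    # rather than re-running the failing command immediately.
    sleep "$retry_interval"
done

echo "restored=$restored after $attempts attempts"
rm -rf "$archive_dir"
```

With an empty archive this prints "restored=no after 3 attempts", having run the failing command only once per interval.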
