Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Robert Pang <robertpang(at)google(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Back-patch of: avoid multiple hard links to same WAL file after a crash
Date: 2025-04-05 19:13:39
Message-ID: 20250405191339.8d.nmisch@google.com
Lists: pgsql-hackers

On Sat, Apr 05, 2025 at 11:07:13AM -0400, Tom Lane wrote:
> Michael Paquier <michael(at)paquier(dot)xyz> writes:
> > On Wed, Apr 02, 2025 at 05:29:00PM -0700, Noah Misch wrote:
> >> Here it is. Making it fail three times took looping 1383s, 5841s, and 2594s.
> >> Hence, it couldn't be expected to catch the regression before commit, but it
> >> would have made sufficient buildfarm and CI noise in the day after commit.
>
> > Hmm. Not much of a fan of the addition of a test that has less than
> > 1% of reproducibility for the problem, even if it's good to see that
> > this can be made portable to run down to v13.
>
> Yeah, it's good to have a test but I doubt we should commit it.
> Too many buildfarm cycles will be expended for too little result.

Current extent of our archive recovery restartpoint test coverage:

$ grep -c 'restartpoint starting' $(grep -rl 'restored log file' **/log) | grep -v :0
src/bin/pg_combinebackup/tmp_check/log/002_compare_backups_pitr1.log:1
src/test/recovery/tmp_check/log/020_archive_status_standby2.log:1
src/test/recovery/tmp_check/log/002_archiving_standby.log:1
src/test/recovery/tmp_check/log/020_archive_status_standby.log:1
src/test/recovery/tmp_check/log/035_standby_logical_decoding_standby.log:2
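
The one-liner above depends on recursive `**` globbing (zsh, or bash with `globstar` enabled). As a rough sketch only, the same count can be taken with `find` in a plain POSIX shell. The function name `count_restartpoints` and its optional root-directory argument are hypothetical and not part of the PostgreSQL tree; the two log strings are the ones used in the command above.

```shell
# Sketch: for every file under a "log" directory that mentions a WAL segment
# restored from archive, print "path:count" of restartpoint starts, skipping
# files with a count of zero. (Hypothetical helper, for illustration only.)
count_restartpoints() {
    root=${1:-.}
    # Step 1: list log files from runs that performed archive recovery.
    find "$root" -type d -name log \
        -exec grep -rl 'restored log file' {} + 2>/dev/null |
    while IFS= read -r f; do
        # Step 2: count restartpoints in each such file.
        # grep -c exits non-zero on a zero count, hence the "|| true".
        n=$(grep -c 'restartpoint starting' "$f" || true)
        if [ "${n:-0}" -gt 0 ]; then
            printf '%s:%s\n' "$f" "$n"
        fi
    done
    return 0
}
```

Run from the top of a build tree after the test suites have finished, `count_restartpoints .` should list the same kind of `file:count` lines as shown above, assuming the logs sit in directories named `log`.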

Since the 2025-02 releases made non-toy-size archive recoveries fail easily,
that's not enough coverage. If the proposed 3-second test is the wrong thing,
what should we add instead?
