Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Robert Pang <robertpang(at)google(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Back-patch of: avoid multiple hard links to same WAL file after a crash
Date: 2025-04-05 22:42:02
Message-ID: Z_GxukueQn7lDs1u@paquier.xyz
Lists: pgsql-hackers

On Sat, Apr 05, 2025 at 12:13:39PM -0700, Noah Misch wrote:
> Since the 2025-02 releases made non-toy-size archive recoveries fail easily,
> that's not enough. If the proposed 3-second test is the wrong thing, what
> instead?

I don't have a good idea about that in ~16, TBH, but I am certainly not
a fan of the low reproducibility rate of this test as proposed.
It's not perfect, but as the design to fix the original race condition
was introduced in v15, why not begin with a test in 17~ using
some injection points? This should be good enough while having a good
reproduction rate, as the order of the actions in the restart points
would be controlled.
--
Michael
