From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Nathan Bossart <nathandbossart(at)gmail(dot)com>, Robert Pang <robertpang(at)google(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Back-patch of: avoid multiple hard links to same WAL file after a crash |
Date: | 2024-12-19 05:44:53 |
Message-ID: | Z2Oy1Z2nMVmTM5L5@paquier.xyz |
Lists: | pgsql-hackers |
On Wed, Dec 18, 2024 at 08:51:20PM -0500, Andres Freund wrote:
> I don't think the issue is actually quite as unlikely to be hit as reasoned in
> the commit message. The crash has indeed to happen between the link() and
> unlink() - but at the end of a checkpoint we do that operation hundreds of
> times in a row on a busy server. And that's just after potentially doing lots
> of write IO during a checkpoint, filling up drive write caches / eating up
> IOPS/bandwidth disk quotas.
Looks so, yep. Your timing and the report's timing are interesting.
I've been double-checking the code to refresh my memory of the problem,
and I don't see a reason to not apply something like the attached set
down to v13 for all these remaining branches (minus an edit of the
commit message).
Thoughts?
--
Michael
Attachment | Content-Type | Size |
---|---|---|
0001-Replace-durable_rename_excl-by-durable_rename-ta-v15.patch | text/x-diff | 6.7 KB |
0001-Replace-durable_rename_excl-by-durable_rename-ta-v14.patch | text/x-diff | 5.9 KB |
0001-Replace-durable_rename_excl-by-durable_rename-ta-v13.patch | text/x-diff | 5.9 KB |
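
For readers who want to see the window described in the quoted message, a crash between link() and unlink(), here is a minimal Python sketch. It is not PostgreSQL code: the helper names (`link_then_unlink`, `rename_move`, `demo`) and the file names are hypothetical, the "crash" is simulated by returning early, and the fsync calls that make the real operations durable are omitted. It only shows that an interrupted link-then-unlink leaves two directory entries pointing at the same file, while a single rename() does not.

```python
import os
import tempfile


def link_then_unlink(src, dst, crash_between=False):
    """Two-step move: hard-link src to dst, then remove src.

    crash_between=True simulates a crash after link() but before unlink().
    """
    os.link(src, dst)  # raises FileExistsError if dst already exists
    if crash_between:
        return  # simulated crash: src is never unlinked
    os.unlink(src)


def rename_move(src, dst):
    """Single-step move: one rename() call, no intermediate two-link state."""
    os.rename(src, dst)


def _make_file(directory, name):
    # Create a small placeholder file standing in for a WAL segment.
    path = os.path.join(directory, name)
    with open(path, "wb") as f:
        f.write(b"wal")
    return path


def demo():
    results = {}

    # Case 1: link() succeeds, then the process "crashes" before unlink().
    with tempfile.TemporaryDirectory() as d:
        old = _make_file(d, "old_segment")
        new = os.path.join(d, "new_segment")
        link_then_unlink(old, new, crash_between=True)
        results["both_names_exist"] = os.path.exists(old) and os.path.exists(new)
        results["same_file"] = os.path.samefile(old, new)
        results["nlink_after_crash"] = os.stat(new).st_nlink

    # Case 2: a single rename() leaves exactly one name for the file.
    with tempfile.TemporaryDirectory() as d:
        old = _make_file(d, "old_segment")
        new = os.path.join(d, "new_segment")
        rename_move(old, new)
        results["old_name_exists_after_rename"] = os.path.exists(old)
        results["nlink_after_rename"] = os.stat(new).st_nlink

    return results


if __name__ == "__main__":
    for key, value in demo().items():
        print(key, value)
```

The trade-off in the sketch is that link() refuses to overwrite an existing destination whereas rename() replaces it silently, so a rename-based approach relies on the caller knowing the target name is free.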