From: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
---|---|
To: | Robert Pang <robertpang(at)google(dot)com> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Back-patch of: avoid multiple hard links to same WAL file after a crash |
Date: | 2024-12-18 16:38:19 |
Message-ID: | Z2L6e1w-xABVTBRR@nathan |
Lists: | pgsql-hackers |

On Tue, Dec 17, 2024 at 04:50:16PM -0800, Robert Pang wrote:
> We recently observed a few cases where Postgres running on Linux
> encountered an issue with WAL segment files. Specifically, two WAL
> segments were linked to the same physical file after Postgres ran out
> of memory and the OOM killer terminated one of its processes. This
> resulted in the WAL segments overwriting each other and Postgres
> failing a later recovery.

Yikes!

> We found this fix [1] that has been applied to Postgres 16, but the
> cases we observed were running Postgres 15. Given that older major
> versions will be supported for a good number of years, and the
> potential for irrecoverability exists (even if rare), we would like to
> discuss the possibility of back-patching this fix.

IMHO this is a good time to reevaluate. It looks like we originally didn't
back-patch out of an abundance of caution, but now that this one has had
time to bake, I think it's worth seriously considering, especially now that
we have a report from the field.

--
nathan